Fostering Synergies - How Semantic Web Technology could influence Software Repositories


Talk given at SUITE 2010.

Abstract: The state of the art in mining software repositories stores software artifacts from various sources in monolithic relational databases. This puts a lot of querying power in the hands of software miners; however, it comes at the cost of enclosing the data and hampering cross-application reuse. In this paper we discuss four problem scenarios to illustrate that Semantic Web technology is able to overcome these limitations. However, this requires that the software engineering research community agree on two prerequisites: (a) a common vocabulary to talk about software repositories, that is, an ontology; (b) a strategy for generating unique and stable references to all software artifacts inside such a repository, that is, Uniform Resource Identifiers (URIs).


Fostering Synergies
How Semantic Web Technology could influence Software Repositories

Michael Würsch, Gerald Reif, Serge Demeyer, Harald Gall
University of Zurich, Switzerland
Department of Informatics, software evolution & architecture lab

Developer’s Information Needs
‣ Who has changed this code and why?
‣ How can I persist data in Spring?
‣ What are the subclasses of JComponent?
‣ What class in my project had the most bugs prior to the last release?
‣ ...

Information Silos
Bugzilla, Mailing lists, CVS, Atlassian Jira, Subversion, Wikis

‣ Limited search capabilities
‣ No unified data model
‣ No references across silo boundaries
‣ No cross-domain queries

Leveraging Information: State of the Art

[Figure: data from sources such as Bugzilla and CVS is preprocessed and mirrored into www.google.com/codesearch, www.koders.com, sourcerer.ics.uci.edu and www.evolizer.org]

Again, Silos...
www.google.com/codesearch, sourcerer.ics.uci.edu, www.evolizer.org, www.koders.com

‣ Database schemas are not portable
‣ Relations are local
‣ There is no consistent way of getting the meaning of a relation

Release your Data!

‣ Use a common vocabulary to describe software artifacts and their relationships
‣ Expose unique identifiers for software artifacts on the web

The Semantic Web/The Web of Data

‣ Graph-based data model described by S-P-O triples
‣ URIs to reference Resources
‣ Ontologies to formalize a common understanding of a domain
‣ SPARQL to search by matching graph-patterns

Example: Building an RDF Graph

Subject:   http://myProject.org/bugs/nr124
Predicate: http://evolizer.org/bugOntology/affects
Object:    http://sourcerer.ics.uci.edu/myProject/Foo.java
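
To make the example concrete, here is a minimal Python sketch of the same statement as a subject-predicate-object tuple, together with a tiny pattern matcher in the spirit of a graph-pattern query. It assumes the bug is the subject and the Java file is the object; the names `graph`, `match` and `to_ntriples` are hypothetical helpers made up for illustration, not part of any Semantic Web library.

```python
# URIs taken from the slide; everything else here is a hypothetical illustration.
BUG = "http://myProject.org/bugs/nr124"
AFFECTS = "http://evolizer.org/bugOntology/affects"
FOO = "http://sourcerer.ics.uci.edu/myProject/Foo.java"

# A graph is simply a set of (subject, predicate, object) triples.
graph = {(BUG, AFFECTS, FOO)}


def match(graph, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return sorted(
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


def to_ntriples(triple):
    """Serialize a triple whose three terms are all URIs as one N-Triples line."""
    return "<{}> <{}> <{}> .".format(*triple)


# Which artifacts are affected by bug nr124?
for triple in match(graph, s=BUG, p=AFFECTS):
    print(to_ntriples(triple))
```

Because every term is a URI, the statement can live in a third place, separate from both the bug repository and the source code repository.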

Research Agenda

Come up with a strategy for generating URIs for software artifacts
Develop an ontology of software artifacts and their relationships

Existing Ontologies

EvoOnt
http://www.ifi.uzh.ch/ddis/evo/

SEON - Software Engineering Ontology
http://evolizer.org

Baetle - Bug And Enhancement Tracking Language
http://code.google.com/p/baetle/

DOAP - Description of a Project
http://trac.usefulinc.com/doap

Release your Data!
‣ Formalize a common vocabulary to describe software artifacts and their relationships
‣ Devise strategies to generate URIs for software artifacts
‣ Expose these URIs on the Web

The Semantic Web/The Web of Data
‣ Graph-based data model described by S-P-O triples
‣ URIs to reference Resources
‣ Ontologies to formalize a common understanding of a domain
‣ SPARQL to search by matching graph-patterns

Existing Ontologies
EvoOnt
http://www.ifi.uzh.ch/ddis/evo/
SEON - Software Engineering Ontology
http://evolizer.org
Baetle - Bug And Enhancement Tracking Language
http://code.google.com/p/baetle/
DOAP - Description of a Project
http://trac.usefulinc.com/doap

Research Agenda
Come up with a strategy for generating URIs for software artifacts
Develop an ontology of software artifacts and their relationships



Editor's notes

Search-Driven Software Engineering is all about fulfilling information needs of developers or maintainers. These information needs can be expressed in terms of questions, such as...(Read some of the questions above)
The data needed to answer such questions is often locked away in data silos, such as Bug Tracking Systems, Version Control Systems, Mailing lists, etc. I say locked away, because many of these tools are not made for querying. Further, many information needs span more than one domain. This is where limitations of the existing systems are apparent. To summarize them (read examples).
(continued) We usually parse, for example, CVS logs or XML exports of bug reports and use heuristics to establish links between them. Or we build richer source code models by parsing or partially compiling source files. Then we more or less mirror all the information in a relational database and provide a query interface on top of it. Examples are... (name the examples)
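
As an illustration of the kind of linking heuristic described here, the following Python sketch scans commit log messages for bug references. The message convention ("bug 124", "Bug #130", "#124"), the regular expression and the sample data are all assumptions made for this example; real projects differ, and this is not the heuristic used by any of the tools named in the talk.

```python
import re

# Assumed convention: log messages mention bugs as "bug 124", "Bug #130" or "#124".
BUG_REF = re.compile(r"(?:bug\s*#?|#)(\d+)", re.IGNORECASE)


def link_commits_to_bugs(commits):
    """Map each referenced bug id to the revisions whose message mentions it."""
    links = {}
    for revision, message in commits:
        for bug_id in BUG_REF.findall(message):
            links.setdefault(int(bug_id), []).append(revision)
    return links


# Hypothetical (revision, log message) pairs, e.g. parsed from a CVS log.
commits = [
    ("1.4", "Fix for bug 124: null check in Foo"),
    ("1.5", "Refactor parser"),
    ("1.6", "Follow-up to #124 and Bug #130"),
]
links = link_commits_to_bugs(commits)
print(links)
```

The result of such a heuristic is typically what ends up as a foreign-key relation in the mirrored database.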
From the point of view of other researchers and tool builders, we are again building silos that are barely useful for anything other than the originally envisioned purposes. There are three main reasons for that. First, in theory, database schemas should be exchangeable thanks to DDLs; in practice, it is still a painful undertaking. Second, relations are local: you cannot simply reference an entity stored within another database from your own database, so you basically enclose your data in the database. Third, there is no consistent way to get the meaning of a relation: a query can join tables on any columns that match by data type, without any check on semantics.
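
The third point can be demonstrated with a small Python sketch using the standard `sqlite3` module. The two tables and their columns are hypothetical and chosen only to show that a relational database will happily join a bug identifier with an unrelated integer column.

```python
import sqlite3

# Hypothetical schema: bug_id and line_count are both integers,
# but they mean entirely different things.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bugs (bug_id INTEGER, summary TEXT);
    CREATE TABLE revisions (line_count INTEGER, file TEXT);
    INSERT INTO bugs VALUES (124, 'NPE in Foo');
    INSERT INTO revisions VALUES (124, 'Bar.java');
""")

# The join is accepted and returns a row, although it is semantically meaningless.
rows = conn.execute(
    "SELECT b.summary, r.file "
    "FROM bugs b JOIN revisions r ON b.bug_id = r.line_count"
).fetchall()
print(rows)
```

Nothing in the schema tells the query engine, or another researcher, that `bug_id` and `line_count` must not be compared.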
We believe that we should overcome these limitations by defining a common data schema, meaning a common syntax and vocabulary to describe software artifacts and the relationships between them. This would, for example, give us the possibility to try out different search engines on different data repositories. We should further come up with ways to expose unique identifiers for software artifacts on the web. This enables two things: first, we can reference information across these silo boundaries, and second, we could then potentially run distributed queries without all the preprocessing and mirroring effort I have mentioned before.
We believe that the Semantic Web provides the tools for this. Forget about all the A.I. magic that you might associate with the Semantic Web. It is just a convenient yet simple framework for describing and working with data. It provides a graph-based data model, described by simple subject-predicate-object triples, and URIs to reference resources. Vocabulary is described by ontologies. You can search in such information graphs with SPARQL, the query language of the Semantic Web.
Given two repositories, one that stores bug reports and one that stores a full-fledged source code model, we can then, in a third place, make statements about a particular bug and a particular Java class. This is an s-p-o triple. Dereferencing the URIs leads to the resources or, in the case of the 'affects' property, to the ontology definition.
We need unique and stable identifiers for software engineering artifacts. It's easy to come up with such URIs for some artifacts, but not so straightforward for others. We need to agree on a common vocabulary (data schema) for software engineering concepts. A source code visualization tool should not need to care whether it works with data retrieved from Evolizer, Google Code Search, Koders, or Sourcerer. These tasks are clearly a community effort.
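
One possible starting point for such a URI strategy is sketched below in Python. The path layout (base, project, artifact kind, local identifier) and the function name `artifact_uri` are assumptions made purely for illustration; the talk does not prescribe any particular scheme.

```python
from urllib.parse import quote


def artifact_uri(base, project, kind, local_id):
    """Build a URI from a base, a project name, an artifact kind and a local id.

    Each segment is percent-encoded so that characters such as '/' inside
    an identifier cannot change the path structure.
    """
    parts = [quote(str(p), safe="") for p in (project, kind, local_id)]
    return base.rstrip("/") + "/" + "/".join(parts)


# Easy case: a bug report already has a stable number.
print(artifact_uri("http://example.org/", "myProject", "bugs", 124))

# Harder case: a file path is only stable until the file is moved or renamed.
print(artifact_uri("http://example.org", "myProject", "files", "src/Foo.java"))
```

The second call hints at why some artifacts are harder than others: an identifier derived from a file path changes whenever the file is renamed, so a stable scheme would need something beyond this sketch.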
No need to start from scratch - take existing ontologies and consolidate.
