Bob Stanley, CEO, IO Informatics, explains the utility of RDF as a standard way of defining and redefining data for managing life science information.
RDF is very simple. It provides a framework for describing things according to their relationships. A key value proposition for semantic technologies is to make it easy to describe and connect related data sources, in order to search them as deeply as we care to. Although there is much useful effort in the formal terminology space, the broad focus is on practical outcomes – specifically, to provide a standard set of methods and framework for describing and linking data. This is NOT a standard for what the minimum or perfect data definition should be! – it is a standard way of defining and redefining data…
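As an illustrative sketch of that simplicity: every RDF statement is just a subject-predicate-object triple. The URIs and terms below are invented for illustration, written in Turtle syntax:

```turtle
@prefix ex:   <http://example.org/bio/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

# Each statement is one triple: subject, predicate, object.
ex:TP53 a ex:Gene ;
        rdfs:label "TP53" ;
        ex:encodes ex:P04637 .      # relationship linking a gene to a protein record

ex:P04637 a ex:Protein ;
          rdfs:label "Cellular tumor antigen p53" .
```

Nothing here requires a pre-approved schema; the description is the set of relationships itself, and new relationships can be asserted at any time.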
Lowering barriers to interoperability using standards does NOT mean using standardized, approved terms in an ontology or API. It means using a common framework for resource description, in which terms can easily be surfaced, visualized, tested, merged – due to the common approach.
SPARQL is an open-standard query language and query service API. It is supported by a variety of visual query interfaces and takes advantage of growing linked open data resources. SPARQL APIs can utilize SOAP and REST.
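A minimal SPARQL query against such an endpoint might look like the following; the graph structure and terms are assumed for illustration, not drawn from any particular endpoint:

```sparql
PREFIX ex:   <http://example.org/bio/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

# Find every gene and the label of the protein it encodes.
SELECT ?gene ?proteinLabel
WHERE {
  ?gene    a ex:Gene ;
           ex:encodes ?protein .
  ?protein rdfs:label ?proteinLabel .
}
```

The query pattern mirrors the triple structure of the data, which is what makes graphical query builders over SPARQL practical.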
Agile resource description is a key benefit of RDF and SPARQL. Although they can support highly formal resource description, RDF and SPARQL lower the barrier to interoperability created by excessive concern about perfect specification of reference content. RDF opens up this conversation by providing a standard framework and practices for mixing, mapping and updating resource descriptions by data model providers AND consumers. RDF can also manage provenance and links to original data, full experiment records, etc.

SPARQL endpoints provide visible data models that can be visually managed by data providers. They can also be rapidly adapted by local users mapping to more or less well-formed APIs. Depending on user need, these can be narrowly practical application ontologies or formal OWL models. Application ontologies can be rapidly mapped to OWL ontologies and vice versa.
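As a sketch of the provenance point: standard vocabularies such as Dublin Core Terms can attach source and history information directly to a resource description without disturbing its data model. The resource URIs here are hypothetical:

```turtle
@prefix ex:      <http://example.org/bio/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix xsd:     <http://www.w3.org/2001/XMLSchema#> .

# Provenance asserted alongside a measurement, as ordinary triples.
ex:measurement42 dcterms:source  ex:experiment7 ;   # link back to the full experiment record
                 dcterms:creator ex:labA ;
                 dcterms:created "2010-06-15"^^xsd:date .
```

Because provenance is just more triples, it travels with the data and is queryable through the same SPARQL interface.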
Broadly, semantic technologies provide a standard framework and methods for describing and querying data resources and connections between data elements. The technology space has been slowly growing and maturing over the past decade. People were writing papers and doing work in the area in the late ‘90s. Tim Berners-Lee wrote an article in 2001 that was published in Scientific American and began popularizing the concepts. These methods make it possible to visualize and connect data in a way that makes sense both to humans and computers. Using RDF and SPARQL, data descriptions can be modified by end users without refactoring the database or databases.
[Figure: diagram of linked resources, clockwise from 6 o’clock: Species, (protein description), Protein, Gene, Gene Location]

Terminology around new methods can cause confusion. For ease, I find it useful at times to refer to “assertions” instead of “triples”, and to “data structures” instead of “ontologies”.
These methods have been developed by W3C, MIT, Stanford, Dublin Core, DERI and elsewhere for just about a decade. Vendor and consumer adoption and technical maturity have grown over the past 8 years or so, with exponential growth over the past 4 years. This compares very favorably to “proprietary” methods and standards invented by single vendors.
Agile method: human-readable ontologies, no more over-specification, easy to manage and update, and end users can extend the API.
First we linked all of the data of interest to help the customer understand what happens biologically in organ transplant rejection scenarios. Next we applied SPARQL to run searches across clinical, protein and gene expression databases, to screen for patients at risk.
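A hedged sketch of what such a cross-source screen could look like in SPARQL; the predicates, identifiers and thresholds below are invented for illustration, not the actual customer model:

```sparql
PREFIX ex: <http://example.org/transplant/>

# Screen for patients whose clinical findings, protein markers and
# gene-expression profile together suggest rejection risk.
SELECT ?patient
WHERE {
  ?patient ex:hasClinicalFinding ex:ElevatedCreatinine ;
           ex:hasProteinMarker   ?marker ;
           ex:hasExpressionValue ?expr .
  ?marker  ex:concentration      ?level .
  ?expr    ex:gene       ex:GeneX ;
           ex:foldChange  ?fc .
  FILTER (?level > 2.0 && ?fc > 1.5)
}
```

Because the clinical, protein and expression sources are linked through shared patient resources, one query pattern can span all three without moving the underlying data.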
- Not a new data standard: a standard way of publishing and linking data
- Extensible, so you can start and get to a ‘good enough’ state quickly
- Created with federation in mind
- Out-of-box benefits for project completion, maintenance and extension to new data sources