SlideShare una empresa de Scribd logo
1 de 42
Descargar para leer sin conexión
Hooking up Semantic MediaWiki with
    external tools via SPARQL
    The Use case behind the RDFIO extension

         RDFIO Extension: Google Summer of Code project in 2010
              Mentor: Denny Vrandecic <http://simia.net/>

  SMW + Bioclipse integration: MSc Thesis project in Bioclipse group, Inst. of
        Pharmaceutical Biosciences, Uppsala University, Sweden.
    Supervisor: Egon Willighagen <http://chem-bla-ics.blogspot.com>
    Samuel Lampa <http://saml.rilspace.org>, SMWCon 23 Sep 2011
This talk

● The motivation and use case behind RDFIO

● Looking for feedback on design decisions

● Outlook and Plans

● Possible collaborations / interactions?
The Problem

● Need to combine:
   ○ Manual schema exploration, as knowledge grows
   ○ Automated data generation, for integrating data from
     literature or own experimental or computational results
   ○ Community collaboratiom

● Thus need something that supports all in same framework
Solution: Bioclipse + SMW




      Bioclipse as automated or interactive batch writer
Bioclipse Bioinformatics Workbench
Bioclipse Shortfacts

● Open Source
● Based on Eclipse - Modular, Solid
● Cheminformatics/Bioinformatics Tools
   ○ Jmol, JChemPaint, etc
● Scripting (JavaScript)
● Semantic Web Support
● Core team at Uppsala University
● http://bioclipse.net
SMW Limitations
● No (prior) RDF import

● No obvious way to choose wiki titles for URIs
  (Problem 1)

● RDF Export format not same as import format
  ... is hard-coded, using URIs such as:
 http://example.com/wiki/Special:URIResolver-3ASomeEntity


 (Problem 2)
So, needed a new extension:
          RDFIO
RDFIO Extension Short facts
 ● Google Summer of Code project 2010
    ○ Mentor: Denny Vrandecic
 ● Dependencies:
    ○ Semantic MediaWiki
    ○ SMWWriter (Denny Vrandecic)
    ○ POM (Sergey Chernychev)
    ○ ARC2 RDF Lib (Benjamin Nowack)
 ● Current version: 0.5
    ○ Needs bugfixing
    ○ Needs making SMW 1.6 compatible

 ● http://www.mediawiki.org/wiki/Extension:RDFIO
RDFIO Data Flow
Problem 1: Choosing wiki
page titles for RDF entities
Example of Experimental Data
<http://[...]/nmrshiftdb/?moleculeId=234>
    dc:title "warburganal";
    chem:casnumber "62994-47-2";
    nmr:moleculeId "234";
    nmr:hasSpectrum <http://[...]/nmrshiftdb/?spectrumId=4735>;
<http://[...]/nmrshiftdb/?spectrumId=4735> nmr:field "50";
    nmr:hasPeak <http://[...]/nmrshiftdb/?s4735p0>,
        <http://[...]/nmrshiftdb/?s4735p1>,
        <http://[...]/nmrshiftdb/?s4735p2>,
        <http://[...]/nmrshiftdb/?s4735p3>;
    nmr:solvent "Chloroform-D1 (CDCl3)";
    nmr:spectrumId "4735";
    nmr:spectrumType "13C";
    nmr:temperature "298".
<http://[...]/nmrshiftdb/?s4735p1>
    nmr:hasShift 18.3;
    a nmr:peak.
<http://[...]/nmrshiftdb/?s4735p2>
    nmr:hasShift 22.6;
    a nmr:peak.
<http://[...]/nmrshiftdb/?s4735p3>
    nmr:hasShift 26.5;
    a nmr:peak.
Problem: URIs not nice as titles
So, how to choose the wiki
     page title then?
Let's look what we have in the
 RDF, that might be used ...
Wiki page title for the Molecule?
    <http://[...]/nmrshiftdb/?moleculeId=234>
        dc:title "warburganal";
        chem:casnumber "62994-47-2";
        nmr:moleculeId "234";
        nmr:hasSpectrum <http://[...]spectrumId=4735>;
Wiki page title for the Molecule?
    <http://[...]/nmrshiftdb/?moleculeId=234>
        dc:title "warburganal";
        chem:casnumber "62994-47-2";
        nmr:moleculeId "234";
        nmr:hasSpectrum <http://[...]spectrumId=4735>;
Wiki page title for the Spectrum?
    <http://[...]nmrshiftdb/?spectrumId=4735>
        nmr:field "50";
        nmr:hasPeak <http://[...]nmrshiftdb/?s4735p0>,
            <http://[...]nmrshiftdb/?s4735p1>,
            <http://[...]nmrshiftdb/?s4735p2>,
            <http://[...]nmrshiftdb/?s4735p3>;
        nmr:solvent "Chloroform-D1 (CDCl3)";
        nmr:spectrumId "4735";
        nmr:spectrumType "13C";
        nmr:temperature "298".
Wiki page title for the Spectrum?
    <http://[...]nmrshiftdb/?spectrumId=4735>
        nmr:field "50";
        nmr:hasPeak <http://[...]nmrshiftdb/?s4735p0>,
            <http://[...]nmrshiftdb/?s4735p1>,
            <http://[...]nmrshiftdb/?s4735p2>,
            <http://[...]nmrshiftdb/?s4735p3>;
        nmr:solvent "Chloroform-D1 (CDCl3)";
        nmr:spectrumId "4735";
        nmr:spectrumType "13C";
        nmr:temperature "298".
Wiki page title for the Spectrum?
    <http://pele.farmbio.uu.se/nmrshiftdb/?s4735p1>
        nmr:hasShift 18.3;
        a nmr:peak.
Wiki page title for the Spectrum?
    <http://pele.farmbio.uu.se/nmrshiftdb/?s4735p1>
        nmr:hasShift 18.3;
        a nmr:peak.
So ...
 ● So, we might find useful titles in:
    ○ Dublin Core fields (small vocabulary for cataloging)
    ○ Custom identifiers
    ○ (parts of) URIs

 ● So, we need a flexible configuration system such as:
    ○ A system-wide defaults in LocalSettings.php, AND
    ○ Interactive configuration in RDF Import form

      (See following slides ...)
In LocalSettings.php ...
  /**
   * Used to generate "Pseudo-NameSpaces"
   */
  $rdfiogBaseURIs = array(
      "http://example.org/someOntology#" => "ont1",
      "http://example.org/anotherOntology#" => "ont2"
  );

  /**
   * "Properties" to be used as Wiki Title
   */
  $rdfiogPropertiesToUseAsWikiTitle = array(
    'http://semantic-mediawiki.org/swivt/1.0#page',
    'http://www.w3.org/2000/01/rdf-schema#label',
    'http://purl.org/dc/elements/1.1/title',
    'http://www.w3.org/2004/02/skos/core#preferredLabel',
    'http://xmlns.com/foaf/0.1/name',
    'http://www.nmrshiftdb.org/onto#spectrumId'
  );
... and in RDF Import Screen (1/2)
... and in RDF Import Screen (2/2)
Result: Problem 1 solved!
 Page get sensible titles
  (See screenshot on next slide)
Problem 2: Export RDF in the
   format it was imported
  (... not the SMW hard-coded format)
Solution: Storing Original URIs
Result: Problem 2 solved!
Can Query/Export in original RDF format
So now we have solved the problems:

● Choosing Wiki Titles
● Export in same format as import


        ... so now we can go ahead and
      hook up SMW with an external tool ....
Talking to SMW from Bioclipse scripts
var wikiURL = "http://drugmet.rilspace.org/wiki/";

/* Editing SMW facts */

smw.addTriple("w:Caffeine", "w:is_a", "w:Molecule", wikiURL);
smw.removeTriple("w:Caffeine", "w:is_a", "w:Molecule", wikiURL);

/* Downloading RDF for local querying */

rdfStore = rdf.createInMemoryStore();
rdfStore = smw.getRDF( wikiURL );
result = rdf.sparql( rdfStore,
    "SELECT DISTINCT ?pred WHERE { ?subj ?pred ?obj } LIMIT 10"
)

/* Show some output */

js.print( result );
Talking to SMW from Bioclipse (contd.)
...
js.print(
    result
);

----------------------------------------------------------------
[["p"],
["http://www.w3.org/1999/02/22-rdf-syntax-ns#type"],
["http://www.w3.org/2000/01/rdf-schema#domain"],
["http://www.w3.org/2000/01/rdf-schema#range"],
["http://www.w3.org/2000/01/rdf-schema#subPropertyOf"],
["http://www.w3.org/2000/01/rdf-schema#subClassOf"],
["http://semantic-mediawiki.org/swivt/1.0#wikiPageModificationDate"], ["http:
//semantic-mediawiki.org/swivt/1.0#wikiNamespace"], ["http://www.w3.
org/2000/01/rdf-schema#isDefinedBy"],
["http://semantic-mediawiki.org/swivt/1.0#page"],
["http://www.w3.org/2000/01/rdf-schema#label"]
]
So ... the use case is now possible ...
Some real world usage already
Real world use 1: Visualization of Data
{{#ask: [[Category:Measurements]] [[Has Study::{{PAGENAME}}]] [[Has
Endpoint::PercentageNonViableCells]]
| ?Has Endpoint Value
| format=jqplotbar
| sort=Has Endpoint Value
| order=ascending
| height=250
| width=600
}}




      http://chem-bla-ics.blogspot.com/2011/06/plotting-rdf-data-with-semantic-media.html
Real world usage 2: Pulling data from R

library(rrdf)

endpoint = "http://127.0.0.1/mediawiki/index.php/Special:
SPARQLEndpoint"

query = paste("PREFIX w: ",
"SELECT ?min ?max ?zeta WHERE ",
"{ ?inst a w:Category-3AMetalOxides . ",
" OPTIONAL { ?inst w:Property-3AHas_Size_Min ?min . }", "
OPTIONAL { ?inst w:Property-3AHas_Size_Max ?max . }", "
OPTIONAL { ?inst w:Property-3AHas_Zeta_potential ?zeta .
}",
"}"
);



    http://chem-bla-ics.blogspot.com/2011/06/importing-nanotoxicity-data-with-sparql.html
Real world usage 2: Pulling data from R
> data
     min max zeta
[1,] 15 90     NA
[2,] 15 90     NA
[3,] 15 90     NA
[4,] 15 90     NA
[5,] 15 90     NA
[6,] 15 90     NA
[7,] 15 90     NA
[8,] 15 90     NA
[9,] 15 90     NA
[10,] 15 90    NA
[11,] 15 90    NA
[12,] 15 90    NA
[13,] 10 100 34.2
[14,] 30 60-17.3
[15,] 20 30 1.8
[16,] 15 90    NA




      http://chem-bla-ics.blogspot.com/2011/06/importing-nanotoxicity-data-with-sparql.html
Outlook
● Proof of concept is there

● Things to fix before useful:
   ○ SMW 1.6 support
   ○ Some bugfixes

● Further wishlist:
   ○ Support more triple stores
   ○ Skip ARC2 PHP RDF library dependency?
   ○ SPARQL UPDATE instead of ARC2's SPARQL+
   ○ "Output as Original URIs" also in RDF Batch export
   ○ Improve performance
Thank you!

        RDFIO Extension: Google Summer of Code project in 2010
             Mentor: Denny Vrandecic <http://simia.net/>

Bioclipse + SMW integration: MSc Thesis project in Bioclipse group, Inst. of
         Pharmaceutical Biosciences, Uppsala University, Sweden.
    Supervisor: Egon Willighagen <http://chem-bla-ics.blogspot.com>
    Samuel Lampa <http://saml.rilspace.org>, SMWCon 23 Sep 2011

Más contenido relacionado

La actualidad más candente

Presto in my_use_case2
Presto in my_use_case2Presto in my_use_case2
Presto in my_use_case2
wyukawa
 
Slides semantic web and Drupal 7 NYCCamp 2012
Slides semantic web and Drupal 7 NYCCamp 2012Slides semantic web and Drupal 7 NYCCamp 2012
Slides semantic web and Drupal 7 NYCCamp 2012
scorlosquet
 

La actualidad más candente (20)

Logs aggregation and analysis
Logs aggregation and analysisLogs aggregation and analysis
Logs aggregation and analysis
 
Sphinx && Perl Houston Perl Mongers - May 8th, 2014
Sphinx && Perl  Houston Perl Mongers - May 8th, 2014Sphinx && Perl  Houston Perl Mongers - May 8th, 2014
Sphinx && Perl Houston Perl Mongers - May 8th, 2014
 
LINEデリマでのElasticsearchの運用と監視の話
LINEデリマでのElasticsearchの運用と監視の話LINEデリマでのElasticsearchの運用と監視の話
LINEデリマでのElasticsearchの運用と監視の話
 
Log aggregation and analysis
Log aggregation and analysisLog aggregation and analysis
Log aggregation and analysis
 
Elastic{ON} 2016 Review - 김종민 님
Elastic{ON} 2016 Review - 김종민 님Elastic{ON} 2016 Review - 김종민 님
Elastic{ON} 2016 Review - 김종민 님
 
Workflow Engines for Hadoop
Workflow Engines for HadoopWorkflow Engines for Hadoop
Workflow Engines for Hadoop
 
ELK Wrestling (Leeds DevOps)
ELK Wrestling (Leeds DevOps)ELK Wrestling (Leeds DevOps)
ELK Wrestling (Leeds DevOps)
 
StackStormを1年間データ基盤で使ってみてぶつかったトラブルとその解決策の共有
StackStormを1年間データ基盤で使ってみてぶつかったトラブルとその解決策の共有StackStormを1年間データ基盤で使ってみてぶつかったトラブルとその解決策の共有
StackStormを1年間データ基盤で使ってみてぶつかったトラブルとその解決策の共有
 
Presto in my_use_case2
Presto in my_use_case2Presto in my_use_case2
Presto in my_use_case2
 
Workflow on Hadoop Using Oozie__HadoopSummit2010
Workflow on Hadoop Using Oozie__HadoopSummit2010Workflow on Hadoop Using Oozie__HadoopSummit2010
Workflow on Hadoop Using Oozie__HadoopSummit2010
 
Log analysis using Logstash,ElasticSearch and Kibana
Log analysis using Logstash,ElasticSearch and KibanaLog analysis using Logstash,ElasticSearch and Kibana
Log analysis using Logstash,ElasticSearch and Kibana
 
Slides semantic web and Drupal 7 NYCCamp 2012
Slides semantic web and Drupal 7 NYCCamp 2012Slides semantic web and Drupal 7 NYCCamp 2012
Slides semantic web and Drupal 7 NYCCamp 2012
 
Introduction to Apache Flink - Fast and reliable big data processing
Introduction to Apache Flink - Fast and reliable big data processingIntroduction to Apache Flink - Fast and reliable big data processing
Introduction to Apache Flink - Fast and reliable big data processing
 
Logging for OpenStack - Elasticsearch, Fluentd, Logstash, Kibana
Logging for OpenStack - Elasticsearch, Fluentd, Logstash, KibanaLogging for OpenStack - Elasticsearch, Fluentd, Logstash, Kibana
Logging for OpenStack - Elasticsearch, Fluentd, Logstash, Kibana
 
Logstash family introduction
Logstash family introductionLogstash family introduction
Logstash family introduction
 
Why Your MongoDB Needs Redis
Why Your MongoDB Needs RedisWhy Your MongoDB Needs Redis
Why Your MongoDB Needs Redis
 
Git journey from mars to neon EclipseCon North America - 2016-03-08
Git journey from mars to neon   EclipseCon North America - 2016-03-08Git journey from mars to neon   EclipseCon North America - 2016-03-08
Git journey from mars to neon EclipseCon North America - 2016-03-08
 
Apache Spark Introduction - CloudxLab
Apache Spark Introduction - CloudxLabApache Spark Introduction - CloudxLab
Apache Spark Introduction - CloudxLab
 
Monitoring with ElasticSearch
Monitoring with ElasticSearch Monitoring with ElasticSearch
Monitoring with ElasticSearch
 
Productionizing Spark and the REST Job Server- Evan Chan
Productionizing Spark and the REST Job Server- Evan ChanProductionizing Spark and the REST Job Server- Evan Chan
Productionizing Spark and the REST Job Server- Evan Chan
 

Destacado

Flow based programming an overview
Flow based programming   an overviewFlow based programming   an overview
Flow based programming an overview
Samuel Lampa
 

Destacado (13)

2nd Proj. Update: Integrating SWI-Prolog for Semantic Reasoning in Bioclipse
2nd Proj. Update: Integrating SWI-Prolog for Semantic Reasoning in Bioclipse2nd Proj. Update: Integrating SWI-Prolog for Semantic Reasoning in Bioclipse
2nd Proj. Update: Integrating SWI-Prolog for Semantic Reasoning in Bioclipse
 
SciPipe - A light-weight workflow library inspired by flow-based programming
SciPipe - A light-weight workflow library inspired by flow-based programmingSciPipe - A light-weight workflow library inspired by flow-based programming
SciPipe - A light-weight workflow library inspired by flow-based programming
 
iRODS Rule Language Cheat Sheet
iRODS Rule Language Cheat SheetiRODS Rule Language Cheat Sheet
iRODS Rule Language Cheat Sheet
 
The RDFIO Extension - A Status update
The RDFIO Extension - A Status updateThe RDFIO Extension - A Status update
The RDFIO Extension - A Status update
 
Continuous modeling - automating model building on high-performance e-Infrast...
Continuous modeling - automating model building on high-performance e-Infrast...Continuous modeling - automating model building on high-performance e-Infrast...
Continuous modeling - automating model building on high-performance e-Infrast...
 
Python Generators - Talk at PySthlm meetup #15
Python Generators - Talk at PySthlm meetup #15Python Generators - Talk at PySthlm meetup #15
Python Generators - Talk at PySthlm meetup #15
 
Thesis presentation Samuel Lampa
Thesis presentation Samuel LampaThesis presentation Samuel Lampa
Thesis presentation Samuel Lampa
 
Agile large-scale machine-learning pipelines in drug discovery
Agile large-scale machine-learning pipelines in drug discoveryAgile large-scale machine-learning pipelines in drug discovery
Agile large-scale machine-learning pipelines in drug discovery
 
Flow based programming an overview
Flow based programming   an overviewFlow based programming   an overview
Flow based programming an overview
 
Vagrant, Ansible and Docker - How they fit together for productive flexible d...
Vagrant, Ansible and Docker - How they fit together for productive flexible d...Vagrant, Ansible and Docker - How they fit together for productive flexible d...
Vagrant, Ansible and Docker - How they fit together for productive flexible d...
 
AddisDev Meetup ii: Golang and Flow-based Programming
AddisDev Meetup ii: Golang and Flow-based ProgrammingAddisDev Meetup ii: Golang and Flow-based Programming
AddisDev Meetup ii: Golang and Flow-based Programming
 
Reproducibility in Scientific Data Analysis - BioScience Seminar
Reproducibility in Scientific Data Analysis - BioScience SeminarReproducibility in Scientific Data Analysis - BioScience Seminar
Reproducibility in Scientific Data Analysis - BioScience Seminar
 
Introduction to the drug discovery process
Introduction to the drug discovery processIntroduction to the drug discovery process
Introduction to the drug discovery process
 

Similar a Hooking up Semantic MediaWiki with external tools via SPARQL

Drupal and the semantic web - SemTechBiz 2012
Drupal and the semantic web - SemTechBiz 2012Drupal and the semantic web - SemTechBiz 2012
Drupal and the semantic web - SemTechBiz 2012
scorlosquet
 
Java(ee) mongo db applications in the cloud
Java(ee) mongo db applications in the cloud Java(ee) mongo db applications in the cloud
Java(ee) mongo db applications in the cloud
Shekhar Gulati
 

Similar a Hooking up Semantic MediaWiki with external tools via SPARQL (20)

Nzitf Velociraptor Workshop
Nzitf Velociraptor WorkshopNzitf Velociraptor Workshop
Nzitf Velociraptor Workshop
 
Publishing Linked Data from RDB
Publishing Linked Data from RDBPublishing Linked Data from RDB
Publishing Linked Data from RDB
 
Producing, publishing and consuming linked data - CSHALS 2013
Producing, publishing and consuming linked data - CSHALS 2013Producing, publishing and consuming linked data - CSHALS 2013
Producing, publishing and consuming linked data - CSHALS 2013
 
Semantic web and Drupal: an introduction
Semantic web and Drupal: an introductionSemantic web and Drupal: an introduction
Semantic web and Drupal: an introduction
 
Querying Linked Data with SPARQL (2010)
Querying Linked Data with SPARQL (2010)Querying Linked Data with SPARQL (2010)
Querying Linked Data with SPARQL (2010)
 
Drupal and the semantic web - SemTechBiz 2012
Drupal and the semantic web - SemTechBiz 2012Drupal and the semantic web - SemTechBiz 2012
Drupal and the semantic web - SemTechBiz 2012
 
Drupal and the Semantic Web
Drupal and the Semantic WebDrupal and the Semantic Web
Drupal and the Semantic Web
 
Doctrine Project
Doctrine ProjectDoctrine Project
Doctrine Project
 
Drupal 7 and RDF
Drupal 7 and RDFDrupal 7 and RDF
Drupal 7 and RDF
 
Bio2RDF@BH2010
Bio2RDF@BH2010Bio2RDF@BH2010
Bio2RDF@BH2010
 
NoSQL Introduction
NoSQL IntroductionNoSQL Introduction
NoSQL Introduction
 
NoSQL Introduction
NoSQL IntroductionNoSQL Introduction
NoSQL Introduction
 
Lodstats: The Data Web Census Dataset. Kobe, Japan, 2016
Lodstats: The Data Web Census Dataset. Kobe, Japan, 2016Lodstats: The Data Web Census Dataset. Kobe, Japan, 2016
Lodstats: The Data Web Census Dataset. Kobe, Japan, 2016
 
Bio2RDF presentation at Combine 2012
Bio2RDF presentation at Combine 2012Bio2RDF presentation at Combine 2012
Bio2RDF presentation at Combine 2012
 
BloodHound Unleashed.pdf
BloodHound Unleashed.pdfBloodHound Unleashed.pdf
BloodHound Unleashed.pdf
 
Ubuntu scope development
Ubuntu scope developmentUbuntu scope development
Ubuntu scope development
 
Querying Linked Data with SPARQL
Querying Linked Data with SPARQLQuerying Linked Data with SPARQL
Querying Linked Data with SPARQL
 
Drupal as a Semantic Web platform - ISWC 2012
Drupal as a Semantic Web platform - ISWC 2012Drupal as a Semantic Web platform - ISWC 2012
Drupal as a Semantic Web platform - ISWC 2012
 
Virtuoso RDF Triple Store Analysis Benchmark & mapping tools RDF / OO
Virtuoso RDF Triple Store Analysis Benchmark & mapping tools RDF / OOVirtuoso RDF Triple Store Analysis Benchmark & mapping tools RDF / OO
Virtuoso RDF Triple Store Analysis Benchmark & mapping tools RDF / OO
 
Java(ee) mongo db applications in the cloud
Java(ee) mongo db applications in the cloud Java(ee) mongo db applications in the cloud
Java(ee) mongo db applications in the cloud
 

Más de Samuel Lampa

Profiling go code a beginners tutorial
Profiling go code   a beginners tutorialProfiling go code   a beginners tutorial
Profiling go code a beginners tutorial
Samuel Lampa
 
My lightning talk at Go Stockholm meetup Aug 6th 2013
My lightning talk at Go Stockholm meetup Aug 6th 2013My lightning talk at Go Stockholm meetup Aug 6th 2013
My lightning talk at Go Stockholm meetup Aug 6th 2013
Samuel Lampa
 

Más de Samuel Lampa (7)

Using Flow-based programming to write tools and workflows for Scientific Comp...
Using Flow-based programming to write tools and workflows for Scientific Comp...Using Flow-based programming to write tools and workflows for Scientific Comp...
Using Flow-based programming to write tools and workflows for Scientific Comp...
 
Linked Data for improved organization of research data
Linked Data  for improved organization  of research dataLinked Data  for improved organization  of research data
Linked Data for improved organization of research data
 
How to document computational research projects
How to document computational research projectsHow to document computational research projects
How to document computational research projects
 
First encounter with Elixir - Some random things
First encounter with Elixir - Some random thingsFirst encounter with Elixir - Some random things
First encounter with Elixir - Some random things
 
Profiling go code a beginners tutorial
Profiling go code   a beginners tutorialProfiling go code   a beginners tutorial
Profiling go code a beginners tutorial
 
My lightning talk at Go Stockholm meetup Aug 6th 2013
My lightning talk at Go Stockholm meetup Aug 6th 2013My lightning talk at Go Stockholm meetup Aug 6th 2013
My lightning talk at Go Stockholm meetup Aug 6th 2013
 
3rd Proj. Update: Integrating SWI-Prolog for Semantic Reasoning in Bioclipse
3rd Proj. Update: Integrating SWI-Prolog for Semantic Reasoning in Bioclipse3rd Proj. Update: Integrating SWI-Prolog for Semantic Reasoning in Bioclipse
3rd Proj. Update: Integrating SWI-Prolog for Semantic Reasoning in Bioclipse
 

Último

IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
Enterprise Knowledge
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
Earley Information Science
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
giselly40
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
vu2urc
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 

Último (20)

IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 

Hooking up Semantic MediaWiki with external tools via SPARQL

  • 1. Hooking up Semantic MediaWiki with external tools via SPARQL The Use case behind the RDFIO extension RDFIO Extension: Google Summer of Code project in 2010 Mentor: Denny Vrandecic <http://simia.net/> SMW + Bioclipse integration: MSc Thesis project in Bioclipse group, Inst. of Pharmaceutical Biosciences, Uppsala University, Sweden. Supervisor: Egon Willighagen <http://chem-bla-ics.blogspot.com> Samuel Lampa <http://saml.rilspace.org>, SMWCon 23 Sep 2011
  • 2. This talk ● The motivation and use case behind RDFIO ● Looking for feedback on design decisions ● Outlook and Plans ● Possible collaborations / interactions?
  • 3. The Problem ● Need to combine: ○ Manual schema exploration, as knowledge grows ○ Automated data generation, for integrating data from literature or own experimental or computational results ○ Community collaboratiom ● Thus need something that supports all in same framework
  • 4. Solution: Bioclipse + SMW Bioclipse as automated or interactive batch writer
  • 6. Bioclipse Shortfacts ● Open Source ● Based on Eclipse - Modular, Solid ● Cheminformatics/Bioinformatics Tools ○ Jmol, JChemPaint, etc ● Scripting (JavaScript) ● Semantic Web Support ● Core team at Uppsala University ● http://bioclipse.net
  • 7. SMW Limitations ● No (prior) RDF import ● No obvious way to choose wiki titles for URIs (Problem 1) ● RDF Export format not same as import format ... is hard-coded, using URIs such as: http://example.com/wiki/Special:URIResolver-3ASomeEntity (Problem 2)
  • 8. So, needed a new extension: RDFIO
  • 9. RDFIO Extension Short facts ● Google Summer of Code project 2010 ○ Mentor: Denny Vrandecic ● Dependencies: ○ Semantic MediaWiki ○ SMWWriter (Denny Vrandecic) ○ POM (Sergey Chernychev) ○ ARC2 RDF Lib (Benjamin Nowack) ● Current version: 0.5 ○ Needs bugfixing ○ Needs making SMW 1.6 compatible ● http://www.mediawiki.org/wiki/Extension:RDFIO
  • 11. Problem 1: Choosing wiki page titles for RDF entities
  • 13. <http://[...]/nmrshiftdb/?moleculeId=234> dc:title "warburganal"; chem:casnumber "62994-47-2"; nmr:moleculeId "234"; nmr:hasSpectrum <http://[...]/nmrshiftdb/?spectrumId=4735>; <http://[...]/nmrshiftdb/?spectrumId=4735> nmr:field "50"; nmr:hasPeak <http://[...]/nmrshiftdb/?s4735p0>, <http://[...]/nmrshiftdb/?s4735p1>, <http://[...]/nmrshiftdb/?s4735p2>, <http://[...]/nmrshiftdb/?s4735p3>; nmr:solvent "Chloroform-D1 (CDCl3)"; nmr:spectrumId "4735"; nmr:spectrumType "13C"; nmr:temperature "298". <http://[...]/nmrshiftdb/?s4735p1> nmr:hasShift 18.3; a nmr:peak. <http://[...]/nmrshiftdb/?s4735p2> nmr:hasShift 22.6; a nmr:peak. <http://[...]/nmrshiftdb/?s4735p3> nmr:hasShift 26.5; a nmr:peak.
  • 14. Problem: URIs not nice as titles
  • 15. So, how to choose the wiki page title then?
  • 16. Let's look what we have in the RDF, that might be used ...
  • 17. Wiki page title for the Molecule? <http://[...]/nmrshiftdb/?moleculeId=234> dc:title "warburganal"; chem:casnumber "62994-47-2"; nmr:moleculeId "234"; nmr:hasSpectrum <http://[...]spectrumId=4735>;
  • 18. Wiki page title for the Molecule? <http://[...]/nmrshiftdb/?moleculeId=234> dc:title "warburganal"; chem:casnumber "62994-47-2"; nmr:moleculeId "234"; nmr:hasSpectrum <http://[...]spectrumId=4735>;
  • 19. Wiki page title for the Spectrum? <http://[...]nmrshiftdb/?spectrumId=4735> nmr:field "50"; nmr:hasPeak <http://[...]nmrshiftdb/?s4735p0>, <http://[...]nmrshiftdb/?s4735p1>, <http://[...]nmrshiftdb/?s4735p2>, <http://[...]nmrshiftdb/?s4735p3>; nmr:solvent "Chloroform-D1 (CDCl3)"; nmr:spectrumId "4735"; nmr:spectrumType "13C"; nmr:temperature "298".
  • 20. Wiki page title for the Spectrum? <http://[...]nmrshiftdb/?spectrumId=4735> nmr:field "50"; nmr:hasPeak <http://[...]nmrshiftdb/?s4735p0>, <http://[...]nmrshiftdb/?s4735p1>, <http://[...]nmrshiftdb/?s4735p2>, <http://[...]nmrshiftdb/?s4735p3>; nmr:solvent "Chloroform-D1 (CDCl3)"; nmr:spectrumId "4735"; nmr:spectrumType "13C"; nmr:temperature "298".
  • 21. Wiki page title for the Spectrum? <http://pele.farmbio.uu.se/nmrshiftdb/?s4735p1> nmr:hasShift 18.3; a nmr:peak.
  • 22. Wiki page title for the Spectrum? <http://pele.farmbio.uu.se/nmrshiftdb/?s4735p1> nmr:hasShift 18.3; a nmr:peak.
  • 23. So ... ● So, we might find useful titles in: ○ Dublin Core fields (small vocabulary for cataloging) ○ Custom identifiers ○ (parts of) URIs ● So, we need a flexible configuration system such as: ○ A system-wide defaults in LocalSettings.php, AND ○ Interactive configuration in RDF Import form (See following slides ...)
  • 24. In LocalSettings.php ... /** * Used to generate "Pseudo-NameSpaces" */ $rdfiogBaseURIs = array( "http://example.org/someOntology#" => "ont1", "http://example.org/anotherOntology#" => "ont2" ); /** * "Properties" to be used as Wiki Title */ $rdfiogPropertiesToUseAsWikiTitle = array( 'http://semantic-mediawiki.org/swivt/1.0#page', 'http://www.w3.org/2000/01/rdf-schema#label', 'http://purl.org/dc/elements/1.1/title', 'http://www.w3.org/2004/02/skos/core#preferredLabel', 'http://xmlns.com/foaf/0.1/name', 'http://www.nmrshiftdb.org/onto#spectrumId' );
  • 25. ... and in RDF Import Screen (1/2)
  • 26. ... and in RDF Import Screen (2/2)
  • 27. Result: Problem 1 solved! Page get sensible titles (See screenshot on next slide)
  • 28.
  • 29. Problem 2: Export RDF in the format it was imported (... not the SMW hard-coded format)
  • 31. Result: Problem 2 solved! Can Query/Export in original RDF format
  • 32.
  • 33. So now we have solved the problems: ● Choosing Wiki Titles ● Export in same format as import ... so now we can go ahead and hook up SMW with an external tool ....
  • 34. Talking to SMW from Bioclipse scripts var wikiURL = "http://drugmet.rilspace.org/wiki/"; /* Editing SMW facts */ smw.addTriple("w:Caffeine", "w:is_a", "w:Molecule", wikiURL); smw.removeTriple("w:Caffeine", "w:is_a", "w:Molecule", wikiURL); /* Downloading RDF for local querying */ rdfStore = rdf.createInMemoryStore(); rdfStore = smw.getRDF( wikiURL ); result = rdf.sparql( rdfStore, "SELECT DISTINCT ?pred WHERE { ?subj ?pred ?obj } LIMIT 10" ) /* Show some output */ js.print( result );
  • 35. Talking to SMW from Bioclipse (contd.) ... js.print( result ); ---------------------------------------------------------------- [["p"], ["http://www.w3.org/1999/02/22-rdf-syntax-ns#type"], ["http://www.w3.org/2000/01/rdf-schema#domain"], ["http://www.w3.org/2000/01/rdf-schema#range"], ["http://www.w3.org/2000/01/rdf-schema#subPropertyOf"], ["http://www.w3.org/2000/01/rdf-schema#subClassOf"], ["http://semantic-mediawiki.org/swivt/1.0#wikiPageModificationDate"], ["http: //semantic-mediawiki.org/swivt/1.0#wikiNamespace"], ["http://www.w3. org/2000/01/rdf-schema#isDefinedBy"], ["http://semantic-mediawiki.org/swivt/1.0#page"], ["http://www.w3.org/2000/01/rdf-schema#label"] ]
  • 36. So ... the use case is now possible ...
  • 37. Some real world usage already
  • 38. Real world use 1: Visualization of Data {{#ask: [[Category:Measurements]] [[Has Study::{{PAGENAME}}]] [[Has Endpoint::PercentageNonViableCells]] | ?Has Endpoint Value | format=jqplotbar | sort=Has Endpoint Value | order=ascending | height=250 | width=600 }} http://chem-bla-ics.blogspot.com/2011/06/plotting-rdf-data-with-semantic-media.html
  • 39. Real world usage 2: Pulling data from R library(rrdf) endpoint = "http://127.0.0.1/mediawiki/index.php/Special: SPARQLEndpoint" query = paste("PREFIX w: ", "SELECT ?min ?max ?zeta WHERE ", "{ ?inst a w:Category-3AMetalOxides . ", " OPTIONAL { ?inst w:Property-3AHas_Size_Min ?min . }", " OPTIONAL { ?inst w:Property-3AHas_Size_Max ?max . }", " OPTIONAL { ?inst w:Property-3AHas_Zeta_potential ?zeta . }", "}" ); http://chem-bla-ics.blogspot.com/2011/06/importing-nanotoxicity-data-with-sparql.html
  • 40. Real world usage 2: Pulling data from R > data min max zeta [1,] 15 90 NA [2,] 15 90 NA [3,] 15 90 NA [4,] 15 90 NA [5,] 15 90 NA [6,] 15 90 NA [7,] 15 90 NA [8,] 15 90 NA [9,] 15 90 NA [10,] 15 90 NA [11,] 15 90 NA [12,] 15 90 NA [13,] 10 100 34.2 [14,] 30 60-17.3 [15,] 20 30 1.8 [16,] 15 90 NA http://chem-bla-ics.blogspot.com/2011/06/importing-nanotoxicity-data-with-sparql.html
  • 41. Outlook ● Proof of concept is there ● Things to fix before useful: ○ SMW 1.6 support ○ Some bugfixes ● Further wishlist: ○ Support more triple stores ○ Skip ARC2 PHP RDF library dependency? ○ SPARQL UPDATE instead of ARC2's SPARQL+ ○ "Output as Original URIs" also in RDF Batch export ○ Improve performance
  • 42. Thank you! RDFIO Extension: Google Summer of Code project in 2010 Mentor: Denny Vrandecic <http://simia.net/> Bioclipse + SMW integration: MSc Thesis project in Bioclipse group, Inst. of Pharmaceutical Biosciences, Uppsala University, Sweden. Supervisor: Egon Willighagen <http://chem-bla-ics.blogspot.com> Samuel Lampa <http://saml.rilspace.org>, SMWCon 23 Sep 2011