Integrating applications & projects
                            =
    Dynamic & repeatable transformation
   of existing Thesauri and Authority lists
                 into SKOS
                            +
 Cross-tabulation of Concepts Linked Data
   Presentation to the Linked Data Meeting
     University College London, September 14th 2010
by Christophe Dupriez, Destin SSEB, dupriez@destin.be
        working for Belgium Poison Centre
     rue Bruyn 1, B-1120 Brussels (Belgium)
The main request from users:
Whenever a concept is mentioned,
show concise visual clues about:
 • Where does it come from? (e.g. [icon]: substances)
 • Where else is it mentioned?
   (e.g. [icon]: in MDs' Wiki + paper files)
 • In which role?
   (e.g. is this substance a problem or a cure?)
 • About how many times is it mentioned?
   (e.g. [icons] = 893 bibliographic records, 203 products,
   2180 calls, 65 medical reports)
+ a single click to access any of those when desired.
Use Case: Current Awareness
From a list of subjects and
document types (e.g. reviews, case reports…):
 • filter remote sources (e.g. PubMed)
 • help index new records with our vocabularies.
We must manage equivalences
between our thesauri and remote ones (e.g. MeSH)
(browse, display, validate, update...)
[Diagram: "Our Concept & NTs..." linked to "MeSH Heading & NTs..."
 by the mapping relations EQ, EQ~, BTM and NTM]
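The mapping codes on this slide could be expressed with the SKOS mapping properties. The sketch below is only an illustration: the correspondence EQ = skos:exactMatch, EQ~ = skos:closeMatch, BTM = skos:broadMatch, NTM = skos:narrowMatch is an assumption (the slides list the codes without defining them), and the class name and concept URIs are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: translates thesaurus mapping codes into SKOS mapping properties. */
final class MappingCodes {
    private static final Map<String, String> TO_SKOS = new LinkedHashMap<>();
    static {
        // Assumed correspondence; the slides only list the codes.
        TO_SKOS.put("EQ", "skos:exactMatch");
        TO_SKOS.put("EQ~", "skos:closeMatch");
        TO_SKOS.put("BTM", "skos:broadMatch");
        TO_SKOS.put("NTM", "skos:narrowMatch");
    }

    /** Returns the SKOS property for a mapping code, or throws if the code is unknown. */
    static String toSkos(String code) {
        String property = TO_SKOS.get(code);
        if (property == null) {
            throw new IllegalArgumentException("Unknown mapping code: " + code);
        }
        return property;
    }

    /** Renders one mapping as a Turtle-like triple (prefix declarations omitted). */
    static String triple(String ourConcept, String code, String remoteConcept) {
        return "<" + ourConcept + "> " + toSkos(code) + " <" + remoteConcept + "> .";
    }

    public static void main(String[] args) {
        // Both URIs are invented for illustration.
        System.out.println(triple("http://example.org/substances/42", "EQ~",
                "http://example.org/mesh/D000001"));
    }
}
```

Keeping the codes in one table means a browse or validation screen can show the short code while the RDF export uses the SKOS property.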
Use Case: Emergencies
[Diagram: words → Concepts → Data, with Feedback]
1)   Powerful word search
2)   Information to discriminate between Concepts
3)   Identified Concepts = Clues to gather Linked Data
4)   Managing sets of data for different Clues
5)   Browsing Data to discriminate between Hypotheses
      Support MDs to build their recommendations;
          Analysis of Events for Toxico-Vigilance
Benefits of integrating SKOS
  (terminology and concepts management)
in all applications of users' workbench
• Multilingual data and user interface.
• Exhaustivity: searches retrieve specifics, synonyms,
  translations and equivalent concepts in other “aligned”
  thesauri.
• Precision: precise results for a given concept.
• Strongly validated updates; data entry helped by
  auto-complete.
• Better metadata model, easier to maintain
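The "exhaustivity" benefit can be pictured as query expansion: a search on one concept also collects the labels of its narrower concepts. This is a minimal sketch, not ASKOSI code; the class name, the map-based data model and the sample concepts are all assumptions, and equivalent concepts in aligned thesauri are left out for brevity.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: expands a concept into every label a search should match. */
final class QueryExpansion {
    /**
     * Collects the labels (synonyms, translations) of a concept and of all
     * its narrower concepts, following the hierarchy transitively.
     */
    static Set<String> expand(String conceptId,
                              Map<String, List<String>> narrower,
                              Map<String, List<String>> labels) {
        Set<String> visited = new LinkedHashSet<>();
        Set<String> terms = new LinkedHashSet<>();
        Deque<String> todo = new ArrayDeque<>();
        todo.push(conceptId);
        while (!todo.isEmpty()) {
            String current = todo.pop();
            if (!visited.add(current)) {
                continue; // already handled: protects against hierarchy loops
            }
            terms.addAll(labels.getOrDefault(current, List.of()));
            for (String child : narrower.getOrDefault(current, List.of())) {
                todo.push(child);
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        // Invented sample data: one concept with one narrower concept.
        Map<String, List<String>> narrower =
                Map.of("c:analgesics", List.of("c:paracetamol"));
        Map<String, List<String>> labels = Map.of(
                "c:analgesics", List.of("analgesics", "pijnstillers"),
                "c:paracetamol", List.of("paracetamol", "acetaminophen"));
        System.out.println(expand("c:analgesics", narrower, labels));
    }
}
```

The resulting set of terms could then be passed to the search engine as alternatives for a single query.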
Benefits of integrating
             Concepts' Usages information
        in all applications of users' workbench
• Concept references enriched
    with statistics and links to places where they are also used.
• Promotes direct linking from a concept to its usages within
  applications.
• Promotes homogeneous display and functionalities to create,
  display, update, link and unlink concepts to application
  elements.
• Usage statistics (and search link) near each mention of a
  concept (passage from one application to another)
• Better metadata model, easier to maintain
1. BIBL application: Articles about Human Toxicology
Internal Thesauri (Subject Vocabularies):
  1) Substances
  2) Living beings (plants, animals, mushrooms...)
  3) Symptoms, Treatments
  4) Places
External Thesauri and Vocabularies:
  1) MeSH
  2) NCBI Taxonomy, SP2000 Catalogue of Life
  3) CAS/EINECS (REACH, ChemID+)
2. WIKI application (“SAQ”)
Advice from MDs to others about how to manage
situations linked to the different concepts of the internal
thesauri.
3. CASES application
Data about calls received and cases reviewed.
Internal Thesauri already mentioned.
4. PROD application: Mixtures sold on the Belgian (BE) market
Internal Thesaurus: Substances
External Thesauri and Vocabularies:
(development to be undertaken by a network of Poison Centres)
  1) CAS/EINECS (REACH, ChemID+)
  2) Product Usage Categories
5. CONTACT application
Topic specialists and Products' Manufacturers/Distributors
Internal Thesauri:
  1) Subject thesauri already mentioned
  2) Places
6. ASKOSI: Thesauri-based Applications' Manager
Integration under the ASKOSI umbrella remains to be done for
applications 3, 4 and 5 above.
ASKOSI.org is an open project to create Java tools to
integrate the benefits of terminology / concept
usages management within applications. It is:
  1. A Java Archive (JAR) providing an API aligned
   on SKOS conceptual organisation to access:
   1. Local or remote vocabularies / thesauri
    (being SKOS or not)
   2. Application data linked to Concepts;
  2. A Web Application to browse gathered
   thesauri (and to manage their interrelations)
Integrating ASKOSI with applications
[Architecture diagram, components listed from top to bottom]
 • Users: Internet Browser, connected by HTTP
 • External Applications: SKOS RDF or XML + Usage Statistics, connected by HTTP
 • Apache Tomcat J2EE
 • Shared ASKOSI.JAR (SKOS Schemes Accesses
   + Usage statistics by applications)
 • Web applications in Tomcat:
   ◦ ASKOSI Web Application
   ◦ Previous Queries and Data Navigator
   ◦ DSpace
   ◦ Apache JspWiki
 • JDBC + SQL
 • Search Engine: Apache SolR + Lucene adapted to our needs
 • Data: XML, CSV or RDF files; SQL Database
 • Legend: Our Java developments / Java Open Source software that we
   adapt to our needs / Open Source Software
The ASKOSI JAR
• API aligned on W3C SKOS data structure
   = JavaBeans in-memory data structure
   = XML Structure http://www.askosi.org/ConceptScheme.xsd
 ◦ ISO 25964 will also be considered.
• SQL, CSV, RDF and XML data sources
• Accesses can be Dynamic or Static (periodic reload)
  to import the data sources with SKOS goggles
• Usage statistics: which applications are using which
  SKOS concepts, how (roles) and how many times?
• Designed for data sharing: all applications in the
  same Web Application Container (J2EE) access a
  single copy of the data.
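The usage statistics described above ("which applications are using which SKOS concepts, how and how many times") could be held in a structure like the following. This is a hedged sketch, not the ASKOSI API: the class and method names are hypothetical. The counts reuse the figures quoted on the earlier slide, but their attribution to a given application and role is invented.

```java
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of per-concept usage counts, keyed by "application/role". */
final class ConceptUsage {
    // concept id -> ("application/role" -> number of mentions)
    private final Map<String, Map<String, Integer>> counts = new TreeMap<>();

    /** Adds mentions of a concept by one application in one role. */
    void record(String conceptId, String application, String role, int times) {
        counts.computeIfAbsent(conceptId, id -> new TreeMap<>())
              .merge(application + "/" + role, times, Integer::sum);
    }

    /** Read-only view of the usages of one concept (empty if unknown). */
    Map<String, Integer> usagesOf(String conceptId) {
        return Collections.unmodifiableMap(
                counts.getOrDefault(conceptId, Collections.emptyMap()));
    }

    /** Total number of mentions of a concept across all applications and roles. */
    int totalMentions(String conceptId) {
        int total = 0;
        for (int n : usagesOf(conceptId).values()) {
            total += n;
        }
        return total;
    }

    public static void main(String[] args) {
        ConceptUsage usage = new ConceptUsage();
        // Counts from the slide; application and role names are assumptions.
        usage.record("c:paracetamol", "BIBL", "subject", 893);
        usage.record("c:paracetamol", "PROD", "ingredient", 203);
        usage.record("c:paracetamol", "CASES", "call", 2180);
        usage.record("c:paracetamol", "CASES", "report", 65);
        System.out.println(usage.usagesOf("c:paracetamol"));
        System.out.println(usage.totalMentions("c:paracetamol"));
    }
}
```

A single shared instance of such a structure would match the "single copy of the data" design, provided access is synchronized when several web applications update it.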
Remote Sources → ASKOSI
• Big thesauri: periodic editions of UMLS,
  Agrovoc, Catalogue of Life (CoL), etc.:
1.   Parameterize ASKOSI for a static SQL source
2.   Load a local MySQL database with
     the new edition of UMLS/Agrovoc/CoL/...
3.   Reload corresponding schemes
• SKOS/RDF/XML Remote Web Services:
1.   Parameterize ASKOSI for load from
     a remote URL + XSLT transformation
2.   (Auto)reload of corresponding schemes
     (“one concept at a time” must be developed)
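The "remote URL + XSLT transformation" step can be sketched with the standard JAXP API (`javax.xml.transform`). This is an illustration under assumptions, not ASKOSI's loader: the class name, the input format (`<terms>/<term>`) and the stylesheet are invented, and both documents are inline strings so that the example runs without network access.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical sketch: turns a non-SKOS XML vocabulary into SKOS RDF/XML with XSLT. */
final class XsltToSkos {
    // Invented input format standing in for a remote service response.
    static final String SAMPLE_XML =
            "<terms><term id='urn:example:1'>aspirin</term></terms>";

    // Invented stylesheet mapping each <term> to a skos:Concept.
    static final String SAMPLE_XSLT =
            "<xsl:stylesheet version='1.0'"
          + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
          + " xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'"
          + " xmlns:skos='http://www.w3.org/2004/02/skos/core#'>"
          + "<xsl:template match='/terms'>"
          + "<rdf:RDF><xsl:apply-templates select='term'/></rdf:RDF>"
          + "</xsl:template>"
          + "<xsl:template match='term'>"
          + "<skos:Concept rdf:about='{@id}'>"
          + "<skos:prefLabel xml:lang='en'><xsl:value-of select='.'/></skos:prefLabel>"
          + "</skos:Concept>"
          + "</xsl:template>"
          + "</xsl:stylesheet>";

    /** Applies the stylesheet to the source document and returns the result as text. */
    static String transform(String sourceXml, String stylesheet) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(stylesheet)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(sourceXml)),
                    new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(SAMPLE_XML, SAMPLE_XSLT));
    }
}
```

For a real remote source, the input could be given as `new StreamSource(url)`, where `url` is the address of the service; a periodic reload would then simply re-run the transformation.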
Internal Sources → ASKOSI
• Local Authority lists:
   Parameterize ASKOSI for a dynamic SQL
     source: ASKOSI always gets up-to-date data.
• Legacy applications:
1.    Parameterize ASKOSI for XML file/URL load
      + XSLT transformation if necessary
2.    Regularly generate the XML file
      with local usage data
3.    (Auto)reload of corresponding schemes
• Little lists or small thesauri:
1.    Parameterize ASKOSI for Excel CSV source.
2.    (Auto)reload of corresponding schemes
Parameters to “SKOSify” the SQL Data Source for the WindMusic Thesaurus
type=SQL
pool=wind
title-en=Keywords
title-fr=Mots-clés
title-de=Stichwörtern
title-es=Palabras claves
title-nl=Trefwoord
title.lorthes-en=Keywords
url=jdbc:postgresql://dbserver:5432/dspace
driver=org.postgresql.Driver
username = dspace
password = xxxxxxxxx
validation=SELECT 1 #Oracle: SELECT 1 FROM DUAL
IDdc=select … as key, metadata_field_id as value
     from metadatafieldregistry;
IDhandle=select … as key, resource_id as value from handle;
…
display-en=http:/dspace/handle/68502/[about]
icon-en=/dspace/image/68502/27.gif
create-en=http:/dspace/submit?post=yes&collection={IDhandle@27}&step=0
…
notation.lorthes=SELECT h.handle AS about, i.text_value AS notation
             from item as m, handle as h, metadatavalue as i
             where i.metadata_field_id={IDdc@identifier.loris}
                    … and m.owning_collection={IDhandle@27}
labels=SELECT h.handle AS about, t.text_value AS label, t.text_lang AS lang
          from item as m, handle as h, metadatavalue as t
          where h.resource_type_id=2
                … and t.metadata_field_id={IDdc@title}
               and m.owning_collection={IDhandle@27}
…alternates…broaders…broadmatches…notes…
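The `{IDdc@title}` and `{IDhandle@27}` placeholders appear to refer to the key/value tables loaded by the `IDdc=` and `IDhandle=` queries. The sketch below shows one way such a substitution could work. The reading of `{table@key}`, the class name and the sample values are all assumptions; the slides do not document the actual ASKOSI mechanism.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: replaces {table@key} placeholders with looked-up values. */
final class PlaceholderResolver {
    // Matches {TableName@key}; the key may contain dots, as in identifier.loris.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\{(\\w+)@([^}]+)\\}");

    /**
     * Resolves every placeholder in the template against the given lookup
     * tables (table name -> key -> value); fails on an unknown table or key.
     */
    static String resolve(String template, Map<String, Map<String, String>> tables) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            Map<String, String> table = tables.get(m.group(1));
            String value = (table == null) ? null : table.get(m.group(2));
            if (value == null) {
                throw new IllegalArgumentException("Unresolved placeholder: " + m.group());
            }
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Invented lookup values standing in for the IDdc / IDhandle query results.
        Map<String, Map<String, String>> tables = Map.of(
                "IDdc", Map.of("title", "64"),
                "IDhandle", Map.of("27", "5"));
        System.out.println(resolve(
                "t.metadata_field_id={IDdc@title} and m.owning_collection={IDhandle@27}",
                tables));
    }
}
```

Resolving the identifiers once, when the scheme is loaded, keeps the SQL in the parameter file readable and independent of numeric database keys.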
The ASKOSI Web Application
• Authority lists browsing:
  ◦ Thesauri trees
  ◦ Alphabetical lists
  ◦ Decreasing Usage Frequency
  ◦ Powerful word and string search tool
• SKOS Concepts display in different formats / extents
  ◦ Generation of SKOS RDF
• Validations:
  ◦ Data errors
  ◦ Terminology validations (ambiguity, missing translation)
  ◦ Hierarchy validations (loops, siblings)
• Links to applications using the SKOS concepts
    In development: search history manager, change approval
    workflow, cross-thesauri equivalence relations management.
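The hierarchy validation for loops can be illustrated as follows: a concept is flagged when it can reach itself by following broader links. This is a minimal sketch with a hypothetical class name and invented sample data; it does not reproduce the checks of the ASKOSI Web Application.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: finds concepts involved in loops of "broader" links. */
final class HierarchyValidator {
    /** Returns, in sorted order, every concept that is its own (indirect) broader concept. */
    static List<String> conceptsInLoops(Map<String, List<String>> broader) {
        List<String> looping = new ArrayList<>();
        for (String start : broader.keySet()) {
            if (reaches(start, start, broader, new HashSet<>())) {
                looping.add(start);
            }
        }
        Collections.sort(looping);
        return looping;
    }

    /** Depth-first search upward from 'from', looking for 'target'. */
    private static boolean reaches(String from, String target,
                                   Map<String, List<String>> broader, Set<String> seen) {
        for (String parent : broader.getOrDefault(from, List.of())) {
            if (parent.equals(target)) {
                return true;
            }
            // seen.add is false for concepts already explored, which stops endless recursion
            if (seen.add(parent) && reaches(parent, target, broader, seen)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Invented data: a -> b -> c -> a is a loop; d only points into it.
        Map<String, List<String>> broader = Map.of(
                "a", List.of("b"), "b", List.of("c"),
                "c", List.of("a"), "d", List.of("a"));
        System.out.println(conceptsInLoops(broader));
    }
}
```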
Open Questions to the Community
•   Users navigating the “Linked Data” Web need concise
    visual clues to decide what to do next, knowing what
    is behind each possible click:
    How could we standardize a system of visual
    symbols, the road signs of the Linked Data Web
    and its SKOS roundabouts? (proposals next page)
•   What is behind each concept? What is linked?
    How could we standardize different harvesting
    mechanisms for Concepts Usage Data?
    ◦ “Push” vs “Pull” mode
    ◦ Local vs Remote
    ◦ Absolute vs Incremental
    ◦ Results varying with user authorisations or preferences
•   Linking with URIs allows user-side or server-side integration of
    applications.
    But between users and applications? Within an application?
Standardising Symbols?
Your opinion about ConceptScheme symbols below?
[Images of the proposed symbols]
Detailed proposals available at:
http://www.destin.be/ASKOSI/Wiki.jsp?page=Icons%20for%20SKOS
Call to Collaborations!
1.   We want to integrate a “voting system” for reviewing SKOS / RDF
     statement contributions, including mappings between thesauri.
     Students welcome!
     Full proposed specs on: http://www.askosi.org/maintenance.pdf
2.   Where could we discuss “Roadsigns for the Linked Data Web”?
3.   Where could we discuss “Concepts Usage Data Harvesting”?
4.   Where could we discuss “Concept References encoding (indexing
     chains) within Applications”?

                                         christophe.dupriez@destin.be
                                  christophe.dupriez@poisoncentre.be
