Modeling the Complexity of Music Metadata
in Semantic Graphs for Exploration and Discovery
ANR-14-CE24-0020
@pasqlisena
pasquale.lisena@eurecom.fr
Pasquale Lisena, Raphaël Troncy, Konstantin Todorov, Manel Achichi
Digital Libraries for Musicology (DLfM) Workshop
28th October 2017 | Shanghai Conservatory of Music
https://list.indiana.edu/sympa/arc/mla-l/2017-08/msg00248.html
2
Information contained in librarian knowledge
but not publicly available
Hard question for current
music models and ontologies
Different practical implications
(MIR, concert and radio programming, music recommendation)
3
4
Project Goals
• Improve music description to foster
music exchange and reuse
• Connect sources, multiply usage,
enrich user experience
• Music specific data model
• Vocabularies and data publicly available as
Linked Open Data
• Tools for visualization, interconnections,
recommendation
• Experience and praxis for other institutions
5
Source Datasets
(three groups of figures, one per source; the source names are not in the extracted text)
Source   Works                 Scores               Concerts        Discs
1        62 550 (XML)          9 154 (XML)          340 609 (XML)   9 500 (XML)
2        6 846 (UNIMARC)       30 319 (UNIMARC)     5 164 (XML)     8 602 (XML)
3        135 940 (INTERMARC)   89 184 (INTERMARC)   not listed      not listed
6
Source Datasets
DATASET
Works
Scores
Concerts
Discs
Classic work
Jazz improvisation
Ethnic/World/Traditional music
How to manage this
complex metadata?
7
State of the Art: MusicOntology
- One of the first examples of describing music using the Semantic Web
- Extends FRBR, the Timeline Ontology, and the Event Ontology
- Uses vocabularies for keys, musical instruments (by MusicBrainz), and genres (DBpedia)
8
Yves Raimond, Samer A. Abdallah, Mark B. Sandler, and Frederick Giasson. 2007. The Music Ontology. In 8th
International Conference on Music Information Retrieval (ISMIR). 417–422
The DOREMUS model
Diagram nodes: F15 Work, F22 Expression, F28 Expression Creation
- Music-specific extension of FRBRoo
- Triplet pattern: Work-Expression-Event
- Dynamic: every triplet is autonomous and linkable to the others
- Relies on Linked Data principles (everything is a URI, RDF model)
9
http://data.doremus.org/ontology
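The triplet pattern above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Triplet` class, the `mint` helper, and the `<kind>/<uuid>` URI layout are hypothetical and are not taken from the DOREMUS code base. Only the base URI and the property label "U54 is performed expression of" come from the slides.

```python
from dataclasses import dataclass, field
from uuid import uuid4

BASE = "http://data.doremus.org"  # base URI shown on the slides


def mint(kind):
    """Mint a fresh URI. The <kind>/<uuid> path layout is an assumption."""
    return f"{BASE}/{kind}/{uuid4()}"


@dataclass
class Triplet:
    """One autonomous Work-Expression-Event unit (hypothetical helper class)."""
    label: str
    work: str = field(default_factory=lambda: mint("work"))
    expression: str = field(default_factory=lambda: mint("expression"))
    event: str = field(default_factory=lambda: mint("event"))
    links: list = field(default_factory=list)  # (property label, target URI)

    def link(self, prop, other):
        """Connect this triplet's expression to another triplet's expression."""
        self.links.append((prop, other.expression))


# Two independent triplets: each one gets its own three URIs.
composed = Triplet("sonata as composed")
performed = Triplet("sonata as performed")

# Linking is a separate, optional step, so each triplet stays usable alone.
performed.link("U54 is performed expression of", composed)
```

Keeping the link as a separate call mirrors the "autonomous and linkable" point: a triplet is complete without any link, and links can be added later as new sources are converted.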
[Figure: example DOREMUS graph for a Beethoven cello sonata. The labels below are recovered from the diagram; the exact connections between nodes are not recoverable from the extracted text.]
Classes: F14 Work, F15 Complex Work, F19 Publication Work, F22 Expression, F24 Publication Expression, F28 Expression Creation, F30 Publication Event, E7 Activity, M2 Opus Statement, M6 Casting, M23 Casting Detail, M42 Performed Expression Creation, M43 Performed Expression, M44 Performed Work
Properties: R3 is realized in, R10 has member, P4 has time span, P7 took place at, P9 consists of, P14 carried out by, P98 born, P100 died, P102 has title, P165 incorporates, U2 foresees mop, U4 had princeps publication, U5 had premiere, U12 has genre, U17 has opus statement, U30 quantity, U31 had function of type, U38 has descriptive expression, U54 is performed expression of
Values:
- Title: “Sonate pour violoncelle et piano no 1”@fr; also “Sonates”, “Sonata in F”
- Opus statement: 5, 1
- Genre: Sonata; sonata@it, sonate@fr, klaviersonate@de
- Key: F Major; F Dur@de, Fa majeur@fr, Fa maggiore@it, Fa mayor@es
- Person: Ludwig van Beethoven (variant: Ludwig von Beethoven), born 1770, died 1827
- Function: composer; compositeur@fr, compositore@it
- Time spans and places, as grouped in the diagram: 1796 (expression creation); Berlin, 1796 (performed expression creation); Vienna, 1797 (publication event)
- Casting: Piano (Pianoforte@it, Fortepian@pl), quantity 1; Cello (Violoncello@it, Violoncelle@fr), quantity 1
11
Controlled Vocabularies
12
“Sax”@en
“Saxophone”@en
“Saxofone”@pt
“Sassofono”@it
“Saxophone”@fr
Alternate labels Alternate languages
<http://data.doremus.org/vocabulary/iaml/mop/wsa>
“English term is preferred globally”
Notes
“Woodwinds”@en
“Legni”@it
Hierarchy
“Baritone Saxophone”@en
• Disambiguation
• Search
• Graph-based analysis
APPLICATIONS
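A controlled vocabulary entry like the saxophone concept above can be modelled as a small record with labels per language, alternate labels, and a broader concept. The sketch below shows how such a record supports the "Disambiguation" and "Search" applications. The `Concept` class, the `build_index` and `lookup` helpers, and the woodwinds URI are hypothetical. Treating “Sax” as the alternate label is an assumption about how the slide's diagram should be read.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    uri: str
    pref_labels: dict                                # language -> preferred label
    alt_labels: list = field(default_factory=list)   # (label, language) pairs
    broader: "Concept | None" = None                 # parent in the hierarchy


woodwinds = Concept(
    uri="http://example.org/mop/woodwinds",  # placeholder URI, not from the slides
    pref_labels={"en": "Woodwinds", "it": "Legni"},
)
sax = Concept(
    uri="http://data.doremus.org/vocabulary/iaml/mop/wsa",  # URI from the slide
    pref_labels={"en": "Saxophone", "pt": "Saxofone",
                 "it": "Sassofono", "fr": "Saxophone"},
    alt_labels=[("Sax", "en")],
    broader=woodwinds,
)


def build_index(concepts):
    """Map every case-folded label, in any language, to its concept URI."""
    index = {}
    for concept in concepts:
        labels = list(concept.pref_labels.values())
        labels += [label for label, _lang in concept.alt_labels]
        for label in labels:
            index.setdefault(label.casefold(), concept.uri)
    return index


INDEX = build_index([sax, woodwinds])


def lookup(text):
    """Return the concept URI for a free-text label, or None if unknown."""
    return INDEX.get(text.strip().casefold())
```

With this index, "Sax", "sassofono", and "Saxophone" all resolve to the same URI, which is what lets records in different languages point at one shared node in the graph.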
Controlled Vocabularies
13
Genres: Diabolo (629), IAML (607), Itema3 (212), Redomi (313), RAMEAU (654)
Medium of performance: MIMO (2480), Itema3 (314), IAML (419), Diabolo (2117), RAMEAU (876), Redomi (179)
Musical keys: 29
Modes: 22
Catalogues: 151
Derivation types: 16
Functions: ~ 30 (coming soon)
http://data.doremus.org/vocabularies
Interlinking: Vocabularies
14
http://data.doremus.org/vocabulary/iaml/genre/cha
“cha-cha-cha”
http://data.doremus.org/vocabulary/diabolo/genre/cha_cha_cha
“cha cha cha”
http://yamplusplus.lirmm.fr/
=
String matching +
graph traversal
Interface for validating
the matching
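The string-matching half of this step can be illustrated with the standard library alone. This is not how YAM++ works internally: it is a stand-in that uses label normalization plus `difflib.SequenceMatcher`, and it leaves out the graph-traversal part entirely. The function names and the 0.9 threshold are assumptions.

```python
import re
import unicodedata
from difflib import SequenceMatcher


def normalize(label):
    """Lower-case, strip accents, and collapse punctuation to single spaces."""
    text = unicodedata.normalize("NFKD", label).casefold()
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    return re.sub(r"[^a-z0-9]+", " ", text).strip()


def similarity(a, b):
    """Similarity in [0, 1] between two labels after normalization."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def match(source, target, threshold=0.9):
    """Return (source URI, target URI, score) for label pairs above threshold."""
    candidates = []
    for s_uri, s_label in source.items():
        for t_uri, t_label in target.items():
            score = similarity(s_label, t_label)
            if score >= threshold:
                candidates.append((s_uri, t_uri, score))
    return candidates


# The two concepts shown on the slide.
iaml = {"http://data.doremus.org/vocabulary/iaml/genre/cha": "cha-cha-cha"}
diabolo = {
    "http://data.doremus.org/vocabulary/diabolo/genre/cha_cha_cha": "cha cha cha"
}

candidates = match(iaml, diabolo)
```

The candidate list is only a proposal. As the slide notes, the matches still go through an interface where they are validated.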
001 FRBNF139081882FR
100 $313891295$w.0..b.....$aBeethoven$mLudwig van$d1770-1827
144 $w....b.fre.$aSonates$bPiano$pOp. 27, no 2$tDo dièse mineur
LANG TITLE MOP OPUS KEY
“MARC must die” -- Roy Tennant, 2002
http://lj.libraryjournal.com/2002/10/ljarchives/marc-must-die/#_
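To make the record above easier to read, here is a minimal sketch that splits field 144 into its `$`-prefixed subfields. The mapping from subfield codes to the slide's annotations (LANG, TITLE, MOP, OPUS, KEY) is an assumption based on the order in which they appear, and `parse_field` is a hypothetical helper, not part of marc2rdf.

```python
import re

# Assumed reading of the slide's annotations, in order of appearance.
LABELS = {"w": "LANG", "a": "TITLE", "b": "MOP", "p": "OPUS", "t": "KEY"}


def parse_field(line):
    """Split 'TAG $xvalue$yvalue...' into the tag and (code, value) pairs."""
    tag, _, body = line.partition(" ")
    chunks = body.split("$")[1:]  # text before the first '$' is empty
    subfields = [(chunk[0], chunk[1:]) for chunk in chunks if chunk]
    return tag, subfields


line = "144 $w....b.fre.$aSonates$bPiano$pOp. 27, no 2$tDo dièse mineur"
tag, subfields = parse_field(line)

described = {LABELS[code]: value for code, value in subfields if code in LABELS}

# The language sits inside the positional $w subfield; pick the first
# run of three letters (an assumption about this particular layout).
language = re.search(r"[a-z]{3}", described["LANG"])
described["LANG"] = language.group(0) if language else None
```

Even in this small case the language code has to be dug out of a positional string, which hints at why conversion needs field-specific rules rather than a generic parser.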
MARC issues
16
• Different variants
UNIMARC, INTERMARC
• Free text field
different practices in describing the same information
“Op. 27 n. 2” - “Op. 27 no 2”
• Frequent mistakes in editorial work
wrong fields, typos, wrong punctuation
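The free-text problem can be made concrete with the opus example above. The sketch below normalizes a few spellings of the same opus statement into one (number, sub-number) pair. The regular expression is an illustrative assumption and covers only the variants shown here, not the full range of practices found in real catalogues.

```python
import re

# "Op." followed by a number, then optionally "n.", "no", "no.", "nr" or "n°"
# and a sub-number, with an optional comma in between.
OPUS = re.compile(
    r"\bop\.?\s*(\d+)(?:\s*,?\s*n(?:o|r|°)?\.?\s*(\d+))?",
    re.IGNORECASE,
)


def parse_opus(text):
    """Return (opus number, sub-number or None), or None if nothing matches."""
    found = OPUS.search(text)
    if found is None:
        return None
    number = int(found.group(1))
    sub = int(found.group(2)) if found.group(2) else None
    return number, sub


def format_opus(parsed):
    """Render a parsed opus statement in one canonical spelling."""
    number, sub = parsed
    return f"Op. {number}" if sub is None else f"Op. {number} no. {sub}"


variants = ["Op. 27 n. 2", "Op. 27 no 2", "Op. 27, no 2"]
normalized = {format_opus(parse_opus(v)) for v in variants}
```

All three variants collapse to a single canonical string, so two records that describe the same opus no longer look different just because of punctuation.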
Data conversion
marc2rdf
expert-made
mapping rules
17
controlled
vocabularies
https://github.com/DOREMUS-ANR/marc2rdf/
• Field parsing and mapping
• NLP techniques
• Graph generation
• String2URI
TASKS
Interlinking: Works
18
http://data.doremus.org/expression/d72301f0-0aba-3ba6-93e5-c4efbee9c6ea
“Sonata quasi una fantasia”
http://data.doremus.org/expression/22679001-2cd0-3f84-b502-0f337429966f
“Quasi una fantasia”
https://github.com/DOREMUS-ANR/legato
=
Legato F-measure > 0.85
Precision > 0.87
Recall > 0.82
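For readers less familiar with these three measures, the sketch below computes them from a set of proposed links and a reference set. The link pairs are toy data invented for the example and say nothing about how Legato was actually evaluated.

```python
def evaluate(predicted, reference):
    """Return (precision, recall, F-measure) for two sets of links."""
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Toy data: four correct links exist, three were proposed, two are right.
reference = {("a1", "b1"), ("a2", "b2"), ("a3", "b3"), ("a4", "b4")}
predicted = {("a1", "b1"), ("a2", "b2"), ("a3", "b9")}

precision, recall, f_measure = evaluate(predicted, reference)
```

Precision penalizes wrong links, recall penalizes missed ones, and the F-measure is their harmonic mean, so it only stays high when both are high.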
Interlinking: Works
19
1. Data cleaning
removing “noisy” properties, e.g. identifiers, comments, …
2. Instance profiling
represent each resource as a sub-graph
3. Instance indexing and matching
convert the sub-graph into a set of keywords in order to
apply text document matching techniques
4. Post-processing
clustering of the datasets, identifying false positives from the
previous steps
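The four steps above can be sketched end to end on toy data. This is not Legato's implementation: the property names, the records, and the threshold are hypothetical, Jaccard similarity over keyword sets stands in for the text-matching technique, and post-processing is reduced to keeping the best candidate above a threshold.

```python
import re

NOISY = {"identifier", "comment"}  # hypothetical names of "noisy" properties


def clean(resource):
    """Step 1: drop properties that carry no descriptive signal."""
    return {prop: vals for prop, vals in resource.items() if prop not in NOISY}


def profile(resource):
    """Steps 2-3: flatten the resource's values into a set of keywords."""
    keywords = set()
    for values in resource.values():
        for value in values:
            keywords.update(re.findall(r"\w+", value.casefold()))
    return keywords


def jaccard(a, b):
    """Share of keywords the two sets have in common."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match(left, right, threshold=0.5):
    """Step 4 (simplified): keep the best candidate if it is good enough."""
    links = []
    for l_uri, l_res in left.items():
        l_kw = profile(clean(l_res))
        scored = [(jaccard(l_kw, profile(clean(r_res))), r_uri)
                  for r_uri, r_res in right.items()]
        if not scored:
            continue
        score, r_uri = max(scored)
        if score >= threshold:
            links.append((l_uri, r_uri, score))
    return links


# Hypothetical records loosely modelled on the slide's example titles.
left = {"ex:a": {"title": ["Sonata quasi una fantasia"],
                 "composer": ["Ludwig van Beethoven"],
                 "identifier": ["id-123"]}}
right = {"ex:b": {"title": ["Quasi una fantasia"],
                  "composer": ["Beethoven, Ludwig van"],
                  "comment": ["imported record"]},
         "ex:c": {"title": ["Sonata in F"],
                  "composer": ["Ludwig van Beethoven"]}}

links = match(left, right)
```

Dropping the identifier and comment before profiling matters here: they differ between sources by construction, so keeping them would only lower the similarity of records that describe the same work.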
Visualizing
20
http://overture.doremus.org
Prototype of web app that
uses the DOREMUS dataset
• Follow the links
like in the graph
• Enriched experience
DBpedia, GeoNames, …
• Timeline of related events
• Similar works
recommendation
Future Work
21
• Pivot vocabularies of genres and MoPs
as a result of the interconnection task
• Recommendation system
first step: “Combining Music Specific Embeddings for
Computing Artist Similarity” @ISMIR2017
• Schema.org injection in all pages
goals: SEO, simplification of the data in
order to extend its usage
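As a rough idea of what Schema.org injection could look like, the sketch below turns a simplified work description into a JSON-LD `<script>` tag. It uses the Schema.org type `MusicComposition` with its `name`, `composer`, and `musicalKey` properties. The input dictionary, its field names, and the example URI are hypothetical; the values are taken from the Beethoven example earlier in the deck.

```python
import json


def to_jsonld(work):
    """Map a simplified work description to a Schema.org MusicComposition."""
    return {
        "@context": "https://schema.org",
        "@type": "MusicComposition",
        "@id": work["uri"],
        "name": work["title"],
        "composer": {"@type": "Person", "name": work["composer"]},
        "musicalKey": work["key"],
    }


def script_tag(work):
    """Wrap the JSON-LD in the script element that crawlers look for."""
    payload = json.dumps(to_jsonld(work), ensure_ascii=False)
    return f'<script type="application/ld+json">{payload}</script>'


work = {
    "uri": "http://example.org/expression/1",  # placeholder, not a real URI
    "title": "Sonate pour violoncelle et piano no 1",
    "composer": "Ludwig van Beethoven",
    "key": "F Major",
}

tag = script_tag(work)
```

The flat structure is the point: it deliberately drops most of the graph's detail in exchange for a description that generic web tools can consume.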
22
But what about this?
23
results
This and more questions:
https://github.com/DOREMUS-ANR/knowledge-base/tree/master/query-examples
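Questions like these are answered by sending SPARQL queries to the public endpoint. The sketch below only builds the request URL, following the standard SPARQL protocol `query` parameter, and does not contact the server. The query itself is an illustrative assumption: the two namespace URIs and the names `efrbroo:F22_Self-Contained_Expression` and `ecrm:P102_has_title` are my best understanding of the vocabulary in use and should be checked against the repository's query examples.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

ENDPOINT = "https://data.doremus.org/sparql"  # listed on the Links slide

# Illustrative query; prefixes and term names are assumptions (see above).
QUERY = """
PREFIX ecrm: <http://erlangen-crm.org/current/>
PREFIX efrbroo: <http://erlangen-crm.org/efrbroo/>
SELECT ?expression ?title WHERE {
  ?expression a efrbroo:F22_Self-Contained_Expression ;
              ecrm:P102_has_title ?title .
} LIMIT 10
"""


def build_url(endpoint, query):
    """Encode the query as a GET request per the SPARQL protocol."""
    return f"{endpoint}?{urlencode({'query': query})}"


url = build_url(ENDPOINT, QUERY)
```

To run the query, the URL can be fetched with any HTTP client, typically with an `Accept: application/sparql-results+json` header so that the results come back as JSON.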
Links
DOREMUS Website: http://www.doremus.org/
GitHub page (tools, converters, ontologies, ...): https://github.com/DOREMUS-ANR/
Dataset & SPARQL Endpoint: https://data.doremus.org/sparql and https://data.doremus.org/fct
OVERTURE: https://overture.doremus.org/
This presentation: https://www.slideshare.net/squalelis
24