DOREMUS - a Graph of Interlinked Musical Work
Pasquale Lisena
EURECOM, France
@pasqLisena
M. Achichi, P. Lisena, K. Todorov, R. Troncy, J. Delahousse
2
● Which works did Mozart compose before he was 10?
● How many works were composed and first performed in the same city?
● Which composers had the chance to conduct their own work in a performance during the last decade?
3
● Music knowledge graph: metadata about artists, works, performances, scores
● Tools for converting and interlinking: open-source, reusable, used for building the knowledge graph
4
Music is complex
5
POP VS CLASSICAL
“For pop music the experience of the music is really defined by the recording.”
“When it comes to classical music, on the other hand, it's much more about the composition itself, because even though the interpretation can vary in various subtle ways.”
M. Lasar (2011). Digging into Pandora’s Music Genome with musicologist Nolan Gasser.
https://arstechnica.com/tech-policy/2011/01/digging-into-pandoras-music-genome-with-musicologist-nolan-gasser/
6
POP VS CLASSICAL
● Track-based vs. work-based
● 60 years of history vs. a thousand years, from Gregorian chant to a work written last Tuesday
● Songs vs. multi-movement works
● Major, minor vs. polyphonic, homophonic, monophonic
7
8
Music archives have
very detailed knowledge
PROBLEMS
● Multiple formats
● No possible interoperability
● Need for discovering overlapping knowledge
● Information codified as free text
● Not always publicly accessible
APPROACH
Semantic Web!
9
● Improve music description to foster music exchange and reuse
● Travel to the heart of the musical archives in France’s greatest institutions
● Connect sources, multiply usage, enrich user experience
10
Building the DOREMUS graph
● DATA MODELING
● DATA CONVERSION: marc2rdf, string2vocabulary, ...custom converters
● DATA LINKING: legato
● LINK VALIDATION
11
The DOREMUS Model
- Music-specific extension of FRBRoo
- Dynamic: it is made up of autonomous combined modules
- Relies on Linked Data principles (everything is a URI, RDF model)
FRBRoo combines FRBR (bibliographic records) with CIDOC-CRM (museum information)
Choffé, Pierre, and Françoise Leresche. DOREMUS: connecting sources, enriching catalogues and user experience. In 24th IFLA World Library and Information Congress. 2016.
12
The building blocks
Work-Expression-Event
● F28 Expression Creation -> R17 created -> F22 Expression
● F28 Expression Creation -> R19 created a realization of -> F14 Work
● F14 Work -> R3 is realized in -> F22 Expression
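The Work-Expression-Event pattern can be sketched as a handful of (subject, property, object) triples. This is only an illustration in plain Python: the node identifiers `creation`, `work` and `expression` and the helper functions are hypothetical, and a real graph would use RDF URIs rather than strings.

```python
# Triples from the slide, written as (subject, property, object) tuples.
# The node names "creation", "work" and "expression" are hypothetical.
TRIPLES = [
    ("creation", "rdf:type", "F28 Expression Creation"),
    ("work", "rdf:type", "F14 Work"),
    ("expression", "rdf:type", "F22 Expression"),
    ("creation", "R17 created", "expression"),
    ("creation", "R19 created a realization of", "work"),
    ("work", "R3 is realized in", "expression"),
]


def objects(subject, prop, triples=TRIPLES):
    """Return every object linked to `subject` through `prop`."""
    return [o for s, p, o in triples if s == subject and p == prop]


def expressions_created_for(work, triples=TRIPLES):
    """Walk work <- R19 - creation - R17 -> expression."""
    creations = [
        s for s, p, o in triples
        if p == "R19 created a realization of" and o == work
    ]
    return [e for c in creations for e in objects(c, "R17 created", triples)]


print(expressions_created_for("work"))
```

The event node sits between work and expression, which is what lets dates, places and people be attached to the act of composing rather than to the work itself.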
13
Example graph: Beethoven's cello sonata no. 1 (diagram; labels recovered from the figure)
● F28 Expression Creation: R17 created F22 Expression; R19 created a realization of F14 Work; P4 has time span 1796; P9 consists of E7 Activity
● E7 Activity: P14 carried out by Ludwig van Beethoven (also "Ludwig von Beethoven"; P98 born 1770, P100 died 1827); U31 had function of type composer (compositeur@fr, compositore@it)
● F14 Work: R3 is realized in F22 Expression
● F22 Expression: P102 has title “Sonate pour violoncelle et piano no 1”@fr, “Sonates”, “Sonata in F”; U17 has opus statement M2 Opus Statement (5, 1); U12 has genre Sonata (sonata@it, sonate@fr, klaviersonate@de); U11 has key F Major (F Dur@de, Fa majeur@fr, Fa maggiore@it, Fa mayor@es); U13 has casting M6 Casting
● M6 Casting: two M23 Casting Detail nodes, each with U30 quantity 1; U2 foresees mop Piano (Pianoforte@it, Fortepian@pl) and Cello (Violoncello@it, Violoncelle@fr)
● M42 Performed Expression Creation: P4 has time span 1796; P7 took place at Berlin; related nodes M43 Performed Expression and M44 Performed Work; properties U54 is performed expression of, U5 had premiere
● F30 Publication Event: P4 has time span 1797; P7 took place at Vienna; related nodes F24 Publication Expression and F19 Publication Work; properties P165 incorporates, U4 had princeps publication
● F15 Complex Work: properties R10 has member, U38 has descriptive expression
14
F22 Expression, U13 has casting: M6 Casting
● M23 Casting Detail: U30 quantity 1; U2 foresees mop Piano (Pianoforte@it, Fortepian@pl)
● M23 Casting Detail: U30 quantity 1; U2 foresees mop Cello (Violoncello@it, Violoncelle@fr)
Controlled Vocabularies for Music Metadata
● GENRES (INTERLINKED): Diabolo, IAML, Itema3, Redomi, RAMEAU
● Medium of performance (INTERLINKED): MIMO, Itema3, IAML, Diabolo, RAMEAU, Redomi
● Musical keys
● Modes
● Catalogues
● Derivation types
● Functions
more available at http://data.doremus.org/vocabularies
23 families of vocabularies · 11,000+ concepts · 610 links between terms
published at ISMIR 2018
16
Dealing with different formats
● Works: INTERMARC · Scores: INTERMARC · Discs: INTERMARC
● Works: UNIMARC · Scores: INTERMARC · Performances: XML
● Works, Recordings, Scores: 3 different XML sources (a pre-digital archive format in Radio France)
Source datasets
17
● Works 62 550, Scores 9 154, Concerts 340 609, Discs 9 500 (all XML)
● Works 6 846 and Scores 30 319 (UNIMARC); Concerts 5 164 and Discs 8 602 (XML)
● Works 135 940 and Scores 89 184 (INTERMARC)
Source datasets
18
DATASET (chart)
● Record types: Works, Scores, Concerts, Discs
● Categories: Classic work, Jazz improvisation, Ethnic/World/Traditional music
19
MARC FILE
001 FRBNF139081882FR
100 $313891295$w.0..b.....$aBeethoven$mLudwig van$d1770-1827
144 $w....b.fre.$aSonates$bPiano$pOp. 27, no 2$tDo dièse mineur
Field 144 annotated: LANG (fre), TITLE ($a Sonates), MOP ($b Piano), OPUS ($p Op. 27, no 2), KEY ($t Do dièse mineur)
“MARC must die” (Roy Tennant, 2002)
http://lj.libraryjournal.com/2002/10/ljarchives/marc-must-die
20
marc2rdf
Input: MARC files, mapping rules
MARC PARSER
● Parsing of the file
● Interpretation of the fields
● Graph generation
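As an illustration of the first step (parsing of the file), here is a minimal sketch that splits one textual MARC line, in the form shown on the MARC FILE slide, into a tag and its subfields. It is not the marc2rdf implementation: the function name `parse_marc_line` is hypothetical, and the sketch assumes that `$` never occurs inside a subfield value.

```python
def parse_marc_line(line):
    """Split 'TAG $xVALUE$yVALUE...' into (tag, [(code, value), ...]).

    Control fields without subfields (such as 001) are returned with a
    single (None, value) pair.
    """
    tag, _, rest = line.partition(" ")
    if "$" not in rest:
        return tag, [(None, rest)]
    subfields = []
    # Text before the first "$" is ignored; each chunk starts with its code.
    for chunk in rest.split("$")[1:]:
        if chunk:
            subfields.append((chunk[0], chunk[1:]))
    return tag, subfields


tag, subfields = parse_marc_line(
    "144 $w....b.fre.$aSonates$bPiano$pOp. 27, no 2$tDo dièse mineur"
)
for code, value in subfields:
    print(tag, code, value)
```

Interpretation of the fields and graph generation would then work on these (code, value) pairs, driven by the mapping rules.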
21
MAPPING RULES
Source field: 144 $w....b.fre.$aSonates$bPiano$pOp. 27, no 2$tDo dièse mineur
● UNIT OF INFORMATION: F22 Expression: Opus Number
● PATH: F22 Self-Contained Expression U17 has opus statement M2 Opus Statement [U42 has opus number M12 Opus Number] + [U43 has opus subnumber M13 Opus Subnumber]
● INTERMARC BNF: TUM : 144 $p, chain of digits; TUM : 144 $p, chain of digits before the comma
● TRANSFER RULE: Remove the abbreviation “Op.” before the number
● EXAMPLE: 144 $pOp. 352 --> M12 = 352; 144 $pOp. 27, no 2 --> M12 = 27, M13 = 2
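The transfer rule for the opus statement can be sketched with a regular expression. This is an illustrative reading of the rule on the slide, not the project's code: the function name `parse_opus` and the returned dictionary shape are assumptions, and only the two patterns shown in the example ("Op. N" and "Op. N, no M") are handled.

```python
import re

# "Op." followed by the opus number, optionally ", no <subnumber>".
OPUS_RE = re.compile(
    r"^\s*Op\.\s*(\d+)(?:\s*,\s*no\.?\s*(\d+))?\s*$",
    re.IGNORECASE,
)


def parse_opus(subfield_p):
    """Map the content of 144 $p to M12 (opus number) and M13 (subnumber)."""
    match = OPUS_RE.match(subfield_p)
    if match is None:
        return None  # unknown pattern: leave it to another rule
    number, subnumber = match.group(1), match.group(2)
    return {
        "M12": int(number),
        "M13": int(subnumber) if subnumber is not None else None,
    }


print(parse_opus("Op. 352"))
print(parse_opus("Op. 27, no 2"))
```

Returning `None` for anything unrecognised keeps the rule conservative, so a value that does not fit the expected shape is not silently turned into a wrong opus number.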
22
marc2rdf
Input: MARC files, vocabularies
MARC PARSER -> FREE TEXT INTERPRETER
● Extracting info from the text through empirical rules
● Disambiguation of vocabulary terms and artists
Example text: “1st performance in Moscow, December 29, 1956, by Mstislav Rostropovich on cello and A. Dedukhin on piano”
Example resource: http://data.doremus.org/performance/8abb8e71-1593-36b9-a998-80b437258ef4
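One empirical rule of this kind could look like the sketch below, which extracts place, date and performers from a note shaped like the example above. The pattern, the function name `parse_premiere` and the output structure are assumptions made for illustration; the disambiguation step (turning "cello" or "Mstislav Rostropovich" into URIs) is not covered here.

```python
import re
from datetime import date

MONTHS = {
    name: index
    for index, name in enumerate(
        ["January", "February", "March", "April", "May", "June", "July",
         "August", "September", "October", "November", "December"],
        start=1,
    )
}

# Hypothetical rule for notes such as "1st performance in <place>, <date>, by <cast>".
PREMIERE_RE = re.compile(
    r"1st performance in (?P<place>[^,]+), "
    r"(?P<month>[A-Z][a-z]+) (?P<day>\d{1,2}), (?P<year>\d{4}), "
    r"by (?P<cast>.+)$"
)


def parse_premiere(note):
    """Return place, date and (name, instrument) pairs, or None if no match."""
    match = PREMIERE_RE.search(note)
    if match is None or match.group("month") not in MONTHS:
        return None
    performers = []
    for part in match.group("cast").split(" and "):
        name, sep, instrument = part.partition(" on ")
        performers.append((name.strip(), instrument.strip() if sep else None))
    return {
        "place": match.group("place"),
        "date": date(
            int(match.group("year")),
            MONTHS[match.group("month")],
            int(match.group("day")),
        ),
        "performers": performers,
    }


note = ("1st performance in Moscow, December 29, 1956, "
        "by Mstislav Rostropovich on cello and A. Dedukhin on piano")
print(parse_premiere(note))
```

A real interpreter would need many such rules, one per phrasing found in the catalogues, which is why the slide calls them empirical.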
23
marc2rdf
Input: MARC files, vocabularies
MARC PARSER -> FREE TEXT INTERPRETER -> STRING 2 VOCABULARY
● Replace labels with URIs from controlled vocabularies
Example: “Violoncelle”@fr -> <http://www.mimo-db.eu/InstrumentsKeywords/3582>
24
STRING 2 VOCABULARY
https://github.com/DOREMUS-ANR/string2vocabulary
● Match against a family of vocabularies
○ “Soprano”@it: matched against MIMO, IAML, DIABOLO, ITEMA3, REDOMI, RAMEAU
○ “C Major”@en: families GENRE and KEY; result vocabulary:key/c
● 2 passes
○ Exact label + language
○ Exact label, any language
● Correction of editorial mistakes
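The two-pass lookup can be sketched as follows. This is not the string2vocabulary code: the miniature vocabulary family, the `http://example.org/...` URI, the English language tags and the case-insensitive normalisation are all assumptions made for the example. Only the MIMO URI for the cello comes from the slides.

```python
# Hypothetical miniature family: concept URI -> list of (label, language).
MOP_FAMILY = {
    "http://www.mimo-db.eu/InstrumentsKeywords/3582": [
        ("Violoncelle", "fr"), ("Violoncello", "it"), ("Cello", "en"),
    ],
    "http://example.org/mop/piano": [  # placeholder URI, not a real concept
        ("Piano", "en"), ("Pianoforte", "it"), ("Fortepian", "pl"),
    ],
}


def normalize(label):
    """Lower-case and collapse whitespace (an assumption of this sketch)."""
    return " ".join(label.casefold().split())


def match(label, lang, family):
    """Return the URI of the concept matching `label`, or None."""
    wanted = normalize(label)
    # Pass 1: exact label in the requested language.
    for uri, labels in family.items():
        if any(normalize(l) == wanted and g == lang for l, g in labels):
            return uri
    # Pass 2: exact label in any language.
    for uri, labels in family.items():
        if any(normalize(l) == wanted for l, _ in labels):
            return uri
    return None


print(match("Violoncelle", "fr", MOP_FAMILY))
print(match("Violoncello", "fr", MOP_FAMILY))  # found only in the second pass
```

Trying the tagged language first avoids picking a homograph from another language when a label in the record's own language exists.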
25
Conversion pipelines
● INTERMARC -> marc2rdf -> GRAPH BNF
● UNIMARC -> marc2rdf -> GRAPH PHILHARMONIE
● EUTERPE XML -> euterpe converter -> GRAPH EUTERPE
● ITEMA3 XML -> itema3 converter -> GRAPH ITEMA3
● DIABOLO XML -> diabolo converter -> GRAPH DIABOLO
STRING 2 VOCABULARY
26
GRAPH BNF
http://data.doremus.org/expression/d72301f0-0aba-3ba6-93e5-c4efbee9c6ea
“Quasi una fantasia”
COMPOSER: Beethoven
ORDER NUM: 14
OPUS: 27, n 2
GENRE: sonata
CASTING: piano
KEY: C sharp minor
1st PUB: ?
PREMIERE: ?
sameAs
GRAPH PHILHARMONIE
http://data.doremus.org/expression/37932fbc-fef3-3edb-9fae-1eec9b4be01d
“Sonata quasi una fantasia”
COMPOSER: Beethoven
ORDER NUM: 14
OPUS: 27, n 2
GENRE: sonata, romantic music
CASTING: piano (1)
KEY: C sharp minor
1st PUB: 1802, Vienna
PREMIERE: ?
27
Challenges
● Not all the works have values for all the properties (lack of attributes)
● Similar values do not necessarily imply a match, e.g. Beethoven’s Sonata n. 1, Sonata n. 2, Sonata n. 3
● Lexical, semantic, transliteration, orthographic mismatches
On the left: Beethoven. On the right: (the same) Beethoven.
28
First Linking
Composer + Catalogue
● Wolfgang Amadeus Mozart, Eine kleine Nachtmusik, K 525
sameAs
● Wolfgang Amadeus Mozart, Serenade No. 13 in G major, KV 525
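A first linking pass on composer and catalogue number can be sketched as a key comparison. Everything below is illustrative: the record identifiers (`a:1`, `b:1`, `b:2`), the alias table that treats "KV" as "K", and the function names are assumptions, not the project's actual procedure.

```python
import re

# Assumption: these abbreviations denote the same catalogue (Köchel).
CATALOGUE_ALIASES = {"K": "K", "KV": "K"}


def catalogue_key(composer, catalogue):
    """Build a comparable key from a composer name and a catalogue number."""
    match = re.fullmatch(r"\s*([A-Za-z]+)\.?\s*(\d+[a-z]?)\s*", catalogue)
    if match is None:
        return None
    code = match.group(1).upper()
    code = CATALOGUE_ALIASES.get(code, code)
    return (composer.casefold().strip(), code, match.group(2))


def link(left, right):
    """Return (left_id, right_id) pairs sharing composer and catalogue number."""
    index = {}
    for identifier, composer, catalogue in left:
        key = catalogue_key(composer, catalogue)
        if key is not None:
            index.setdefault(key, []).append(identifier)
    links = []
    for identifier, composer, catalogue in right:
        key = catalogue_key(composer, catalogue)
        for other in index.get(key, []):
            links.append((other, identifier))
    return links


left = [("a:1", "Wolfgang Amadeus Mozart", "K 525")]
right = [
    ("b:1", "Wolfgang Amadeus Mozart", "KV 525"),
    ("b:2", "Wolfgang Amadeus Mozart", "KV 550"),
]
print(link(left, right))
```

The titles ("Eine kleine Nachtmusik" and "Serenade No. 13 in G major") play no part in the key, which is the point of the slide: the catalogue number identifies the work even when the titles differ completely.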
29
Legato
New linking system
Existing data linking systems were not satisfactory
30
* works to be compared are grouped by composer
31
32
Legato performances at the OAEI campaign 2017
● DOREMUS: Heterogeneities Task, False Positive Trap
● SPIMBENCH: sandbox, mainbox
33
LINK VALIDATION
Link patterns: SINGLE LINK, TRIANGLE, MISSING LINK, CONFLICT
Outcomes shown on the slide:
● certain links
● confidence score + experts’ validation
● inference if experts’ validation
● remove with experts’ check
34
What is in the Knowledge Graph?
● 89.872 persons (composers, performers, …)
● 18.075 corporate bodies (orchestras, chorus, publishers, …)
● 357.451 musical works (16k components, 4k derived works)
● 193.412 concerts and studio recordings
● 469.131 performed work
● 3.833 foreseen concerts
● 31.296 publications
● 48.006 scores
35
Future Work
● More interlinking with MusicBrainz
● Internal interlinking of performances
● Create bridges with other communities (musicologists, streaming services, …)
Applications
● Explorative Search Engine
● KG-Based Recommender System
http://overture.doremus.org/
● DOREMUS CHATBOT
https://chatbot.doremus.org/
Links
● GitHub page (converters, interlinking tool, data dumps, ...): github.com/DOREMUS-ANR/
● OVERTURE (discover DOREMUS data): overture.doremus.org
● DOREMUS website: www.doremus.org
● CHATBOT (q&a system for classical music): chatbot.doremus.org
● THIS PRESENTATION: https://goo.gl/1UmKnV
pasquale.lisena@eurecom.fr
@pasqLisena
37
Persons
291.421 in the whole graph
89.872 active*
* with 1 or more compositions, performances, dedications, ...
By source: 9.269 euterpe · 1.503 diabolo · 9.040 itema3 · 8.419 philharmonie · 19.881 bnf · 54.675 bnf bib
By role: 1.479 dedicatees · 529 subjects · 21.626 composers · 7.830 conductors · 3.583 performers · 13.242 text authors
38
Corporate Bodies
45.743 in the whole graph
18.075 active*
* with 1 or more compositions, performances, dedications, ...
By source: 1.001 euterpe · 0 diabolo · 39 itema3 · 1.603 philharmonie · 855 bnf · 14.657 bnf bib
By role: 6 dedicatees · 7 subjects · 517 orchestras + ensembles · 192 choruses · 6.099 publishers · 2.194 producers
39
Works
source          f15       f14       f22
euterpe         -         10.587    10.587
diabolo         9.343     12.344    12.344
itema3          --        15.016    15.016
philharmonie    5.762     14.527    14.875
bnf             135.749   134.973   134.973
bnf bib         245.069   223.357   279.641
420.733 expressions (include movements)
357.451 complex works
40
Works
● 16.132 components*
● 4.619 arrangements
● 293 transcriptions
● 43 orchestration
● 4.884 total of derivations
* movements, parts, acts, selections (extraits) ...
420.733 expressions (include movements)
357.451 complex works
41
Performances
193.065 concerts (performances)
5.702 converted from specific records
469.131 interpretations of 288.298 distinct works
source          f31       m43
diabolo         2.294     2.294
itema3          2.296     12.602
philharmonie    7.107     47.119
bnf             14.115    15.221
bnf bib         165.225   387.519
42
Foreseen Concerts
3.833 concerts
13.520 interpretations of 10.759 distinct works
source          m26       f25 > f22
euterpe         3.833     13.520
17 artistic seasons · 281 cycles · 33 festivals
43
Recordings
397.597 recordings
15.267 supports
198.693 publications
source          f26       f4        f3
itema3          2.296     2.842     -
philharmonie    3.406     11.681    -
bnf bib         392.020   744       199.339
44
Scores
31.296 publications
48.006 scores
44.668 distinct works
source          f24       f24 > f22
bnf bib         31.296    48.006

DOREMUS - a Graph of Interlinked Musical Work

  • 1. DOREMUSa Graph of Interlinked Musical Work Pasquale Lisena EURECOM, France @pasqLisena M. Achichi, P. Lisena, K. Todorov, R. Troncy, J. Delahousse
  • 2. 2 Which works have been composed by Mozart when he was <10? How many works have been composed and performed for the 1st time in the same city? Which composers had the chance to direct their own work in a performance during the last decade?
  • 3. 3 metadata about artists, works, performances, scores Music knowledge graph used for building the knowledge graph open-source, reusable Tools for converting and interlinking
  • 5. 5 M. Lasar (2011). Digging into Pandora’s Music Genome with musicologist Nolan Gasser. https://arstechnica.com/tech-policy/2011/01/digging-into-pandoras-music-genome-with-musicologist-nolan-gasser/ When it comes to classical music, on the other hand, it's much more about the composition itself, because even though the interpretation can vary in various subtle ways. CLASSICALPOP VS For pop music the experience of the music is really defined by the recording.
  • 6. 6 CLASSICALPOP VS Track-based Work-based 60 years of history Thousand years from Gregorian chant to a work written last Tuesday Songs Multi-movement works Major, minor Polyphonic, homophonic, monophonic
  • 7. 7
  • 8. 8 Music archives have very detailed knowledge PROBLEMS ● Multiple formats ● No possible interoperability ● Need for discovering overlapping knowledge ● Information codified as free text ● Not always publicly accessible APPROACH Semantic Web!
  • 9. 9 Improve music description to foster music exchange and reuse Travel to the heart of the musical archives in France’s greatest institutions Connect sources, multiply usage, enrich user experience
  • 10. Building the DOREMUS graph. Stages: DATA MODELING, DATA CONVERSION, DATA LINKING, LINK VALIDATION. Tools: marc2rdf, string2vocabulary, ...custom converters, legato.
  • 11. DATA MODELING. The DOREMUS Model - Music-specific extension of FRBRoo - Dynamic: it is made up of autonomous combined modules - Relies on Linked Data principles (everything is a URI, RDF model). Diagram labels: FRBR, bibliographic records, museum information. Choffé, Pierre, and Françoise Leresche. DOREMUS: connecting sources, enriching catalogues and user experience. In 24th IFLA World Library and Information Congress. 2016.
  • 12. DATA MODELING. The building blocks: Work-Expression-Event. F14 Work, R3 is realized in, F22 Expression. F28 Expression Creation, R17 created, F22 Expression. F28 Expression Creation, R19 created a realization of, F14 Work.
  • 13. DATA MODELING. Example graph for a Beethoven cello sonata. Classes: F14 Work, F15 Complex Work, F19 Publication Work, F22 Expression, F24 Publication Expression, F28 Expression Creation, F30 Publication Event, E7 Activity, M2 Opus Statement, M6 Casting, M23 Casting Detail, M42 Performed Expression Creation, M43 Performed Expression, M44 Performed Work. Properties: R3 is realized in, R10 has member, R17 created, R19 created a realization of, P4 has time span, P7 took place at, P9 consists of, P14 carried out by, P98 born, P100 died, P102 has title, P165 incorporates, U2 foresees mop, U4 had princeps publication, U5 had premiere, U11 has key, U12 has genre, U13 has casting, U17 has opus statement, U30 quantity, U31 had function of type, U38 has descriptive expression, U54 is performed expression of. Values: titles “Sonate pour violoncelle et piano no 1”@fr, “Sonates”, “Sonata in F”; opus statement 5, 1; composed 1796; carried out by Ludwig van Beethoven / Ludwig von Beethoven (born 1770, died 1827), function composer (compositeur@fr, compositore@it); genre Sonata (sonata@it, sonate@fr, klaviersonate@de); key F Major (F Dur@de, Fa majeur@fr, Fa maggiore@it, Fa mayor@es); casting: 1 Piano (Pianoforte@it, Fortepian@pl), 1 Cello (Violoncello@it, Violoncelle@fr); performance: Berlin, 1796; publication: Vienna, 1797.
  • 15. Controlled Vocabularies for Music Metadata. Genres (interlinked): Diabolo, IAML, Itema3, Redomi, RAMEAU. Medium of performance (interlinked): MIMO, Itema3, IAML, Diabolo, RAMEAU, Redomi. Also: musical keys, modes, catalogues, derivation types, functions; more available at http://data.doremus.org/vocabularies. 23 families of vocabularies · 11,000+ concepts · 610 links between terms. Published at ISMIR 2018.
  • 16. DATA CONVERSION. Dealing with different formats. Works: INTERMARC; Scores: INTERMARC; Discs: INTERMARC. Works: UNIMARC; Scores: INTERMARC; Performances: XML. Works - Recordings - Scores: 3 different XML sources. A pre-digital archive format in Radio France.
  • 17. Source datasets. Works 62,550 | XML; Scores 9,154 | XML; Concerts 340,609 | XML; Discs 9,500 | XML. Works 6,846 | UNIMARC; Scores 30,319 | UNIMARC; Concerts 5,164 | XML; Discs 8,602 | XML. Works 135,940 | INTERMARC; Scores 89,184 | INTERMARC.
  • 18. Source datasets. DATASET: Works, Scores, Concerts, Discs. Classic work; Jazz improvisation; Ethnic/World/Traditional music.
  • 19. DATA CONVERSION. MARC FILE: 001 FRBNF139081882FR | 100 $313891295$w.0..b.....$aBeethoven$mLudwig van$d1770-1827 | 144 $w....b.fre.$aSonates$bPiano$pOp. 27, no 2$tDo dièse mineur. Annotations on field 144: LANG (fre), TITLE (Sonates), MOP (Piano), OPUS (Op. 27, no 2), KEY (Do dièse mineur). “MARC must die” (Roy Tennant, 2002) http://lj.libraryjournal.com/2002/10/ljarchives/marc-must-die
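The subfield layout of the 144 field on slide 19 can be illustrated with a small parsing sketch. This is not the marc2rdf parser; it only assumes that each subfield starts with `$` followed by a one-character code, as the slide's record suggests. The function name `parse_subfields` is hypothetical.

```python
import re


def parse_subfields(field_body: str) -> dict[str, str]:
    """Split a '$'-delimited field body into {subfield code: value}.

    Assumption: a subfield is '$', one code character, then text up to
    the next '$'. Repeated codes keep their first value only.
    """
    subfields: dict[str, str] = {}
    for code, value in re.findall(r"\$(.)([^$]*)", field_body):
        subfields.setdefault(code, value)
    return subfields


# Field 144 exactly as shown on the slide.
field_144 = "$w....b.fre.$aSonates$bPiano$pOp. 27, no 2$tDo dièse mineur"
parsed = parse_subfields(field_144)

# $a = title, $b = medium of performance, $p = opus, $t = key (per the slide).
print(parsed["a"], "|", parsed["b"], "|", parsed["p"], "|", parsed["t"])
```

Keeping only the first value per code is a simplification; a real record may repeat subfields.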
  • 20. DATA CONVERSION. marc2rdf: MARC PARSER. Inputs: MARC files, mapping rules. ● Parsing of the file ● Interpretation of the fields ● Graph generation
  • 21. DATA CONVERSION. MAPPING RULES for 144 $w....b.fre.$aSonates$bPiano$pOp. 27, no 2$tDo dièse mineur. UNIT OF INFORMATION: F22 Expression: Opus Number. PATH: F22 Self-Contained Expression U17 has opus statement M2 Opus Statement [U42 has opus number M12 Opus Number] + [U43 has opus subnumber M13 Opus Subnumber]. INTERMARC BNF: TUM : 144 $p, chain of digits; TUM : 144 $p, chain of digits before the comma. TRANSFER RULE: Remove the abbreviation “Op.” before the number. EXAMPLE: 144 $pOp. 352 --> M12 = 352; 144 $pOp. 27, no 2 --> M12 = 27, M13 = 2.
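The transfer rule on slide 21 can be sketched as follows. This is an illustration, not the marc2rdf code: the function name `parse_opus` is hypothetical, and reading the subnumber (M13) from the digits after the comma is an assumption taken from the slide's second example.

```python
import re
from typing import Optional


def parse_opus(subfield_p: str) -> tuple[Optional[str], Optional[str]]:
    """Return (opus number M12, opus subnumber M13) from a 144 $p value."""
    # Transfer rule: remove the abbreviation "Op." before the number.
    text = re.sub(r"^\s*Op\.\s*", "", subfield_p)
    # M12: chain of digits before the comma (or the only chain of digits).
    head, _, tail = text.partition(",")
    number = re.search(r"\d+", head)
    # M13 (assumed): chain of digits after the comma, e.g. "no 2".
    subnumber = re.search(r"\d+", tail)
    return (
        number.group() if number else None,
        subnumber.group() if subnumber else None,
    )


print(parse_opus("Op. 352"))
print(parse_opus("Op. 27, no 2"))
```

The two calls reproduce the slide's examples: the first has only an opus number, the second has both a number and a subnumber.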
  • 22. DATA CONVERSION. marc2rdf: MARC PARSER, FREE TEXT INTERPRETER. Inputs: MARC files, vocabularies. Example: “1st performance in Moscow, December 29, 1956, by Mstislav Rostropovich on cello and A. Dedukhin on piano”, http://data.doremus.org/performance/8abb8e71-1593-36b9-a998-80b437258ef4 ● Extracting information from the text through empirical rules ● Disambiguation of vocabulary terms and artists
  • 23. DATA CONVERSION. marc2rdf: MARC PARSER, FREE TEXT INTERPRETER, STRING 2 VOCABULARY. Inputs: MARC files, vocabularies. ● Replace labels with URIs from controlled vocabularies, e.g. “Violoncelle”@fr becomes <http://www.mimo-db.eu/InstrumentsKeywords/3582>
  • 24. DATA CONVERSION. STRING 2 VOCABULARY (https://github.com/DOREMUS-ANR/string2vocabulary) ● Match against a family of vocabularies (MIMO, IAML, DIABOLO, ITEMA3, REDOMI, RAMEAU), e.g. “Soprano”@it ● 2 passes ○ Exact label + language ○ Exact label, any language ● Correction of editorial mistakes. Example labels on the slide: “C Major”@en; GENRE vocabulary:key/c; KEY vocabulary:key/c
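The two passes listed on slide 24 can be sketched like this. It is a minimal illustration and not the string2vocabulary implementation: the `KEY_FAMILY` table, the `vocabulary:key/f` identifier and the lowercase/trim normalisation are assumptions made for the example. Only `vocabulary:key/c` for “C Major” comes from the slide.

```python
from typing import Optional

# Hypothetical vocabulary family: (normalised label, language) -> concept.
KEY_FAMILY: dict[tuple[str, str], str] = {
    ("c major", "en"): "vocabulary:key/c",
    ("do majeur", "fr"): "vocabulary:key/c",
    ("fa majeur", "fr"): "vocabulary:key/f",  # made-up identifier
}


def match_label(label: str, lang: str,
                family: dict[tuple[str, str], str]) -> Optional[str]:
    """Resolve a label to a concept using the two passes from the slide."""
    wanted = label.strip().lower()  # normalisation is an assumption
    # Pass 1: exact label + language.
    concept = family.get((wanted, lang))
    if concept is not None:
        return concept
    # Pass 2: exact label, any language.
    for (known_label, _known_lang), known_concept in family.items():
        if known_label == wanted:
            return known_concept
    return None


print(match_label("C Major", "en", KEY_FAMILY))    # found in pass 1
print(match_label("Do majeur", "it", KEY_FAMILY))  # found in pass 2
```

Trying the language-specific match first avoids picking up a homonym from another language when a same-language label exists.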
  • 25. DATA CONVERSION. INTERMARC, via marc2rdf, to GRAPH BNF. UNIMARC, via marc2rdf, to GRAPH PHILHARMONIE. EUTERPE XML, via euterpe converter, to GRAPH EUTERPE. ITEMA3 XML, via itema3 converter, to GRAPH ITEMA3. DIABOLO XML, via diabolo converter, to GRAPH DIABOLO. STRING 2 VOCABULARY.
  • 26. GRAPH BNF: http://data.doremus.org/expression/d72301f0-0aba-3ba6-93e5-c4efbee9c6ea “Quasi una fantasia”; COMPOSER Beethoven; ORDER NUM 14; OPUS 27, n 2; GENRE sonata; CASTING piano; KEY C sharp minor; 1st PUB ?; PREMIERE ?. sameAs GRAPH PHILHARMONIE: http://data.doremus.org/expression/37932fbc-fef3-3edb-9fae-1eec9b4be01d “Sonata quasi una fantasia”; COMPOSER Beethoven; ORDER NUM 14; OPUS 27, n 2; GENRE sonata, romantic music; CASTING piano (1); KEY C sharp minor; 1st PUB 1802, Vienna; PREMIERE ?.
  • 27. DATA LINKING. Challenges ● Lack of attributes: not all the works have values for all the properties ● Similar values do not necessarily imply a match, e.g. Beethoven’s Sonata n. 1, Sonata n. 2, Sonata n. 3 ● Lexical, semantic, transliteration, orthographic mismatches. On the left: Beethoven. On the right: (the same) Beethoven.
  • 28. DATA LINKING. First Linking: Composer + Catalogue. “Wolfgang Amadeus Mozart, Eine kleine Nachtmusik, K 525” sameAs “Wolfgang Amadeus Mozart, Serenade No. 13 in G major, KV 525”.
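The composer + catalogue linking on slide 28 can be sketched as a comparison of normalised keys. This is only an illustration of the idea, not the project's linking code: the function name `catalogue_key`, the regular expression and the alias table are assumptions. “K” and “KV” are treated as the same catalogue because both abbreviate the Köchel catalogue.

```python
import re
from typing import Optional

# Assumed alias table: alternative abbreviations of the same catalogue.
CATALOGUE_ALIASES = {"KV": "K"}


def catalogue_key(composer: str,
                  catalogue: str) -> Optional[tuple[str, str, str]]:
    """Build a comparable (composer, catalogue code, number) key."""
    # Split e.g. "K 525" or "KV 525" into a code and a number.
    match = re.fullmatch(r"\s*([A-Za-z]+)\.?\s*(\d+[a-z]?)\s*", catalogue)
    if match is None:
        return None
    code = match.group(1).upper()
    code = CATALOGUE_ALIASES.get(code, code)
    return (composer.strip().lower(), code, match.group(2))


left = catalogue_key("Wolfgang Amadeus Mozart", "K 525")
right = catalogue_key("Wolfgang Amadeus Mozart", "KV 525")
print(left == right)  # equal keys suggest a sameAs candidate
```

The titles (“Eine kleine Nachtmusik” and “Serenade No. 13 in G major”) are deliberately left out of the key, since they differ between the two records.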
  • 29. DATA LINKING. Legato: a new linking system. Existing data linking systems were not satisfactory.
  • 30. DATA LINKING. * Works to be compared are grouped by composer.
  • 32. DATA LINKING. Legato performances at the OAEI campaign 2017. DOREMUS: Heterogeneities Task, False Positive Trap. SPIMBENCH: sandbox, mainbox.
  • 33. LINK VALIDATION. Link patterns: SINGLE LINK, TRIANGLE, MISSING LINK, CONFLICT. Treatments shown: certain links; confidence score + experts’ validation; inference if experts’ validation; remove with experts’ check.
  • 34. What is in the Knowledge Graph? 89,872 persons (composers, performers, …); 18,075 corporate bodies (orchestras, choruses, publishers, …); 357,451 musical works, with 16k components and 4k derived works; 193,412 concerts and studio recordings; 469,131 performed works; 3,833 foreseen concerts; 31,296 publications; 48,006 scores.
  • 35. Future Work ● More interlinking with MusicBrainz ● Internal interlinking of performances ● Create bridges with other communities (musicologists, streaming services, …). Applications ● Exploratory Search Engine and KG-Based Recommender System (http://overture.doremus.org/) ● DOREMUS CHATBOT (https://chatbot.doremus.org/)
  • 36. GitHub page (converters, interlinking tool, data dumps, ...): github.com/DOREMUS-ANR/. OVERTURE (discover DOREMUS data): overture.doremus.org. DOREMUS website: www.doremus.org. CHATBOT (Q&A system for classical music): chatbot.doremus.org. THIS PRESENTATION: https://goo.gl/1UmKnV. Contact: pasquale.lisena@eurecom.fr, @pasqLisena
  • 37. Persons: 291,421 in the whole graph; 89,872 active*. By source: 9,269 euterpe; 1,503 diabolo; 9,040 itema3; 8,419 philharmonie; 19,881 bnf; 54,675 bnf bib. By role: 1,479 dedicatees; 529 subjects; 21,626 composers; 7,830 conductors; 3,583 performers; 13,242 text authors. * With 1 or more compositions, performances, dedications, ...
  • 38. Corporate Bodies: 45,743 in the whole graph; 18,075 active*. By source: 1,001 euterpe; 0 diabolo; 39 itema3; 1,603 philharmonie; 855 bnf; 14,657 bnf bib. By role: 6 dedicatees; 7 subjects; 517 orchestras + ensembles; 192 choruses; 6,099 publishers; 2,194 producers. * With 1 or more compositions, performances, dedications, ...
  • 39. Works, per source (f15 / f14 / f22): euterpe - / 10,587 / 10,587; diabolo 9,343 / 12,344 / 12,344; itema3 -- / 15,016 / 15,016; philharmonie 5,762 / 14,527 / 14,875; bnf 135,749 / 134,973 / 134,973; bnf bib 245,069 / 223,357 / 279,641. Totals: 420,733 expressions (including movements); 357,451 complex works.
  • 40. Works: 16,132 components*. Derivations: 4,619 arrangements; 293 transcriptions; 43 orchestrations; 4,884 derivations in total. * Movements, parts, acts, selections (extraits) ... Totals: 420,733 expressions (including movements); 357,451 complex works.
  • 41. Performances: 193,065 concerts (performances); 5,702 converted from specific records; 469,131 interpretations of 288,298 distinct works. Per source (f31 / m43): diabolo 2,294 / 2,294; itema3 2,296 / 12,602; philharmonie 7,107 / 47,119; bnf 14,115 / 15,221; bnf bib 165,225 / 387,519.
  • 42. Foreseen Concerts: 3,833 concerts; 13,520 interpretations of 10,759 distinct works. Per source (m26 / f25 > f22): euterpe 3,833 / 13,520. Also: 17 artistic seasons; 281 cycles; 33 festivals.
  • 43. Recordings: 397,597 recordings; 15,267 supports; 198,693 publications. Per source (f26 / f4 / f3): itema3 2,296 / 2,842 / -; philharmonie 3,406 / 11,681 / -; bnf bib 392,020 / 744 / 199,339.
  • 44. Scores: 31,296 publications; 48,006 scores; 44,668 distinct works. Per source (f24 / f24 > f22): bnf bib 31,296 / 48,006.