Feedback: @enridaga - enrico.daga@open.ac.uk - http://led.kmi.open.ac.uk/
Finding listening experiences in books
Enrico Daga and Enrico Motta
The Open University (UK)
DARIAH EU Annual Event
Warsaw, 16th May 2019
The LED Project
• An open and freely searchable database that brings together a mass of data
about people’s experiences of listening to music of all kinds, in any historical
period and any culture [1].
• Sophisticated data model, natively in RDF / SPARQL
• Linked Open Data: http://data.open.ac.uk/context/led [2]
• Since 2012, the LED project has collected over 10,000 unique listening
experiences from a variety of textual sources
https://led.kmi.open.ac.uk/
Listening experiences
• What is a listening experience?
• An account of an event involving music and one or more participants
• "Introduced to the Anacreontic Society, consisting of amateurs who perform admirably the
best orchestral works. The usual supper followed. After propitiating me with a trio from
'Cosi Fan Tutte', they drew me to the piano."
• "The best choir-singing, (Roman Catholic) without accompaniment, we have heard, was at
Munich."
• "Holland is the country of bells; and the merry chimes are to be heard hourly, from almost
every church-tower or steeple."
• All three constitute a report of an experience of a core subject: music.
Problem statement
• Acquiring evidence from texts requires effort and expertise, and it is
time-consuming
• The activity is exploratory, or based on a priori knowledge of the source
• The overall process is not systematic
Background Knowledge (BK)
• The LE Database includes text excerpts that can be analysed as positive examples.
• Project Gutenberg: more than 50,000 English books in the public domain.
• Reuters-21578 (Reu) is a standard corpus adopted extensively for training and
evaluating systems for information retrieval, document classification, machine
learning and similar corpus-based research [5]. It includes 21,578 news articles of
various categories. It does not cover music.
• The UK Reading Experience Database (UK RED) investigates the evidence of
reading in Britain [6].
• DBpedia is a large knowledge graph published as Linked Data. It includes a SPARQL
endpoint and a NER tool, DBpedia Spotlight [7].
Competing methods?
Forest. A typical machine learning workflow: we chose a Random Forest
classifier [8], trained with LE, Reuters, and RED.
Statistical (TF-IDF). Project Gutenberg has a Music shelf. We computed an
average TF-IDF to obtain a dictionary, which we used to estimate the
relatedness of a text to the music domain.
Statistical (Embeddings). We used Word2Vec [9] to generate a dictionary of
terms related to music, with a threshold trained on LE, Reuters, and RED.
Entities. We find DBpedia entities related to the category Music using DBpedia
Spotlight and a SPARQL query.
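As a rough illustration of the Statistical (TF-IDF) method, the sketch below averages TF-IDF scores over a small set of music documents to obtain a term dictionary. The toy corpus, the function name `average_tfidf`, and the particular TF-IDF variant (relative term frequency times the log of the inverse document frequency) are assumptions made for illustration; the slides do not specify the exact weighting or tokenisation used.

```python
import math
from collections import Counter

def average_tfidf(music_docs, all_docs):
    """Average TF-IDF of each term over music_docs, with IDF computed on all_docs.

    Both arguments are lists of token lists, and music_docs must be part of
    all_docs. The weighting (relative TF times log(N / df)) is an assumed,
    common variant, not necessarily the one used in the project.
    """
    n_docs = len(all_docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in all_docs for term in set(doc))
    totals = Counter()
    for doc in music_docs:
        for term, count in Counter(doc).items():
            tf = count / len(doc)
            idf = math.log(n_docs / df[term])
            totals[term] += tf * idf
    # Average over the music documents (a term absent from a document adds 0).
    return {term: total / len(music_docs) for term, total in totals.items()}

# Toy corpus, invented for illustration.
music = [["piano", "sonata", "the"], ["piano", "concert", "the"]]
other = [["the", "market", "price"], ["the", "ship", "sailed"]]
dictionary = average_tfidf(music, music + other)
top_terms = sorted(dictionary, key=dictionary.get, reverse=True)
```

Terms that occur in every document, such as "the" in the toy corpus, receive a score of zero, so they contribute nothing when the dictionary is later used to score a text.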
Forest, a typical machine learning classifier
[Figure: training and evaluation workflow of the RF Classifier. Positives come
from the LED Database (LEs in Benchmark, LEs not in Benchmark); Negatives are
Not LEs (Reuters, REs). The sets are split into Training and Test; Features (NLP)
feed the classifier training and the Accuracy measurement. The trained RF
Classifier marks text segments (?) as Bookmarks.]
Features (NLP), excerpt:
play[V],3149,5362
hear[V],2620,3598
music[N],2541,3650
time[N],2019,2644
first[J],2017,2738
come[V],1867,2389
sing[V],1783,2725
make[V],1759,2157
great[J],1727,2219
concert[N],1705,2467
give[V],1647,2038
take[V],1403,1716
performance[N],1353,1703
good[J],1323,1652
well[R],1305,1591
know[V],1178,1489
never[R],1142,1388
year[N],1129,1372
[…]
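The feature list on this slide pairs each lemma[POS] entry with two counts. The sketch below shows one way such a table could be built from POS-tagged excerpts. It assumes, without confirmation from the slides, that the two numbers are the number of excerpts containing the feature and its total number of occurrences. The input format (lists of (lemma, tag) pairs) and the function name `feature_table` are hypothetical, and lemmatisation and POS tagging are assumed to happen upstream in an NLP toolkit.

```python
from collections import Counter

def feature_table(tagged_excerpts):
    """Return {feature: (excerpt_count, occurrence_count)}.

    tagged_excerpts: list of excerpts, each a list of (lemma, pos_tag) pairs.
    The meaning given to the two counts is an assumption, see the lead-in.
    """
    excerpt_counts = Counter()
    occurrence_counts = Counter()
    for excerpt in tagged_excerpts:
        features = [f"{lemma}[{tag}]" for lemma, tag in excerpt]
        occurrence_counts.update(features)      # every occurrence counts
        excerpt_counts.update(set(features))    # once per excerpt
    return {f: (excerpt_counts[f], occurrence_counts[f]) for f in occurrence_counts}

# Toy excerpts, invented for illustration.
excerpts = [
    [("hear", "V"), ("music", "N"), ("play", "V"), ("play", "V")],
    [("play", "V"), ("concert", "N")],
]
table = feature_table(excerpts)
for feature, (n_excerpts, n_occurrences) in sorted(table.items()):
    print(f"{feature},{n_excerpts},{n_occurrences}")
```

Per-excerpt counts of such features could then serve as input vectors for a Random Forest implementation such as scikit-learn's `RandomForestClassifier`; the slides do not say which implementation was used.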
Statistical: TF-IDF / Embeddings
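[Figure: text segments (?) whose score exceeds the Threshold become Bookmarks.]
$T = [t_1, t_2, \dots, t_n]$
$D = [d_1 : s_1, d_2 : s_2, \dots, d_n : s_n]$
$R = \sum_{d_n \in T,\; d_n \in D,\; d_n \mapsto s_n} \frac{s_n}{|T|}$
$R > \text{Threshold}$
where $T$ is the list of terms of a text and $D$ maps each dictionary term $d$ to its score $s$.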
Example dictionary terms (two lists):
Beethoven[N], vocal[J], music[N], Liszt[N], Chopin[N], composer[N], Mozart[N],
musical[J], Haydn[N], piano[N], aria[N], fugue[N], theme[N], accent[N], master[N],
Dickens[N], resonance-chamber[N], leading-tone[N], florid[J], sound[V], score[N],
rondo[N], sweet[J], sense[N], gesture[N], hammer[N]
music[n], melody[n], musical[j], singer[n], choral[j], musical[n], tune[n], song[n],
singing[n], flute[n], violin[n], improvisation[n], orchestral[j], cello[n],
orchestra[n], serenade[n], cadenza[n], melodious[j], accompaniment[n], symphony[n],
lute[n], harp[n], playing[n], vocal[n], lilt[n], orchestration[n]
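The scoring scheme of this slide can be sketched as follows: sum the dictionary scores of the terms found in a text and compare the result with a threshold. Dividing by the text length is an assumed reading of the slide's formula, and the dictionary values, the threshold, and the function names are invented for illustration.

```python
def relevance(terms, dictionary):
    """Sum the scores of the terms found in the dictionary, normalised by text length.

    The normalisation by len(terms) is an assumption about the slide's formula.
    """
    if not terms:
        return 0.0
    return sum(dictionary.get(term, 0.0) for term in terms) / len(terms)

def is_candidate(terms, dictionary, threshold):
    """A text is bookmarked when its relevance exceeds the threshold."""
    return relevance(terms, dictionary) > threshold

# Hypothetical dictionary D, text T, and threshold.
D = {"music[N]": 0.9, "piano[N]": 0.8, "sing[V]": 0.6}
T = ["hear[V]", "music[N]", "piano[N]", "evening[N]"]
score = relevance(T, D)  # (0.9 + 0.8) / 4
bookmarked = is_candidate(T, D, threshold=0.3)
```

The same function works for both statistical methods: only the way the dictionary is produced (average TF-IDF or embeddings) changes.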
Entities
• DBpedia Spotlight to identify %entities%
• SPARQL query to DBpedia to filter the ones related to category:Music
SELECT distinct ?sub WHERE {
VALUES ?sub { %entities% }
?sub dct:subject ?subject .
?subject skos:broader{0:%d%} cat:Music
}
• Here %entities% are the resources identified by the NER engine, and %d% is a
parameter, set to 5 (values above 5 introduce too much noise).
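A minimal sketch of how the query template could be filled in before being sent to an endpoint. The function name `build_query` and the explicit PREFIX declarations are additions for illustration (the slide relies on prefixes being predefined). The path quantifier is copied verbatim from the slide; range quantifiers are not part of standard SPARQL 1.1, so whether this form is accepted depends on the endpoint.

```python
# Query template from the slide, with PREFIX lines added as an assumption.
TEMPLATE = """\
PREFIX dct: <http://purl.org/dc/terms/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX cat: <http://dbpedia.org/resource/Category:>
SELECT DISTINCT ?sub WHERE {
  VALUES ?sub { %entities% }
  ?sub dct:subject ?subject .
  ?subject skos:broader{0:%d%} cat:Music
}"""

def build_query(entity_uris, depth=5):
    """Substitute the %entities% and %d% placeholders of the template."""
    values = " ".join(f"<{uri}>" for uri in entity_uris)
    return TEMPLATE.replace("%entities%", values).replace("%d%", str(depth))

# Example with two DBpedia resources, as a NER step might return them.
query = build_query(
    ["http://dbpedia.org/resource/Piano", "http://dbpedia.org/resource/Munich"]
)
print(query)
```

Entities returned by the query are the ones linked to the Music category within the chosen depth; the rest are discarded.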
Experiments
• Gold Standard (GS): 500 LEs and 500 non-LEs, from the same books
• The GS is (1) accurate, by inter-rater agreement, and (2) pessimistic
• Goal: reduce the number of bookmarks to supervise while keeping all the good ones
• Evaluated as an IR task (F1) and as a binary classification task (Accuracy)
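For reference, the two measures named above can be computed from gold and predicted labels as in the sketch below. The label lists are invented and the function name `evaluate` is hypothetical; only the standard definitions of precision, recall, F1, and accuracy are used.

```python
def evaluate(gold, predicted):
    """Compute precision, recall, F1 and accuracy for binary labels."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g and p)          # LEs correctly bookmarked
    fp = sum(1 for g, p in pairs if not g and p)      # non-LEs bookmarked
    fn = sum(1 for g, p in pairs if g and not p)      # LEs missed
    tn = sum(1 for g, p in pairs if not g and not p)  # non-LEs correctly skipped
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}

# Toy labels: True marks a listening experience.
gold = [True, True, True, False, False, False]
predicted = [True, True, False, True, False, False]
scores = evaluate(gold, predicted)
```

Because the stated goal is to keep all good bookmarks, recall is the component of F1 to watch most closely when tuning a threshold.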
Tool support
https://led.kmi.open.ac.uk/discovery
• Books differ in musical density, hence the handle
to adjust the sensitivity
• We are cropping the text quite brutally; streaming the text
may give more accurate results, but it is more expensive
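The trade-off in the last bullet can be illustrated by contrasting fixed, non-overlapping crops with an overlapping sliding window. Interpreting "streaming" as a sliding window is an assumption, as are the window sizes and function names; the tool's actual segmentation is not described in the slides.

```python
def crop(tokens, size):
    """Cut the token list into consecutive, non-overlapping chunks."""
    return [tokens[i:i + size] for i in range(0, len(tokens), size)]

def sliding_windows(tokens, size, step=1):
    """Overlapping windows: more candidates to score, hence more expensive."""
    last_start = max(len(tokens) - size, 0)
    return [tokens[i:i + size] for i in range(0, last_start + 1, step)]

tokens = "after supper they drew me to the piano and we sang".split()
chunks = crop(tokens, 4)              # 3 chunks to score
windows = sliding_windows(tokens, 4)  # 8 windows to score

# A passage that straddles a crop boundary is only seen whole by the window.
passage = ["drew", "me", "to", "the"]
print(passage in chunks, passage in windows)
```

With cropping, each token is scored once; with a step of one token, the number of segments to score grows to roughly the length of the text, which is where the extra cost comes from.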
Lessons learnt
Some results are very good: ~85% F-Measure & Accuracy, comparable to human annotators
However, different methods capture different aspects:
TRUE: Introduced to the Anacreontic Society, consisting of amateurs who perform admirably
the best orchestral works. The usual supper followed. After propitiating me with a trio from
'Cosi Fan Tutte', they drew me to the piano.
TRUE: In the evening we went to Rev. Baptist Noel's chapel, where one is always sure of
edification from the sermon if not from the psalms.
FALSE: Flags and pendants were suspended from the windows, [...] the colours of the German
States were waving harmoniously together, and the banners of the Fine Arts, with appropriate
inscriptions, particularly those of music, poetry and painting, were especially honored, and
floated triumphant amidst the standards of electorates, dukedoms, and kingdoms.
Future work
• Statistical approaches (incl. embeddings) inherit biases specific to
the core concept (music): inspirating[j], heartful[j], …
• Mentions of named entities related to music do not guarantee a
record of an experience of listening
• Hybrid method: (a) integrate statistical analysis with an entity-based
“boost”, and (b) correct linguistic-use bias (e.g. filter specific
POS)
References
• [1] Barlow, H. and Rowland, D. (Eds.) (2017). Listening to music: people, practices and experiences. The Open University.
• [2] Adamou, A., d’Aquin, M., Barlow, H. and Brown, S. (2014). LED: curated and crowd-sourced linked data on music listening
experiences. Proceedings of the ISWC 2014 Posters & Demonstrations Track, 93–96.
• [3] Finkelstein, L., et al. (2002). Placing search in context: The concept revisited. ACM Transactions on Information Systems 20(1),
116–131.
• [4] Giunchiglia, F., Kharkevich, U. and Zaihrayeu, I. (2009). Concept search. European Semantic Web Conference.
Springer, Berlin, Heidelberg.
• [5] Lewis, D. D. (1997). Reuters-21578 text categorization collection.
• [6] Halsey, K. (2008). Reading the evidence of reading: An introduction to the Reading Experience Database, 1450-1945. Popular
Narrative Media 1(2).
• [7] Mendes, P. N., Jakob, M., García-Silva, A. and Bizer, C. (2011). DBpedia Spotlight: shedding light on the web of
documents. Proceedings of the 7th International Conference on Semantic Systems, 1–8. ACM.
• [8] Ho, T. K. (1995). Random decision forests. Proceedings of the Third International Conference on Document Analysis and
Recognition, Vol. 1. IEEE.
• [9] Mikolov, T., et al. (2013). Distributed representations of words and phrases and their compositionality. Advances in
Neural Information Processing Systems.
