5. Definition
Music Information Retrieval (MIR)
is the process of searching for, and finding, music objects, or
parts of objects, via a query framed musically and/or in musical
terms.
Music objects: Recordings (wav, mp3, etc.), scores, parts, etc.
Musically framed query: Singing, humming, keyboard, notation-based,
MIDI files, sound files, etc.
Music terms: Genre, style, tempo, bibliography, etc.
Applications
Music search, recommendation, identification, etc.
7. Applications -- Search
Text-based Music Search
Compares a textual query with the metadata.
Adopted by most existing systems.
Examples: Last.fm, Musicovery, …
8. Applications -- Search
Content-based Music Search
Compare audio query with audio content
Query-by-humming/singing/recording: midomi
9. Applications -- Search
Content-based Music Search
Compare a tapped rhythm with audio content
Query-by-tapping: SongTapper
10. Applications -- Search
User’s information need (intention): an explicit query (text, audio, etc.)
Similarity measure: query vs. music documents in the database
Ranking: relevant documents ordered by domain-specific criteria (no. of hits)
[Figure: search pipeline. Online: Query, Query Formation, Descriptors, Match, Ranking, Ranked List, Presentation, Results. Offline: Documents, Descriptor Extraction, Descriptors, Indexing, Index. Labeled gaps: Intention Gap, Semantic Gap.]
12. Applications – Recommendation
Collaborative-Filtering-based Recommendation
Last.fm: what you (and others) listen to and like
Amazon: customers who shopped for … also shopped for …
13. Applications – Recommendation
Collaborative-Filtering-based Recommendation
Example:
Users: A, B, and C
Music: 1, 2, …, 8
User C: 4, 6, 8
User A: 1, 2, 3, 4
User B: 1, 2, 3, 5
Similarity measure: C vs. A gives a small similarity; A vs. B gives a large similarity.
Recommend to user A
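The example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the listening sets are read off the slide's figure, and Jaccard overlap is one possible choice of similarity measure (the slide does not name one). The function names `jaccard` and `recommend` are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets of item ids (assumed measure)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def recommend(target, histories):
    """Return the most similar other user and their items the target lacks."""
    others = {u: items for u, items in histories.items() if u != target}
    nearest = max(others, key=lambda u: jaccard(histories[target], others[u]))
    return nearest, sorted(others[nearest] - histories[target])


# Listening sets as read off the slide's figure (an assumption).
histories = {
    "A": {1, 2, 3, 4},
    "B": {1, 2, 3, 5},
    "C": {4, 6, 8},
}

print(jaccard(histories["A"], histories["C"]))  # small: 1 shared of 6 distinct
print(jaccard(histories["A"], histories["B"]))  # large: 3 shared of 5 distinct
print(recommend("A", histories))
```

With these sets, B is the user most similar to A, so the item B has heard and A has not becomes the recommendation.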
14. Applications -- Recommendation
Audio Content-based Recommendation
Recommend songs which have similar audio content to the
songs that you like.
Pandora:
[Figure: music experts listen to the songs in the music database and describe each one with about 400 attributes per song (Instrument, Vocal, Structure, …). A similarity measure over these attributes, applied to the user's songs and the database songs, yields the recommendations.]
15. Applications -- Recommendation
User’s information need (intention): implicit user profiles (ratings, listening history, etc.)
Similarity measure: user profiles vs. music documents/other user profiles
Ranking: relevant documents ordered by some domain-specific criteria (no. of hits)
[Figure: recommendation pipeline. Online / Offline: User Profile, Profile Capture, Descriptors, Match, Ranking, Ranked List, Presentation, Results. Offline: Documents, Descriptor Extraction, Descriptors, Indexing, Index. Labeled gaps: Intention Gap, Semantic Gap.]
16. Similarity Measure
One of the most fundamental concepts in MIR
Closely related to:
What information music contains.
How this information is represented.
How to match between them.
[Figure: the same pipeline. Online / Offline: User Profile/Query, Profile/Query Capture, Descriptors, Match, Ranking, Ranked List, Presentation, Results. Offline: Documents, Descriptor Extraction, Descriptors, Indexing, Index. Labeled gaps: Intention Gap, Semantic Gap.]
17. Music Information Plane
Similarity can be measured
from different aspects.
Song1: New favorite - Alison Krauss
Song2: She is Beautiful - Andrew W.K.
              Song1     Song2
Dissimilar    Female    Male
              Gentle    Aggressive
              Slow      Fast
Similar       Guitar
              Tempo: ~162 BPM (Beats Per Minute)
* O. C. Herrada. Music recommendation and discovery in
the long tail. PhD thesis. 2008.
19. Similarity Measure Methods
Text-based Method: Okapi BM-25 Ranking
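Given: query Q, containing keywords q1, …, qn; music documents: bags of words.
The BM25 ranking function can be formulated as:
$$\mathrm{score}(D, Q) = \sum_{i=1}^{n} \mathrm{IDF}(q_i) \cdot \frac{f(q_i, D)\,(k_1 + 1)}{f(q_i, D) + k_1 \left(1 - b + b\,\frac{|D|}{\mathrm{avgdl}}\right)}$$
f(qi, D) is qi’s term frequency (tf) in document D.
|D| is the length of document D in words.
avgdl is the average document length in the collection.
k1 and b are free parameters, usually set as k1=2.0 and b=0.75.
IDF(qi) is the inverse document frequency (idf), calculated as:
$$\mathrm{IDF}(q_i) = \log \frac{N - n(q_i) + 0.5}{n(q_i) + 0.5}$$
where N is the total number of documents in the collection and n(qi) is the number of documents containing qi.
A document scores highly when:
. The query term appears in this document frequently: large f(qi, D).
. And it appears in few other documents: large IDF(qi).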
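A minimal sketch of the BM25 ranking in Python, assuming the standard Okapi formulation with IDF(q) = log((N - n(q) + 0.5) / (n(q) + 0.5)), where N is the number of documents and n(q) the number of documents containing q. The function name `bm25_scores` and the tiny metadata collection are hypothetical and only for illustration; the defaults follow the slide (k1=2.0, b=0.75).

```python
import math
from collections import Counter


def bm25_scores(query, documents, k1=2.0, b=0.75):
    """Score each tokenized document (bag of words) against the query."""
    n_docs = len(documents)
    avgdl = sum(len(d) for d in documents) / n_docs
    # n(q): number of documents that contain term q at least once
    doc_freq = Counter(term for d in documents for term in set(d))
    scores = []
    for doc in documents:
        tf = Counter(doc)
        score = 0.0
        for q in query:
            n_q = doc_freq[q]
            idf = math.log((n_docs - n_q + 0.5) / (n_q + 0.5))
            f = tf[q]
            # Length normalization: longer-than-average documents are penalized
            denom = f + k1 * (1 - b + b * len(doc) / avgdl)
            score += idf * f * (k1 + 1) / denom
        scores.append(score)
    return scores


# Hypothetical metadata, one bag of words per song.
documents = [
    ["alison", "krauss", "new", "favorite", "bluegrass"],
    ["andrew", "wk", "she", "is", "beautiful", "rock"],
    ["classical", "piano", "sonata"],
    ["jazz", "piano", "trio"],
    ["rock", "guitar", "live"],
]
scores = bm25_scores(["bluegrass", "krauss"], documents)
best = max(range(len(documents)), key=lambda i: scores[i])
print(best, scores)
```

Note that with this IDF variant a term occurring in more than half of the documents gets a negative weight; some implementations clamp or shift the IDF to avoid that.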
20. Similarity Measure Methods
Text-based Method:
Pros
Simple & efficient
Cons
Affected by noisy/wrong texts
Songs with no text cannot be retrieved
Requires high-level domain knowledge to create good metadata
“Text retrieval on audio metadata”, not pure music retrieval
24. Existing Works
Audio Feature-based Method
Audio feature extraction => Distribution modeling => Model comparison
Use low-level features directly
Pitch, loudness, MFCC (Blum et al. [3], 1999)
Histogram of MFCC (Foote [4], 1997)
Spectrum, rhythm, chord change => single vector (Tzanetakis [5], 2002)
Low-level features => higher-level features
Cluster MFCC => model comparison (Aucouturier [6], 2002)
MFCC => Gaussian Mixture Models => model comparison
MFCC => “anchor space”, compare probability models (Berenzweig et al. [7], 2003)
25. Similarity Measure Methods
Audio Feature-based Method
Audio feature extraction => Distribution modeling => Model comparison
Euclidean/cosine distance (uniform-length feature vectors)
Distance between two probability distributions
Kullback-Leibler divergence (KL distance) / relative entropy
No closed form for Gaussian mixture models
Centroid distance: Euclidean distance between the overall means
Sampling-based method: compute the likelihood of one model given points
sampled from another; very computationally expensive
Earth-Mover’s distance
Berenzweig, A., Logan, B., Ellis, D. P., and Whitman, B. P. A Large-Scale Evaluation of Acoustic and Subjective Music-Similarity
Measures. Computer Music Journal, 28, 2004.
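Two of the model-comparison options above can be sketched as follows. This is a simplification under stated assumptions: each song is modeled by a single diagonal-covariance Gaussian over its feature frames, for which the KL divergence does have a closed form; with Gaussian mixtures it does not, and a sampling-based estimate or Earth-Mover's distance would be needed instead. The function names and the toy 2-D "feature frames" are hypothetical.

```python
import math


def mean_and_var(frames):
    """Per-dimension mean and variance of equal-length feature vectors."""
    n = len(frames)
    dims = len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dims)]
    var = [sum((f[d] - mean[d]) ** 2 for f in frames) / n for d in range(dims)]
    return mean, var


def centroid_distance(frames_a, frames_b):
    """Euclidean distance between the overall means of two songs."""
    mean_a, _ = mean_and_var(frames_a)
    mean_b, _ = mean_and_var(frames_b)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(mean_a, mean_b)))


def kl_diag_gaussian(mean_p, var_p, mean_q, var_q):
    """Closed-form KL(p || q) for diagonal Gaussians (variances must be > 0)."""
    return 0.5 * sum(
        math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0
        for mp, vp, mq, vq in zip(mean_p, var_p, mean_q, var_q)
    )


def symmetric_kl(frames_a, frames_b):
    """KL is asymmetric, so sum both directions to get a distance-like value."""
    ma, va = mean_and_var(frames_a)
    mb, vb = mean_and_var(frames_b)
    return kl_diag_gaussian(ma, va, mb, vb) + kl_diag_gaussian(mb, vb, ma, va)


# Toy feature frames standing in for per-frame audio features (hypothetical).
song_x = [[0.0, 1.0], [2.0, 3.0]]
song_y = [[3.0, 1.0], [5.0, 3.0]]
print(centroid_distance(song_x, song_y))
print(symmetric_kl(song_x, song_y))
```

The centroid distance ignores the spread of the features entirely, which is why it is cheap; the KL-based value also accounts for the variances.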
26. Similarity Measure Methods
Audio Feature-based Method
Pros
Can deal with new songs with no or few texts.
Saves the human labor of annotating each song manually.
Cons
Time complexity is relatively high.
Features ≠ audio piece: Two songs with very similar features may sound
very different.
The average performance is reaching a glass ceiling of around 65% in
accuracy.
28. Similarity Measure Methods
Semantic Concept-based Method
Nature of user queries
Go far beyond bibliographic text and audio search
Semantically rich
Syntactically undetermined
e.g.: “Find me a classical and happy song”, or “Find me a song to relax”
“Find me some songs for parties/ weddings/ in churches” …
Collaborative (social) tagging is very popular on Web 2.0.
Users annotate the music with their feelings or opinions: tags,
comments, etc.
29. Similarity Measure Methods
Semantic Concept-based Method
Tags vs. user queries (Last.fm)
Tag Type          Frequency   Multi-tag search queries
Genre             68%         51%
Locale            12%         7%
Mood              5%          4%
Opinion           4%          2%
Instrumentation   4%          5%
Style             3%          26%
. Paul Lamere. Social tagging and music information
retrieval. Journal of New Music Research. 2008.
. Klaas Bosteels, Elias Pampalk, and Etienne E. Kerre. Music
retrieval based on social tags: a case study. ISMIR, 2008.
30. Similarity Measure Methods
Vocabulary: classical, jazz, …, piano, violin, …, female, male, …
[Figure: one model per vocabulary term. The models map each song (Song1, Song2) to a probability vector, and the similarity is computed between the two vectors.]
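As a rough sketch of the final step in this scheme: once each song is represented by a vector of per-concept probabilities, any vector similarity can compare two songs. Cosine similarity is used here as an assumption (the slide does not say which measure is applied), and the vocabulary subset and probability values are made up for illustration.

```python
import math

# Hypothetical subset of the concept vocabulary, one model per term.
VOCABULARY = ["classical", "jazz", "piano", "violin", "female", "male"]


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Made-up model outputs: probability that each concept applies to the song.
song1 = [0.9, 0.1, 0.8, 0.3, 0.7, 0.1]
song2 = [0.1, 0.8, 0.7, 0.0, 0.1, 0.9]
song3 = [0.8, 0.2, 0.9, 0.4, 0.6, 0.2]

print(cosine_similarity(song1, song2))  # different concepts dominate
print(cosine_similarity(song1, song3))  # similar concept profile
```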
32. Similarity Measure Methods
Multimodal Method
Information keeps growing.
One of the most important ongoing trends:
[Figure: three information sources: Metadata, Audio Content, Semantic Concept]
Users are important.
33. Similarity Measure Methods
Multimodal Method
[Figure: Document Vectors, Customization, Fuzzy Music Semantic Vector]
B. Zhang, J. Shen, Q. Xiang, and Y. Wang. CompositeMap: a novel framework for music similarity measure. SIGIR,
2009.
35. Conclusions and Future Directions
What makes MIR (and the similarity measure) so tricky?
Music information is
Multimodal: audio, metadata, social, …
Multicultural: e.g., modern art, Indian ragas, …
Multirepresentational: audio, MIDI, score, …
Multifaceted: melody, tempo, beat, …
…
Similarity can be measured from different aspects.
36. Conclusions and Future Directions
What do users really want? (Intention Gap)
User interactions with the system.
Learn a good user preference model.
What kind of music features can really capture this need? (Semantic Gap)
Content – Tags
Leverage more social data? Comments, ratings, groups, playlists, other
user-created information, …
How to fuse multiple information effectively?
Identify the relevant/discriminative information aspects
Fusion methods
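To make the fusion question concrete, one very simple option is a weighted average of per-aspect similarity scores (late fusion). This is only an illustrative assumption, not a method proposed in these slides; the aspect names, scores, and user weights below are hypothetical.

```python
def fuse_similarities(per_aspect, weights):
    """Weighted average of per-aspect similarity scores in [0, 1]."""
    total_weight = sum(weights[a] for a in per_aspect)
    if total_weight == 0:
        raise ValueError("at least one aspect needs a non-zero weight")
    return sum(weights[a] * s for a, s in per_aspect.items()) / total_weight


# Hypothetical similarities of two candidate songs to a seed song.
candidate_1 = {"metadata": 0.2, "audio": 0.9, "tags": 0.6}
candidate_2 = {"metadata": 0.8, "audio": 0.3, "tags": 0.5}

# Hypothetical per-user weights expressing which aspects matter to them.
audio_focused = {"metadata": 0.1, "audio": 0.7, "tags": 0.2}
metadata_focused = {"metadata": 0.7, "audio": 0.1, "tags": 0.2}

for name, w in [("audio_focused", audio_focused),
                ("metadata_focused", metadata_focused)]:
    print(name,
          fuse_similarities(candidate_1, w),
          fuse_similarities(candidate_2, w))
```

The same two candidates rank differently under the two weightings, which is the point of identifying the relevant aspects per user before fusing.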
38. References
[1] O. C. Herrada. Music recommendation and discovery in the long
tail. PhD thesis. 2008.
[2] F. Pachet. Knowledge management and musical metadata. Encyclopedia of
Knowledge Management. Idea Group, 2005.
[3] T. L. Blum, D. F. Keislar, J. A. Wheaton, and E. H. Wold. Method and article
of manufacture for content-based analysis, storage, retrieval, and
segmentation of audio information. U.S. Patent 5,918,223.
[4] J. T. Foote. Content-based retrieval of music and audio. SPIE, 1997.
[5] G. Tzanetakis. Manipulation, analysis, and retrieval system for audio
signals. PhD thesis, 2002.
[6] J. J. Aucouturier and F. Pachet. Music similarity measure: What’s the use?
International Symposium on Music Information Retrieval. 2002.
[7] A. Berenzweig, D. P. W. Ellis and S. Lawrence. Anchor space for
classification and similarity measurement for music. ICME 2003.
39. References
[8] B. Zhang, J. Shen, Q. Xiang and Y. Wang. CompositeMap: a
novel framework for music similarity measure. SIGIR, 2009.
[9] B. Whitman and S. Lawrence. Inferring descriptions and
similarity for music from community metadata. International
Computer Music Conference. 2002.
[10] M. Schedl, T. Pohle, P. Knees and G. Widmer. Assigning
and visualizing music genre by web-based co-occurrence
analysis. ISMIR 2006.
[11] B. Whitman and P. Smaragdis. Combining musical and
cultural features for intelligent style detection. ISMIR 2002.
[12] L. Chen, P. Wright, and W. Nejdl. Improving music genre
classification using collaborative tagging data. WSDM, 2009.
[13] B. Raes. Automatic generation of music metadata.
ISMIR, 2009.