1. Music and Machine Learning
Using Machine Learning for the Classification of
Indian Music: Experiments and Prospects
Paritosh K. Pandya
School of Technology and Computer Science
Tata Institute of Fundamental Research
email: pandya@tifr.res.in
http://www.tcs.tifr.res.in/~pandya
SNDT 2006 – p.
2. Outline
Motivation (Example)
Introduction to Machine Learning
Intelligent Music Processing Examples
Indian Music: Some questions
Automatic Raag Recognition System
3. Computers and Information Processing
Evolution of computers
Scientific calculations: e.g. planetary orbits
Data processing: e.g. inventory
Multimedia: Rich Text, Graphics, Pictures, Animations,
Video, Sound and Music.
Computers can store, edit, process and display all of
these!!
Internet and World-wide Web:
5. Computers and Arts
Computers and networks are increasingly used for
storing, processing (editing, cataloguing, searching),
and disseminating artistic content.
Web Portals with artistic, educational and
research-oriented content are becoming available, e.g.
the complete works of Shakespeare
Computers can be used to analyse artistic content in
new and sophisticated manners.
Computer as a tool for research in humanities and arts:
Example: Discovery Channel news (2003): Computers to
reveal Shakespeare
6. Learning Machines
Traditionally, computers are calculating devices. How to
calculate must be fully pre-programmed.
People observe patterns in nature, they discover rules
and they learn.
Can Computers learn? A question addressed by
artificial intelligence.
Learning System: A system capable of the autonomous
acquisition and integration of knowledge.
7. How do systems learn?
Supervised Learning: Learn from examples.
Training Example Set (annotated)
Feature Selection (Input)
Target function representation
Statistical learning and classification
[Figure: scatter plot of training examples.
x-axis: number of occurrences of Re after <Ga,MaTiv,Pa>
y-axis: number of occurrences of Ma after <Ga,MaTiv,Pa>]
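The two features on this slide count how often a given note follows the three-note context <Ga,MaTiv,Pa>. A minimal sketch of such a feature extractor, assuming a performance is available as a list of note names. The function name `count_followers` and the sample phrase are hypothetical, chosen only for illustration:

```python
from collections import Counter

def count_followers(notes, context):
    """Count which notes occur immediately after each occurrence of `context`."""
    k = len(context)
    counts = Counter()
    for i in range(len(notes) - k):
        if tuple(notes[i:i + k]) == tuple(context):
            counts[notes[i + k]] += 1
    return counts

# Hypothetical phrase, for illustration only (not taken from the training data).
phrase = "Ga MaTiv Pa Re Sa Ga MaTiv Pa Ma Ga MaTiv Pa Re".split()
followers = count_followers(phrase, ("Ga", "MaTiv", "Pa"))
x = followers["Re"]  # feature plotted on the x-axis
y = followers["Ma"]  # feature plotted on the y-axis
print(x, y)
```

Each annotated training performance would yield one (x, y) point, and a statistical classifier is then trained on those points.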
8. Neural Nets
Learning functions
Given monthly ice cream sales and average temperature for
the last 10 years, predict ice cream sales this summer.
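As a rough illustration of "learning a function" from examples, the sketch below fits a straight line by ordinary least squares. This is a simpler technique than a neural net and is swapped in here only to keep the example short. The temperature and sales figures are made up, and `fit_line` is a hypothetical helper name:

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a * x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b

# Made-up numbers for illustration: average temperature (deg C) and sales.
temps = [20.0, 25.0, 30.0, 35.0]
sales = [110.0, 160.0, 210.0, 260.0]
a, b = fit_line(temps, sales)
predicted = a * 32.0 + b  # predicted sales for a month averaging 32 degrees
```

A neural net plays the same role when the relationship is not a straight line: its weights are adjusted so that the outputs match the training examples.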
9. Why use a learning system?
The relationship between data elements is not
formalized. Only examples are available.
The relationship between data items is buried within a large
amount of data.
Data mining: using historical data to discover relationships
and using them to improve future performance.
10. Applications of Machine Learning
Speech recognition
Image recognition (Face recognition)
Identifying Genes
Predicting Drug Activity
Cataloguing Faint Objects in Astronomical Data
Detecting Credit Card Frauds
Predicting Medical Outcomes from Historic Data
Detecting Hacking and Intrusion from Network Load
Computational Linguistics
11. Music Performance Visualisation
Performance worm [Widmer, Vienna]
Different players have different ways of building tension
or expression in the music
Measure subtle changes in beat-level tempo versus
loudness for each note played.
Represent this visually in tempo-loudness space as a
trajectory called a "performance worm".
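A minimal sketch of how such a trajectory could be computed, assuming beat times (in seconds) and one loudness value per beat are already available. The exponential smoothing and the name `worm_trajectory` are assumptions made for illustration, not necessarily the method used in the Vienna work:

```python
def worm_trajectory(beat_times, loudness, alpha=0.5):
    """Return smoothed (tempo_bpm, loudness) points, one per inter-beat interval."""
    points = []
    smooth_tempo = smooth_loud = None
    for i in range(1, len(beat_times)):
        tempo = 60.0 / (beat_times[i] - beat_times[i - 1])  # beats per minute
        loud = loudness[i]
        if smooth_tempo is None:
            smooth_tempo, smooth_loud = tempo, loud
        else:
            # Exponential smoothing keeps the trajectory from jumping around.
            smooth_tempo = alpha * tempo + (1 - alpha) * smooth_tempo
            smooth_loud = alpha * loud + (1 - alpha) * smooth_loud
        points.append((smooth_tempo, smooth_loud))
    return points

# Made-up beat times and loudness values, for illustration only.
beats = [0.0, 0.5, 1.0, 1.6]
loud = [60.0, 62.0, 66.0, 64.0]
trajectory = worm_trajectory(beats, loud)
```

Plotting the points in order, with tempo on one axis and loudness on the other, gives the worm-like curve.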
13. Recognition of Concert Pianists
Characterisation of Personal Expression Features
Classification
Classification between 22 piano players
Classification based on performance-worm-like data
Achieved accuracy comparable with human listeners.
[Saunders et al (2005)]
14. Islands of Music
Intelligent structuring and exploration of digital music
collections [Pampalk et al (2004)]
Grouping of Music by
Similarity
Genre and Style
Performer
Timbral and rhythmic content
Automatic classification of music by Genre: Classical,
Country, Disco, HipHop, Jazz and Rock [Pye 2000]
About 90% success on 176 songs
15. Music Structure Analysis
Structure in Music Composition
repetition, transposition, call and response, rhythmic
patterns and harmonic sequences
shape of a song e.g. AABA
Automatic structure analysis attempts to discover such
structure [Dannenberg, CMU]
Beat tracking and Tempo Detection
Identifying time signatures and tempo
Marking beat positions within music [Simon Dixon]
18. Computers and Music Notation
MIDI files: computer representation of musical score.
Can be recorded from keyboards etc.
Synthesizers: MIDI → Sound
Issue: Expressive Music Representation
(a hot topic for research!)
Music Notation for Indian Music
Bhatkhande or Paluskar systems
Not used by professional musicians
Lacks structures
20. Swarupa: Structured Music
define kaida2n as
[
A::[ 2:dhatita::[dha,te,te], dhadha::[dha,dha]
tite::[te, te], dha,ge | tin,na,ke,na ];
B::[ A.1{khali} | C::[A.2 | A.3{bhari}] ]
];
define palta1 as
[ 3:[A.1 |] C; 3:[B.1 |] C; ];
define palta2 as
[ A.1 | D::4:5%[te,te] ; B ;
B.1 | D ; B ];
Synthesis: Swarupa → Audio (see MuM webpage)
Music transcription: Audio → Score
21. Indian Music Research using AI
Some topics
Classification of Music
Raag Recognition
Classification of Music in Thaats and Jatis
Classification of Raags into Time Cycles and Seasonal
Cycles
Classification of music by gharanas
Performer recognition
Identification of Raag Lakshans
Association of Bhaav with musical performance
[B.Chaitanya Deva, 1981]
Beat tracking and taal recognition
Identification of musical structure
22. Associated Applications
Music Visualisation
Musical query processing from large annotated musical
databases.
Automatic music composition
Automatic accompaniment
Pursuit: Distance Education of Indian Music
23. Machine Recognition of Raags
Raga performance as a sequence of notes.
Stop Sa Re Ga Pa Ga Re Sa Stop Dha Sa Re Ga
Sequential pattern classification problem
Data is not an unordered set of samples.
Data elements occur in an order: spatial or temporal.
The probability of the next data element crucially depends
on the order of occurrence of the preceding elements.
Hidden Markov Models (HMM) are widely used.
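To make the dependence on order concrete, the sketch below estimates P(next note | current note) from the example sequence on this slide. It is a plain first-order Markov chain, which is simpler than an HMM and only looks one note back. The function name `transition_probabilities` is hypothetical:

```python
from collections import Counter, defaultdict

def transition_probabilities(notes):
    """Estimate P(next note | current note) from bigram counts."""
    counts = defaultdict(Counter)
    for current, nxt in zip(notes, notes[1:]):
        counts[current][nxt] += 1
    return {
        current: {nxt: c / sum(followers.values()) for nxt, c in followers.items()}
        for current, followers in counts.items()
    }

# The example note sequence from this slide.
notes = "Stop Sa Re Ga Pa Ga Re Sa Stop Dha Sa Re Ga".split()
probs = transition_probabilities(notes)
```

In this short sequence Sa is followed by Re twice and by Stop once, so the estimate for Re after Sa is 2/3. With so little data the numbers are only illustrative.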
24. Finite state automaton for Raag
System can be in one of a finite number of states
Current state depends on the past and current input seen
Current state and next input determine the possible next states
Experiments: Manual construction of raag automata based on Bhatkhande Books
[Figure: finite state automaton for raag Bhoopali [Sahasrabuddhe];
states 2: Ga, 3: Re, 4: Pa, 5: Sa, 6: Dha, 7: Stop]
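A minimal sketch of the automaton idea, assuming each state is simply the last note seen. The allowed moves here are collected from a single example phrase, not from the Bhatkhande books or the hand-built automaton in the figure, and the names `build_automaton` and `accepts` are hypothetical:

```python
def build_automaton(phrases):
    """Collect allowed (state, next_note) moves; each state is the last note seen."""
    allowed = set()
    for phrase in phrases:
        allowed.update(zip(phrase, phrase[1:]))
    return allowed

def accepts(allowed, notes, start="Stop"):
    """Return True if every step of `notes` follows an allowed move from `start`."""
    state = start
    for note in notes:
        if (state, note) not in allowed:
            return False
        state = note
    return True

# Example phrase used as the only source of allowed moves (illustration only).
example = "Stop Sa Re Ga Pa Ga Re Sa Stop Dha Sa Re Ga".split()
automaton = build_automaton([example])
ok = accepts(automaton, "Sa Re Ga Re Sa Stop".split())
bad = accepts(automaton, "Sa Ma Ga".split())
```

The automaton gives only a yes/no answer: `ok` is accepted because every move was seen in the example, while `bad` is rejected at the move from Sa to Ma.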
25. Hidden Markov Model of a Raag
Probability of seeing a note in the given state.
Probability of moving from one state to another
An HMM model can be learnt from training data
Analysis: Given an HMM and a note sequence, compute its
probability of occurrence.
[Figure: HMM for raag Bhoopali; states 2: Ga, 3: Re, 4: Pa, 5: Sa,
6: Dha, 7: Stop. Each state is labelled with the probability of
seeing its note (Ga 1.00, Re 1.00, Pa 1.00, Sa 0.99, Dha 1.00,
Stop 1.00); edges are labelled with transition probabilities.]
26. Raag Recognition using HMM
Hidden Markov Model for a raag
Finite state automata
Probability of “seeing a note” in each state.
Probability of transition between states.
An HMM can be learnt from a set of training data
Given a note sequence, we can compute its probability
under a given raag HMM.
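Computing the probability of a note sequence under an HMM is usually done with the forward algorithm. The sketch below is a minimal version for a discrete HMM. The two-state model and all its numbers are made up for illustration and are not the Bhoopali model, and `forward_log_prob` is a hypothetical name:

```python
import math

def forward_log_prob(obs, start, trans, emit):
    """Forward algorithm: log P(obs) under a discrete HMM.

    start[s], trans[s][t] and emit[s][o] are probabilities; missing
    emissions count as zero.
    """
    states = list(start)
    alpha = {s: start[s] * emit[s].get(obs[0], 0.0) for s in states}
    for o in obs[1:]:
        alpha = {
            t: sum(alpha[s] * trans[s][t] for s in states) * emit[t].get(o, 0.0)
            for t in states
        }
    total = sum(alpha.values())
    return math.log(total) if total > 0 else float("-inf")

# Toy two-state model with made-up numbers (not the Bhoopali model).
start = {"A": 1.0, "B": 0.0}
trans = {"A": {"A": 0.5, "B": 0.5}, "B": {"A": 0.5, "B": 0.5}}
emit = {"A": {"Sa": 1.0}, "B": {"Re": 0.5, "Ga": 0.5}}
log_p = forward_log_prob(["Sa", "Re", "Sa"], start, trans, emit)
```

For long sequences the plain probabilities underflow, so practical implementations keep the whole recursion in the log domain or rescale at each step.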
27. Kansen: A raga recognition system
An experiment at TIFR using the HTK toolkit:
Learns an HMM for each raag from training data
(Baum-Welch Algorithm)
Training data: (Bhatkhande, IITK) collection of MIDI files
of raags played on a keyboard. We use the 29-raag database.
Test data: sequence of notes
Output: probability of the sequence being in each raag.
Preliminary Results
About 86 percent success on 29-raag recognition
Confusion between close raags
Insufficiency of data a significant reason
(Joint work with Bhaumik Choksi and K. Samudravijaya)
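The overall recipe (one model per raag, then score the test sequence under each model) can be sketched as below. To stay short, it uses bigram counts with add-one smoothing as a stand-in for HMMs trained with Baum-Welch in HTK, so it is not the Kansen implementation. The training phrases are taken from other slides in this deck, their raag labels are assumptions, and all function names are hypothetical:

```python
import math
from collections import Counter, defaultdict

NOTES = ["Stop", "Sa", "ReKo", "Re", "GaKo", "Ga", "Ma", "MaTiv",
         "Pa", "DhaKo", "Dha", "NiKo", "Ni"]

def train(phrases):
    """Bigram counts per raag: a simple stand-in for Baum-Welch HMM training."""
    counts = defaultdict(Counter)
    for phrase in phrases:
        for cur, nxt in zip(phrase, phrase[1:]):
            counts[cur][nxt] += 1
    return counts

def log_likelihood(counts, phrase):
    """Log probability of `phrase` with add-one smoothing over NOTES."""
    total = 0.0
    for cur, nxt in zip(phrase, phrase[1:]):
        seen = counts.get(cur, Counter())
        total += math.log((seen[nxt] + 1) / (sum(seen.values()) + len(NOTES)))
    return total

def rank_raags(models, phrase):
    """Return (raag, log-likelihood) pairs, best first."""
    scores = {name: log_likelihood(m, phrase) for name, m in models.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Toy training phrases, for illustration only; the labels are assumptions.
models = {
    "Bhoopali": train(["Stop Sa Re Ga Pa Ga Re Sa Stop Dha Sa Re Ga".split()]),
    "Todi": train(["Stop DhaKo Ni Sa ReKo GaKo Stop ReKo GaKo ReKo Sa".split()]),
}
ranking = rank_raags(models, "Stop Sa Re Ga Re Sa".split())
```

The top entry of `ranking` is the recognised raag. Close raags share many note transitions, which is one way to see why their scores can end up close together.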
28. Bhatkhande
MIDI File Database of Indian Raags (IIT, Kanpur)
Adana, AheerBhairav, AlhiyaBilawal, Bageshri,
Bahar, Basant, BasantMukhari, Behag, Bhoopali,
BhoopaliTodi, ChandraKauns, ChhayaNut, Des,
Durga, Gaud, Hamir, JataShwari, JaunaPuri, Jogiya,
Lalit, Malkauns, MiyankiMalhar, Multani, Pahadi,
Peelu, Sohini, TilakKamode, Tilang, Todi
Each MIDI file was created by playing on a keyboard (e.g.
Des.mid)
Basic database of 29 Raags (above)
Full database of 300+ Raags
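Before training, the MIDI note numbers have to be turned into note names. A minimal sketch, assuming the tonic Sa is MIDI note 60 (middle C) and ignoring the octave. The name spellings follow the ones used elsewhere in this deck (Ko for komal, Tiv for tivra), while `midi_to_swaras` and the choice of tonic are assumptions:

```python
SWARAS = ["Sa", "ReKo", "Re", "GaKo", "Ga", "Ma",
          "MaTiv", "Pa", "DhaKo", "Dha", "NiKo", "Ni"]

def midi_to_swaras(note_numbers, tonic=60):
    """Map MIDI note numbers to swara names relative to `tonic` (octave ignored)."""
    return [SWARAS[(n - tonic) % 12] for n in note_numbers]

# Sa Re Ga Pa Dha and upper Sa, assuming Sa = MIDI note 60 (middle C).
names = midi_to_swaras([60, 62, 64, 67, 69, 72])
```

Reading the note numbers out of a .mid file is left out here; any MIDI parsing library could supply the `note_numbers` list.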
29. Demonstration
Input: Stop Sa Sa DhaKo Sa GaKo Re Stop Sa Ga Ga Ma
GaKo Re GaKo Re Sa Ni DhaKo Ni Sa Re Sa Ni Sa
Output: Log of probability of being in a raag
I=1 t=0.02 W=ChandraKauns v=1
I=2 t=0.02 W=ChhayaNut v=1
I=3 t=0.02 W=Hamir v=1
I=4 t=0.02 W=Pahadi v=1
I=5 t=0.02 W=MiyankiMalhar v=1
I=6 t=0.02 W=Adana v=1
I=8 t=0.02 W=Peelu v=1
J=0 S=0 E=1 a=-164.11 l=0.000
J=1 S=0 E=2 a=-163.60 l=0.000
J=2 S=0 E=3 a=-160.92 l=0.000
J=3 S=0 E=4 a=-160.72 l=0.000
J=4 S=0 E=5 a=-158.36 l=0.000
J=5 S=0 E=6 a=-117.88 l=0.000
J=13 S=0 E=8 a=-88.55 l=0.000
Summary: Peelu (-88) Adana (-117) MiyankiMalhar (-158)
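The summary line can be derived from the output by pairing each link's score `a` with the raag named at its end node `E`. The sketch below does this for three of the lines shown above. It assumes the node and link line format is exactly as printed on this slide, and `rank_from_lattice` is a hypothetical helper, not part of HTK:

```python
import re

def rank_from_lattice(text):
    """Pair each link's score `a` with the word at its end node `E`, best first."""
    words = {}   # node index I -> word W
    scores = []  # (end node E, log score a)
    for line in text.splitlines():
        fields = dict(re.findall(r"(\w+)=(\S+)", line))
        if "I" in fields and "W" in fields:
            words[fields["I"]] = fields["W"]
        elif "E" in fields and "a" in fields:
            scores.append((fields["E"], float(fields["a"])))
    ranked = [(words[e], a) for e, a in scores if e in words]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)

# Three node lines and three link lines copied from the output above.
output = """\
I=5 t=0.02 W=MiyankiMalhar v=1
I=6 t=0.02 W=Adana v=1
I=8 t=0.02 W=Peelu v=1
J=4 S=0 E=5 a=-158.36 l=0.000
J=5 S=0 E=6 a=-117.88 l=0.000
J=13 S=0 E=8 a=-88.55 l=0.000
"""
top = rank_from_lattice(output)
```

Since the scores are log probabilities, the value closest to zero wins, which puts Peelu first, then Adana, then MiyankiMalhar.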
30. Demonstration (cont)
Stop NiKo Dha Ni Sa NiKo Pa Ma Pa GaKo Ma Re Sa
Bahar (-20.5) MiyankiMalhar (-23) Adana (-53)
Stop Ma Pa NiKo Dha Ni Ni Sa Stop Ni Sa Re GaKo
GaKo Ma Re Sa
MiyankiMalhar (-52) Bahar (-66) Adana (-75)
Stop Ma Pa Ni Ni Sa Ni Sa Sa Stop Pa Ni Sa Re NiKo
Dha Pa Stop Pa Dha Ma Ga Re Stop Ga Re Ni Sa
Des (-127) Gaud (-139) MiyankiMalhar (-144)
Stop Ga Ma DhaKo DhaKo Pa Stop Ma Pa GaKo Ma
ReKo Sa Stop Ga Ma Pa DhaKo Ni Sa DhaKo Pa
Basant Mukhari (-105) Peelu (-106) Jogiya (-130)
31. Demonstration (cont)
Stop Ni Sa GaKo ReKo Sa Stop Ni Sa GaKo MaTiv Pa
Stop GaKo MaTiv Pa Ni Sa DhaKo Pa MaTiv GaKo
ReKo Sa
Multani (-76) Todi (-105) ChandraKauns (-169)
Stop DhaKo Ni Sa ReKo GaKo Stop ReKo GaKo ReKo
Sa Stop Sa ReKo GaKo MaTiv ReKo GaKo ReKo Sa
Todi (-58) Bhoopali Todi (-83) Multani (-124)
32. Conclusions
Computer analysis and machine learning provide an
interesting new method of analysing music. They allow
many intuitive and qualitative observations to be made
objective, precise and quantitative.
Research with computational techniques leads to direct
applications in music technology.
Intelligent music analysis is almost untried for Indian
Music.
Work requires collaboration between musicologists,
computer scientists and electrical engineers.
Music researchers must help by building corpuses and
annotated datasets for future machine analysis.