CS6998: Computational Approaches to Emotional Speech, Fall 2009


               EMOTIONAL ANALYSIS OF CARNATIC MUSIC:
                  A MACHINE LEARNING APPROACH.
                  Ravi Kiran Holur Vijay                   Arthi Ramachandran

                  Abstract
Carnatic music, or South Indian Classical Music, is a traditional form of music which originated from and was popularized by the South Indian states. Our goal was to create a framework for performing emotion-based Music Information Retrieval of Carnatic music. As part of this, we chose to explore a couple of the problems which are part of the larger framework. In the first problem, we try to predict the magnitude of the various emotional dimensions a song might induce in the listener. We use an expert-system approach, wherein we assume that the emotions identified by an expert listening to the song would be the same as those experienced by the actual listener. Thus, the expert's annotation is considered to be the gold standard. The subsequent problem is to cluster emotionally similar songs using various clustering techniques. By analyzing these clusters, we can suggest songs with a similar emotional profile.

              1. Introduction
Carnatic music, or South Indian Classical Music, is a very structured style of music. Every piece is associated with a certain raaga, or melody, which is, in turn, associated with a set of emotions. However, many pieces are associated with multiple emotions rather than a single emotion. In this paper, we evaluate the use of 10 dimensions to represent the emotions in Carnatic music. Music similarity is a problem explored in other genres of music, with applications such as music database searching. Here, we use emotional features to evaluate similarity in songs.

              1.1 Outline
We will start off by saying a few words about the type of music we work with: Carnatic music. We will then move on to describe the problem we are trying to address and some of the underlying assumptions. Next, we will take a look at some past research in the field of emotional analysis of music and explore how we can adapt it for our analysis. After this, we will start investigating one of the core problems: precisely representing the emotions associated with a song. We do this by introducing the novel concept of emotion dimensions. With the results of cluster analysis, we see that the emotion-dimension-based approach has some validity in the context of Carnatic music. Later, we examine, in detail, the process of preparing the corpus for annotation by experts, and we re-examine the results of annotation with emotion dimensions, including the inter-rater agreements. This stage poses a challenge in itself, due to the need to communicate our requirements precisely to the experts, who are artists; we also had to make the annotation process as simple as possible for them. Once the corpus is annotated, we move on to training classifiers for predicting the values of each of the emotion dimensions. Here, we explore segmentation of songs and the extraction of various features for the classification task. We analyze the relative importance of each feature set for predicting the emotional dimensions. We then explore ways to cluster songs based on their emotional similarity using clustering techniques. Finally, we conclude by looking at the results achieved and commenting on possible future improvements.



             1.2. Related Research
There have been various attempts to explore ways of predicting the emotions induced by music using computational techniques.

Yang, Lin, Su and Chen (2008), as well as Han, Rho, Dannenberg and Hwang (2009), explore regression-based approaches for predicting emotions induced by music. They consider 12 and 11 emotions respectively, derived from Juslin's theory and distributed over a two-dimensional space that uses Thayer's emotion model. Each instance corresponds to a point in this two-dimensional space, the coordinates of which are predicted using regression approaches. Han et al. use music samples from an online song database. The samples are labeled according to a taxonomy used by the database, and these labels are later transformed into labels for the 11 emotions they used. They pick 15 songs for each of the 11 emotions they considered.

Trohidis, Tsoumakas, Kalliris and Vlahavas (2008) and Wieczorkowska, Synak and Ras (2006) explore multi-label classification approaches for predicting emotions induced by music. Trohidis et al. (2008) consider 6 emotions derived from the Tellegen-Watson-Clark model of mood. Each instance can correspond to more than one emotion in this approach. They use music samples from a custom-created corpus, with the samples selected from different genres. Experts were asked to annotate the corpus with emotional labels in this case. Also, they consider 30-second segments of each sample for the feature extraction task.

In his work, Chordia (2007) tries to empirically establish the relationship between Raagas and the emotions induced in listeners of Hindustani, or North Indian Classical, Music. He considers 12 discrete emotions, 4 of which are among the traditional emotions used in classical theory. Each instance can correspond to more than one of these 12 emotions. He uses samples from a custom-created corpus. Here, the samples are selected keeping the Raaga constant and varying the type of music: Sitar, Sarod, and Vocal (male and female).

             1.3. Carnatic Music
Carnatic music is a form of classical music prevalent in the Southern part of India. It has mythological roots, and its history dates back to ancient times, with the music evolving over the centuries. The form is very structured and stylized, with precise combinations of notes allowed in each composition.

             1.3.1. Raagas and Rasas
One of the fundamental concepts of Carnatic music is that of a Raaga. Every composition is set to a particular raaga. The raaga of a song characterizes its melody. Each piece in a certain raaga uses a certain set of notes. There are also rules which govern how the notes interact with each other and which ones are prominent in the piece. Another element of a raaga is the rasa, or emotion.

Carnatic music theory tells us that there are nine rasas, or emotions, conveyed by the music style. Certain raagas are associated with particular rasas. For example, Hamsadhwani is associated with joy while Kaanada is associated with love (http://www.karnatik.com/rasas.shtml). Often, bhakthi, or devotion, is considered another rasa. The nine rasas used in Carnatic music theory are shown in Table 1.

          Navarasa       Meaning
          shringara      romance/beauty
          hasya          comic/happiness
          karuna         pathetic/sad
          rudra          angry
          vira           heroic
          bhayanaka      ridden by fear
          bibhatsa       disgust
          adbhuta        surprise/wonder

          shanta         peaceful
   Table 1: Emotional components in Carnatic Music

          2. Problem Definition
Our aim is to build a framework that accomplishes two tasks:
• Predict the emotions induced in a listener when the listener listens to a song.
• Given a song that the listener likes, retrieve a set of songs that might induce emotions similar to those of the given song.
We explore an expert-system approach wherein the knowledge obtained from experts in the field of Carnatic music serves as the ground truth for various tasks in our system, including the information about emotions induced by a particular song.

In order to address the problem described above, there are several sub-problems to consider, the important ones being:
• How do we represent and annotate the emotions induced by a song?
• How do we prepare a corpus of Carnatic songs and get it annotated by experts? How do we design a method which makes it easy for the experts to annotate?
• What features from the audio signal might be relevant for classifying the emotions induced?
• How do we cluster emotionally similar songs together? How good/pure are these clusters?

     3. Representation of Emotions
For solving the problem at hand, it was crucial to come up with a system for representing the emotions conveyed by Carnatic music. Initially, we experimented with the two-dimensional Thayer's model and a multi-class, single-label approach to emotion representation, as used in Yang et al. (2008) and Han et al. (2009). The results of this experiment were not promising. One reason for the failure might be that compositions in Carnatic music tend to evoke a multitude of emotions in the listener and hence cannot be represented using a single label. For example, the song "Enna Tavam" in raga Kapi is a devotional song praising a Hindu God. Since it is also about a God in the form of a child, there are strong notes of affection and love conveyed by the song. There is also a tinge of the melancholy of a mother whose son is growing up fast and proving to be beyond her comprehension. In such cases, it is hard to label the song as "happy" when there are many other emotions involved. In our quest for alternative representations, we came up with a novel way to interpret the emotional state of a listener, using a mixture of the classical emotions defined by Carnatic music theory.

In this model, the emotional state experienced by a listener is represented as a mixture of the 10 Rasas defined by classical music theory. Thus, the emotional state of a listener can be visualized as a point in a 10-dimensional Euclidean space, where each dimension corresponds to a rasa. To further illustrate the model, consider a few examples:
• Happy(1) = Sringara (Romantic) + Shantha (Peaceful)
  A song in this state is romantic but also has elements of soothing peace and calm in it. One example from the Carnatic music repertoire is "Mohamahinaen" in raga Karaharapriya.
• Happy(2) = Hasya (Joy) + Bhakthi (Devotion)
  "Vathapi Ganapathim" in Hamsadhwani is a popular song that is joyful and energetic. Yet it is devotional, praising Ganesha.
• Excitement(1) = Bhakthi (Devotional) + Rudra (Angry) + Adhbhuta (Wonder) + Veera (Powerful)
  "Bho Shambho" in raaga Revathi is a song that is powerful and intensely devotional, praising the God of destruction and portraying his omniscience. The power also


translates into anger occasionally, showing the dance of God and wondering at His power.

Further, we represent the value for each dimension as a rating between 0 (not present) and 3 (very strongly present), as obtained from the expert. Thus, we can represent the emotion conveyed by each song as a 10-dimensional vector, where each dimension can take on values between 0 and 3, inclusive. In order to test the validity of our representation model, we performed a cluster analysis, the results of which can be used to indicate that the representation chosen was indeed a good representation of emotions in Carnatic music.

                 4. Corpus
In order to implement the machine learning approach, we needed a corpus annotated with ground truth. In our research review, we failed to find any existing corpus for Carnatic music, and hence we built our own. We started off by selecting a few hundred songs from various Carnatic compositions sung by various artists. One of the team members, who is well versed in the theory of Carnatic music, labeled each of the compositions with coarse-grained emotion labels: Happy, Sad, Peaceful and Devotional. We then selected 109 songs, distributed equally over these 4 coarse-grained classes, for our final corpus.

Next, we had to get the data annotated by experts. We extracted a 30-second segment (after the initial 30 seconds) from each song and uploaded it to a publicly accessible repository (esnips.com). We ignored the first 30 seconds since it is often a slower exploration of the melody leading into the main song. The experts were asked to annotate each of these song segments with ratings for each of the 10 Rasas. The rating had to be chosen from {0 (not present), 1 (weakly present), 2 (present), 3 (strongly present)}. We then created a Google Docs spreadsheet which could be used by the experts for annotation, as shown in Figure 2. The distribution of Raagas across the samples in the corpus can be seen in Figure 1.

   Figure 1: Distribution of Parent Ragas in Corpus (number of samples per parent raga: Thodi, Natakapriya, Mayamalavagowla, Suryakantam, Natabhairavi, Karaharapriya, Harikhambhoji, Shankarabharanam, Chalanta, Varali, Shubhapanthuvarali, Pantuvarali, Gamanashrama, Dharmavathi, Kalyani, Chitarambari)
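The rating scheme described above, ten rasa dimensions each rated 0 to 3, maps naturally onto a fixed-order vector. A minimal sketch in Python; the rasa ordering and the example ratings are our own illustrative choices, not taken from the paper's pipeline:

```python
# Each annotated song segment becomes a 10-dimensional vector of
# expert ratings, one dimension per rasa, each in {0, 1, 2, 3}.
RASAS = ["shringara", "hasya", "karuna", "rudra", "vira",
         "bhayanaka", "bibhatsa", "adbhuta", "shanta", "bhakthi"]

def to_vector(ratings):
    """Turn a {rasa: rating} dict from an annotator's spreadsheet row
    into a fixed-order 10-dimensional vector; missing rasas become 0."""
    vec = [ratings.get(r, 0) for r in RASAS]
    if any(v not in (0, 1, 2, 3) for v in vec):
        raise ValueError("ratings must be integers in 0..3")
    return vec

# Hypothetical annotation resembling the Happy(2) example above:
# joyful (hasya) and devotional (bhakthi), with a touch of shanta.
song = to_vector({"hasya": 3, "bhakthi": 2, "shanta": 1})
print(song)  # [0, 3, 0, 0, 0, 0, 0, 0, 1, 2]
```

Each annotated segment then becomes one point in the 10-dimensional Euclidean space used for the later classification and clustering steps.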




                                    Figure 2: Annotator Spreadsheet





       4.1. Agreement between labelers
To check the agreement between the two labelers, we looked at the Kappa statistic between them. Cohen’s Kappa is calculated as

    κ = (Pr(a) − Pr(e)) / (1 − Pr(e))

where Pr(a) is the observed percentage agreement and Pr(e) is the probability of random agreement. In this case, since there are 4 possible ratings for any emotion, Pr(e) = 0.25 (assuming raters choose among the ratings uniformly at random). Table 2 shows the Kappa statistic for each emotion.

Emotion       Meaning               Kappa Statistic
Bhakthi       Devotion              -0.1809
Sringara      Love                   0.1365
Hasya         Comedy/laughter        0.9492
Raudra        Anger                  0.9873
Karuna        Sadness                0.3015
Bhibhatsa     Disgust                1
Bhayanaka     Fear                   0.9873
Vira          Heroism                0.2888
Adhbhuta      Wonder                 0.6063
Shanta        Peace                  0.1365
        Table 2: Kappa Statistic for each Rasa

The statistic is very high for Hasya, Raudra, Bhibhatsa and Bhayanaka. This is the case because those emotions are very rarely present in Carnatic music, and hence both labelers typically labeled the samples as 0. The remaining emotions show significant disagreement: each rater seems to perceive different levels of the emotion in the samples. Our reasoning is that different labelers have different perceptions of the emotions. For instance, Sringara can refer to romantic love, but often also refers to parental love, friendship or beauty. As a result of these numerous interpretations, the consistency between raters decreases. Since only some of these emotions are commonly used in songs to convey emotions, we restricted our experiments to the following: bhakthi, sringara, karuna, vira, adhbuta and shanta. The remaining Rasas were usually used in conjunction with classical dance to convey emotions. Table 3 shows the number of disagreements in valence for the rasas, as seen in each parent raaga.

Melakartha           Number of Disagreements of Valence
                     Bhakthi  Sringara  Karuna  Adhbhuta  Shanta
Thodi                   3        3        1        1         4
Natakapriya             4        4        2        4         3
Mayamalavagowla         4        2        1        3         1
Suryakantam             1        2        1        1         0
Natabhairavi            5        4        1        2         5
Karaharapriya          11        7        8       12        13
Harikhambhoji          11        1       11       12         9
Shankarabharanam       13        4       18       20        20
Chalanta                2        0        2        3         0
Varali                  2        3        2        2         3
Shubhapanthuvarali      1        1        0        1         0
Pantuvarali             1        2        1        2         2
Gamanashrama            2        1        0        1         2
Dharmavathi             1        0        0        1         0
Kalyani                 6        3        6        8         8
Chitarambari            0        0        1        1         0
Total                  67       37       55       74        70
        Table 3: Disagreements by parent raga

      5. Experiments and Analysis
          5.1. Classification Task
Once we had the annotated corpus, the next step was to train classifiers for predicting the ratings for each of the dimensions, or rasas. We chose to treat each of the dimensions as independent of each other, and hence we had to train a different classifier for each dimension. Further, looking at the results of the expert annotation, we decided to filter out some dimensions, since they were rated as absent for most of the instances. We trained classifiers for four different dimensions: Bhakthi, Sringara, Karuna and Shantha. The task was a multi-class (4) classification problem, since each of the Rasas could have discrete


ratings of 0, 1, 2 or 3.

       5.1.1 Classification Methodology
We used WEKA (Hall et al., 2009) for our experiments. For each of the dimensions, we experimented with SVM and Ripper classifiers. For choosing the SVM model, we varied the "C" parameter from 1x10^-4 to 1x10^4.

               5.1.2 Features Used
We used MIRToolbox within the Matlab environment (Lartillot et al., 2007) for extracting the features from the audio signals. Keeping in line with previous research in this field, we experimented with the following sets of features:
• Dynamics – RMS energy.
• Rhythm – Fluctuation, tempo, attack time.
• Pitch – Peak, centroid calculated from the chromagram.
• Timbre – Spectral centroid, skewness, spread, brightness, flatness, roughness, irregularity, MFCCs, zero crossing, spectral flux.
• Tonal – Key clarity, key strengths (12).

         5.1.3 Classification Results
The results reported below are the accuracy value averages obtained using 10-fold cross-validation.

Features                    SVM Accuracy (%)   Ripper Accuracy (%)
Dynamics                    71                 67
Rhythm – Attack             71                 68
Rhythm – Tempo              71                 71
Rhythm – All                71                 71
Pitch                       71                 68
Timbre – MFCCs              71                 64
Timbre – Mel Spectrum       71                 63
Timbre – All                70                 64.5
Tonal                       71                 62
All                         70                 60
            Table 4: Dimension – Bhakthi
           Baseline: 71% (Major Value = 3)

Features                    SVM Accuracy (%)   Ripper Accuracy (%)
Dynamics                    36.4               34.5
Rhythm – Attack             37.4               30
Rhythm – Tempo              34.6               34.5
Rhythm – All                33.6               34.5
Pitch                       43                 29
Timbre – MFCCs (Mean, SD)   37.4               35.5
Timbre – Mel Spectrum       37.3               40
Timbre – Others (Mean, SD)  45.8               34
Timbre – All (Mean, SD)     40                 36.5
Tonal (Mean, SD)            41                 43
All                         37.4               40
            Table 5: Dimension – Sringara
           Baseline: 34.6% (Major Value = 2)

Features                    SVM Accuracy (%)   Ripper Accuracy (%)
Dynamics                    60                 57
Rhythm – Attack             57.9               57.9
Rhythm – Tempo              57.9               55
Rhythm – All                57.9               57
Pitch                       57.9               57
Timbre – MFCCs              57.9               56
Timbre – Mel Spectrum       57.9               53
Timbre – All                57.9               50.5
Tonal (Mean, SD)            57.9               59
All                         57.9               51
            Table 6: Dimension – Karuna
           Baseline: 57.9% (Major Value = 0)

Features                          SVM Accuracy (%)   Ripper Accuracy (%)
Dynamics                          42.9               36.4
Rhythm – Attack                   42.9               37.4
Rhythm – Tempo                    49.6               43
Rhythm – All                      42                 54.2
Pitch (Mean, SD)                  47                 50.5
Timbre – MFCCs                    43                 38
Timbre – Mel Spectrum (Mean, SD)  43                 44
Timbre – All                      44                 39
Tonal                             43                 40
All                               44                 39
            Table 7: Dimension – Shanta
           Baseline: 42.9% (Major Value = 1)
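The features in Section 5.1.2 were extracted with MIRToolbox under Matlab. As a rough, self-contained illustration of what two of the simplest descriptors measure, RMS energy (the Dynamics feature) and zero-crossing rate (a timbre-related statistic), here is a pure-Python sketch on a synthetic tone; this is not the paper's toolchain, only a demonstration of the computations:

```python
import math

def rms_energy(frame):
    """Dynamics feature: root-mean-square energy of one frame."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))

def zero_crossing_rate(frame):
    """Timbre-related feature: fraction of adjacent sample pairs
    whose sign changes."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return crossings / (len(frame) - 1)

# Synthetic 440 Hz tone sampled at 8 kHz; one 256-sample frame.
sr, freq = 8000, 440.0
frame = [math.sin(2 * math.pi * freq * n / sr) for n in range(256)]

print(round(rms_energy(frame), 3))          # ~0.707 for a pure sine
print(round(zero_crossing_rate(frame), 3))  # ~2 * 440 / 8000 = 0.11
```

In the actual pipeline, descriptors like these are computed per 30-second segment and their means and standard deviations are fed to the classifiers.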




As we can see from the evaluation results, the importance of each feature set and classifier varies by the emotional dimension under consideration. The best feature sets for each of the emotional dimensions are highlighted in green. The best-performing feature sets for each of the dimensions are in logical agreement with those defined by Carnatic music theory.

           5.1.4 Analysis of results
As we can see, the results are equal to or better than the baseline for all the emotional dimensions considered; for Bhakthi, however, the results were no better than the baseline. Two of the important factors responsible for the low accuracies might be:
• Sparse data, in terms of the number of annotated samples available (109 in total).
• Limitations of the song segmentation technique used during feature extraction; this was simply the 30-second segment after the initial 30 seconds.

                 5.2. Clustering
Once we have the values or ratings for each of the Rasas, or dimensions, we can try clustering songs together based on their emotional similarity. The motivation for this arises from our hypothesis that the emotional state of a listener can be visualized as a point in 10-dimensional Euclidean space. Therefore, emotionally similar songs would occur close to each other in this space. The clustering task would help us with two things:
• If the songs clustered together are indeed similar with respect to the emotions induced in the listener (cluster purity is high), it would verify our hypothesis that the 10 dimensions are indeed good for capturing the emotions corresponding to a song.
• The clustering gives us a soft or fuzzy approach for representing the emotions conveyed by a song. Using these clusters, we can retrieve songs that are emotionally similar to a given song. Without this approach, emotional similarity would be restricted to single labels like Happy, Sad, etc.; given a happy song, we could only retrieve other Happy songs. With this approach, we can locate emotionally similar songs in the 10-dimensional space using similarity metrics such as cosine similarity, Euclidean distance or Manhattan distance.

        5.2.1 Clustering and Evaluation
In order to evaluate the purity of each cluster, we need a metric that tells us how emotionally similar the given songs are to each other. We explored a qualitative approach for evaluating the goodness, or purity, of a cluster. This approach consists of associating each song with the Melakartha Raaga it corresponds to. Once we have this, we can obtain information about the emotional similarity of Raagas from Carnatic music theory. Thus, by analyzing the Raaga distribution within each cluster, we can comment qualitatively on the cluster's purity or goodness.

For the clustering task, we represented each input instance (song) as a 10-dimensional vector and ran the EM algorithm in WEKA with the following parameters: “weka.clusterers.EM -I 100 -N -1 -M 1.0E-6 -S 100”. We analyzed each rater’s labels separately because of the low inter-rater agreement. We had initially tried to correlate the clusters to the raga labels, but due to the lack of enough samples for many of the ragas (many had only one sample present in the corpus), we decided to use each raga's parent raga instead.
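The paper's clustering step used WEKA's EM clusterer over the rating vectors. As a simplified stand-in (hard 2-means rather than EM, on an invented toy corpus abbreviated to 4 rasa dimensions for readability), the grouping of songs by emotional similarity can be sketched as:

```python
import math

def dist(a, b):
    """Euclidean distance between two rating vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def two_means(vectors, iters=20):
    """Toy stand-in for WEKA's EM clusterer: hard 2-means.
    Initialization is deterministic (first and last vectors)."""
    centers = [vectors[0], vectors[-1]]
    for _ in range(iters):
        groups = [[], []]
        for v in vectors:
            nearest = 0 if dist(v, centers[0]) <= dist(v, centers[1]) else 1
            groups[nearest].append(v)
        # Recompute each center as the mean of its group (keep old
        # center if a group happens to be empty).
        centers = [
            [sum(col) / len(g) for col in zip(*g)] if g else c
            for g, c in zip(groups, centers)
        ]
    return centers, groups

# Invented toy corpus: three "happy-ish" and three "sad-ish" vectors.
songs = [[3, 0, 0, 2], [3, 1, 0, 3], [2, 0, 1, 3],
         [0, 3, 3, 0], [1, 3, 2, 0], [0, 2, 3, 1]]
centers, groups = two_means(songs)
print([len(g) for g in groups])  # [3, 3]
```

Songs in the same group can then be offered as emotionally similar recommendations, or ranked within a group by distance to the query song.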





Every raga in Carnatic music is either a parent raga or a derived raga. Each derived raga is derived from one of the 72 main ragas, the parent ragas. Derived ragas share the same notes as their parent raga, as well as certain melodic structures. Ragas derived from the same parent are therefore more similar to each other than any two random ragas.
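The coarsening step described above, replacing a sparse derived-raga label with its parent raga, is essentially a table lookup. The sketch below uses a tiny, hypothetical mapping (only the Karaharapriya entries are taken from examples discussed in this paper; a real system would need the full janya-to-Melakartha table):

```python
# Tiny derived-raga -> parent-raga table; illustrative, not a complete mapping.
PARENT_OF = {
    "Abheri": "Karaharapriya",
    "Kannada": "Karaharapriya",
}

def to_parent(raga: str) -> str:
    """Return the parent (Melakartha) raga; a parent raga maps to itself."""
    return PARENT_OF.get(raga, raga)

print(to_parent("Abheri"))   # -> Karaharapriya
print(to_parent("Kalyani"))  # -> Kalyani (already a parent raga)
```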

[Figure 3 is a bar chart titled "Number of samples per cluster (Rater 1)": number of samples per parent Raaga falling in Cluster 0 vs. Cluster 1.]

Figure 3: Raaga distribution across clusters
      5.2.2 Analysis of clustering results
We notice (Figure 3) that the distribution of parent raga instances differs between the clusters. Since the number of clusters is fixed at two, we can hypothesize that the two clusters roughly correspond to Happy and Sad emotions. This would be possible only if the 10-dimensional representation chosen is indeed a good way to capture the emotions conveyed by a song. Looking at the distribution of the Raagas across clusters (Figure 3), we notice that the Raagas corresponding to happy or positive emotions tend to co-occur predominantly in one cluster, while those corresponding to negative emotions tend to co-occur predominantly in the other.

More specifically, Harikhambhoji and Shankarabharanam have a large fraction of their samples in Cluster 1. Those ragas tend to be associated with more positive emotions such as happiness and joy. The ragas found in Cluster 0 tend to be associated with sadness and peace. We can also see that Karaharapriya is split between the two clusters. Several ragas in the corpus were derived from Karaharapriya: one such raga is Kannada, which is associated with joyous emotions, while others, such as Abheri, are much sadder and more melancholic. Hence we see mixed clusters for Karaharapriya.

In Figure 4, we can see how the different emotions contribute towards each cluster. Cluster 1 (in red) shows an absence of Karuna and much higher levels of Sringara, while Cluster 0 (in blue) is sadder.
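One way to make the purity argument concrete is to score each cluster by its majority label, e.g. the coarse emotion associated with each song's parent raga, which gives the standard cluster-purity measure. The assignments and labels below are made up for illustration, not taken from the paper's corpus:

```python
from collections import Counter

def purity(cluster_ids, labels):
    """Fraction of samples that carry their cluster's majority label."""
    total, correct = len(labels), 0
    for c in set(cluster_ids):
        members = [lab for cid, lab in zip(cluster_ids, labels) if cid == c]
        # Count the most common label within this cluster.
        correct += Counter(members).most_common(1)[0][1]
    return correct / total

# Illustrative cluster assignments and coarse emotion labels:
clusters = [0, 0, 0, 1, 1, 1, 1, 0]
emotions = ["Sad", "Sad", "Happy", "Happy", "Happy", "Happy", "Sad", "Sad"]
print(purity(clusters, emotions))  # -> 0.75 (6 of 8 samples match their cluster's majority)
```

A purity near 1.0 would support the claim that the 10-dimensional representation separates the coarse emotion classes well; a purity near the majority-class baseline would not.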


[Figure 4 is a chart showing the contribution of each Rasa to Cluster 0 (blue) and Cluster 1 (red).]

Figure 4: Distribution of Clusters across Rasas

We can thus qualitatively argue that the induced clusters are indeed good/pure.

               6. Conclusion
In our work, we explored a novel method for representing the emotions conveyed by Carnatic music. We were also able to verify the appropriateness of this method through the cluster analysis procedure. Next, we trained classifiers to predict the values for each of the emotional dimensions; the results were better than the baseline. Therefore, we have a framework that could be used for:
• Predicting and locating songs according to the emotions they induce in the listener.
• Retrieving songs that are emotionally similar to a given song.
We can attribute the relatively small improvements over the baseline in the classification task to:
• Sparsity of annotated samples.
• The naive segmentation technique (a 30 sec segment taken after the initial 30 sec) used during feature extraction.
• Conflicts in expert labeling (low Kappa scores).

               6.1. Future Work
To improve on our work, we could start by obtaining more annotated samples from the experts. The definition of each dimension as a Rasa should be made more explicit in order to decrease labeling conflicts between experts and increase the Kappa score. We would also need to collect annotation ratings from more than one expert in order to ensure the statistical and logical consistency of the annotations. We could also explore a more fine-grained rating scale for each dimension and see whether it improves clustering and classification accuracy. We also need to work on better segmentation strategies for feature extraction: for example, we could consider the initial 30 sec, middle 30 sec, and final 30 sec segments of a given song, rather than just one 30 sec segment.

               6.2. Acknowledgements
We would like to take this opportunity to express our heartfelt gratitude to everyone who helped us in our work. Specifically, we


would like to thank the following people for their invaluable suggestions and contributions:
• Sapthagiri Iyengar has been practicing Carnatic music for more than a decade and has performed in various Carnatic music concerts. We consulted him for clarifications related to Carnatic theory, and he was also one of the experts who volunteered to annotate the corpus. Many of his invaluable suggestions have been incorporated into our present work.
• Meena Ramachandran is a connoisseur of Carnatic music. She was one of the experts who volunteered to annotate the corpus.
• Prof. Julia Hirschberg, Bob Coyne, Fadi Biadsy, and all other members of the Speech Lab and the CS6998 course, for their invaluable suggestions.

               7. References
1. Y.-H. Yang, Y.-C. Lin, Y.-F. Su, and H.-H. Chen, "A regression approach to music emotion recognition," IEEE Transactions on Audio, Speech and Language Processing (TASLP), vol. 16, no. 2, pp. 448-457, Feb. 2008.

2. Byeong-jun Han, Seungmin Rho, Roger B. Dannenberg, and Eenjun Hwang, "SMERS: Music Emotion Recognition using Support Vector Regression," International Society for Music Information Retrieval Conference (ISMIR'09), Kobe, Japan, pp. 651-656, Oct. 2009.

3. Arefin Huq, Juan Pablo Bello, Andy Sarroff, Jeff Berger, and Robert Rowe, "Sourcetone: An Automated Music Emotion Recognition System," poster presentation, 10th International Society for Music Information Retrieval Conference, Kobe, Japan, Oct. 2009.

4. K. Trohidis, G. Tsoumakas, G. Kalliris, and I. Vlahavas, "Multilabel classification of music into emotions," Proceedings of the 9th International Conference on Music Information Retrieval (ISMIR), 2008.

5. A. Wieczorkowska, P. Synak, and Z. W. Ras, "Multi-label classification of emotions in music," Proceedings of the 2006 International Conference on Intelligent Information Processing and Web Mining (IIPWM'06), pp. 307-315, 2006.

6. P. Chordia and A. Rae, "Understanding Emotion in Raag: An Empirical Study of Listener Responses," Computer Music Modeling and Retrieval. Sense of Sounds: 4th International Symposium, CMMR 2007, Copenhagen, Denmark, Aug. 27-31, 2007.

7. "How does a raga make you feel?" (http://www.karnatik.com/rasas.shtml)

8. O. Lartillot and P. Toiviainen, "MIR in Matlab (II): A Toolbox for Musical Feature Extraction From Audio," International Conference on Music Information Retrieval, Vienna, 2007.

9. Mark Hall, Eibe Frank, Geoffrey Holmes, Bernhard Pfahringer, Peter Reutemann, and Ian H. Witten, "The WEKA Data Mining Software: An Update," SIGKDD Explorations, vol. 11, no. 1, 2009.

Emotion Recognition in Classical Music

  • 1. CS6998: Computational Approaches to Emotional Speech, Fall 2009 EMOTIONAL ANALYSIS OF CARNATIC MUSIC: A MACHINE LEARNING APPROACH. Ravi Kiran Holur Vijay Arthi Ramachandran Abstract 1.1 Outline Carnatic music or South Indian Classical We will start off by saying a few words about Music is a traditional form of music, which the type of music we would be working with: originated from and was popularized by the Carnatic music. We will then move on to South Indian states. Our goal was to create a describe the problem we are trying to address framework for performing emotion based and some of the underlying assumptions. Music Information Retrieval of Carnatic Next, we will take a look at some past music. As part of this, we chose to explore a research in the field of emotional analysis of couple of the problems which are part of the music and explore how we can adapt this for larger framework. In the first problem, we try our analysis. After this, we will start to predict the magnitude of the various investigating one of the core problems - emotional dimensions the song might induce precisely representing the emotions associated on the listener. We use an expert system with a song. We do this by introducing the approach, wherein we assume that the novel concept of emotion dimensions. With emotions identified by an expert listening to the results of cluster analysis, we see that the the song would be the same as those emotion dimension-based approach has some experienced by the actual listener. Thus, the validity in the context of Carnatic music. expert's annotation is considered to be the Later, we examine, in detail, the process of gold standard. The subsequent problem would preparing the corpus for annotation by experts be to cluster emotionally similar songs using and also re-examine the results of annotation various Clustering techniques available. 
By with emotion dimensions, including the inter- analyzing these clusters, we can suggest songs rater agreements. This stage poses a challenge with similar emotional profile. in itself due to the need to communicate with experts, who are Artists, about our 1. Introduction requirements in a precise manner. Also, we Carnatic music, South Indian Classical Music had to make the process as simple as possible is a very structured style of music. Every piece for the Experts to annotate the corpus. Once is associated with a certain raaga or melody the corpus is annotated, we then move on to which is, in turn, associated with a set of training classifiers for predicting the values emotions. However, many pieces are for each of the emotion dimensions. Here, we associated with multiple emotions rather a will explore segmentation of songs and also single emotion. In this paper, we are extracting various features for the evaluating the use of 10 dimensions to classification task. We analyze the relative represent the emotions in Carnatic music. importance of each feature set for predicting Music similarity is a problem explored in the emotional dimensions. We then explore other genres of music with applications such ways to cluster songs based on their emotional as music database searching. Here, we are similarity using Clustering techniques. using emotional features to evaluate similarity Finally, we conclude by looking at the results in songs. achieved and commenting on the possible improvements in future. 1
  • 2. Emotional Analysis of Carnatic Music: A Machine Learning Approach uses samples from a custom-created corpus. 1.2. Related Research Here the samples are selected keeping the There have been various attempts for Raaga constant and varying the type of music exploring ways to predict the emotions like Sitar, Sarod, Vocal (male and female). induced by music using computational techniques. 1.3. Carnatic Music Yang, Lin, Su and Chen (2008) as well as Carnatic music is a form of classical music Han, Rho, Dannenberg and Hwang (2009) prevalent in the Southern part of India. It has explore regression-based approaches for mythological roots and its history dates back predicting emotions induced by music. They to ancient times, with it evolving over the consider 12 and 11 emotions respectively, centuries. The form of music is very derived from Juslin’s theory theory and are structured and stylized with precise distributed over a 2 dimensional space that combinations of notes allowed in each uses Thayer’s emotion model. Each instance composition. corresponds to a point in this 2-dimensional space, the coordinates of which are predicted 1.3.1. Raagas and Rasas using regression approaches. Han et al use One of the fundamental concepts of Carnatic music samples from an online songs database. music is that of a Raaga. Every composition is The samples are labeled according to a set to a particular raaga. The raaga of a song taxonomy used by the database and these are characterizes its melody. Each piece in a later transformed into labels for the 11 certain raaga uses a certain set of notes. There emotions they used. They pick 15 songs for are also rules which govern how the notes each of the 11 emotions they considered. interact with each other and which ones are Trohidis, Tsoumakas, Kalliris and Vlahavas prominent in the piece. Another element of a (2008) and Wieczorkowska, Synak and Ras raaga is the rasa or emotion. 
(2006) explore multi-label based classification approaches for predicting emotions induced Carnatic music theory tells that there are nine by music. Trohidis et al. (2008) consider 6 rasas or emotions conveyed by the music emotions which are derived from Tellegan- style. Certain raagas are associated with Watson-Clark model of mood. Each instance particular rasas. For example, Hamsadhwani is can correspond to more than one emotion in associated with joy while Kaanada is this approach. They use music samples from a associated with love custom-created corpus. Here the samples are (http://www.karnatik.com/rasas.shtml). Often, selected based on different genres. Experts bhakthi, or devotion, is considered another were asked to annotate the corpus with rasa. The nine rasas, used in Carnatic Music emotional labels in this case. Also, they theory are shown in Table 1. consider 30 sec segments of each sample for the feature extraction task. avarasa Meaning In his work, Chordia, (2007) tries to shringara romance/beauty empirically establish the relationship between Raagas and emotions induced on the listeners hasya comic/happiness in Hindustani or North Indian Classical Music. karuna pathetic/sad He considers 12 discrete emotions, 4 of which rudra angry are among the traditional emotions used in vira heroic classical theory. Each instance can correspond bhayanaka ridden by fear to more than one of these 12 emotions. He bibhatsa disgust adbhuta surprise/wonder 2
  • 3. Emotional Analysis of Carnatic Music: A Machine Learning Approach shanta peaceful evoke a multitude of emotions in the listener Table 1: Emotional components in Carnatic Music and hence cannot be represented using a single label. For example, the song “Enna Tavam” in 2. Problem Definition raga Kapi, is a devotional song praising a Our aim is to build a framework that Hindu God. Since it is also about a God who accomplishes two tasks: in the form of a child, there are strong notes of • Predict the emotions induced on a listener affection and love conveyed by the song. when the listener listens to the song. There is also a tinge of the melancholy of a • Given a song that the listener likes, retrieve mother whose son is growing up fast and a set of songs that might induce emotions proving to be beyond her comprehension. In similar to that of the given song. such cases, it is hard to label the song as We explore an expert system approach “happy” when there are many other emotions wherein the knowledge obtained from experts involved. In our quest for alternative in the field of Carnatic music serves as the representations, we came up with a novel way ground truth for various tasks in our system, to interpret the emotional state of a listener by including the information about emotions using a mixture of the classical emotions induced by a particular song. defined by Carnatic music theory. In order to address the problem described In this model, the emotional state experienced above, there are several sub-problems to by a listener is represented using a mixture of consider, the important ones being: the 10 Rasas, as defined by the Classical • How do we represent and annotate the music theory. Thus, the emotional state of a emotions induced by a song? listener can be visualized as a point in a 10- • How do we prepare a corpus of Carnatic dimensional Euclidian space, where each songs and get it annotated by experts? How dimension corresponds to a rasa. 
...do we design a method which makes it easy for the experts to annotate?
• What features from the audio signal might be relevant for classifying the emotions induced?
• How do we cluster emotionally similar songs together? How good/pure are these clusters?

3. Representation of Emotions

For solving the problem at hand, it was crucial to come up with a system for representing the emotions conveyed by Carnatic music. Initially, we experimented with the two-dimensional Thayer's model and a multi-class, single-label approach to emotion representation, as used in Yang et al. (2008) and Han et al. (2009). The results of this experiment were not promising. One of the reasons for the failure might be the fact that compositions in Carnatic music tend to convey a blend of several emotions rather than a single dominant one.

To further illustrate the model, we could look at a few examples below:
• Happy(1) = Sringara (Romantic) + Shantha (Peaceful). A song in this state is romantic but with elements of soothing peace and calm in it. One example from the Carnatic music repertoire is "Mohamahinaen" in raga Karaharapriya.
• Happy(2) = Hasya (Joy) + Bhakthi (Devotion). "Vathapi Ganapathim" in Hamsadhwani is a popular song that is joyful and energetic. Yet it is devotional, praising Ganesha.
• Excitement(1) = Bhakthi (Devotional) + Rudra (Angry) + Adhbhuta (Wonder) + Veera (Powerful). "Bho Shambho" in raaga Revathi is a song that is powerful and intensely devotional, praising the God of destruction and portraying his omniscience. The power also translates into anger occasionally, showing the dance of God and wondering at His power.

Further, we represent the value for each dimension as a rating between 0 (not present) and 3 (very strongly present), as obtained from the expert. Thus, we can represent the emotion conveyed by each song as a 10-dimensional vector, where each dimension can take on values between 0 and 3, inclusive. In order to test the validity of our representation model, we performed a cluster analysis, the results of which indicate that the representation chosen was indeed a good representation of emotions in Carnatic music.

4. Corpus

In order to implement the machine learning approach, we needed a corpus annotated with ground truth. In our research review, we failed to find any existing corpus for Carnatic music and hence built our own. We started off by selecting a few hundred songs from various Carnatic compositions sung by various artists. One of the team members, who is well versed in the theory of Carnatic music, labeled each of the compositions with coarse-grained emotion labels consisting of Happy, Sad, Peaceful and Devotional. We then selected 109 songs, distributed equally over these 4 coarse-grained classes, for our final corpus.

Next, we had to get the data annotated by experts. We extracted a 30 sec segment (after the initial 30 sec) from each song and uploaded it to a publicly accessible repository (esnips.com). We ignored the first 30 seconds since it is often a slower exploration of the melody leading into the main song. The experts were asked to annotate each of these song segments with ratings for each of the 10 Rasas. The rating had to be chosen from {0 (not present), 1 (weakly present), 2 (present), 3 (strongly present)}. We then created a Google Docs spreadsheet which could be used by the experts for annotation, as shown in Figure 2. The distribution of Raagas across the samples in the corpus can be seen in Figure 1.

[Figure 1: Distribution of Parent Ragas in Corpus — number of samples per parent raga]

[Figure 2: Annotator Spreadsheet]
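The 10-dimensional representation described above can be sketched in a few lines. This is only an illustration (the helper name and the sample annotation are invented, not code or data from the paper): one expert's per-rasa ratings become a vector of ten values in {0, 1, 2, 3}.

```python
# The 10 Rasas used as emotion dimensions (names as listed in Table 2).
RASAS = ["Bhakthi", "Sringara", "Hasya", "Raudra", "Karuna",
         "Bhibhatsa", "Bhayanaka", "Vira", "Adhbhuta", "Shanta"]

def emotion_vector(ratings):
    """Turn an expert's annotation (rasa -> rating in {0,1,2,3}) into a
    10-dimensional vector; unannotated rasas default to 0 (not present)."""
    assert set(ratings) <= set(RASAS), "unknown rasa name"
    vec = [int(ratings.get(rasa, 0)) for rasa in RASAS]
    assert all(0 <= v <= 3 for v in vec), "ratings must lie in 0..3"
    return vec

# Hypothetical annotation: a strongly devotional, somewhat peaceful song.
print(emotion_vector({"Bhakthi": 3, "Shanta": 2}))
# -> [3, 0, 0, 0, 0, 0, 0, 0, 0, 2]
```

Every song in the corpus then corresponds to one such vector per annotator.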
4.1. Agreement between labelers

To check the agreement between the two labelers, we looked at the Kappa statistic between them. Cohen's Kappa is calculated as

    κ = (Pr(a) − Pr(e)) / (1 − Pr(e))

where Pr(a) is the observed percentage agreement and Pr(e) is the probability of random agreement. In this case, since there are 4 possible ratings for any emotion, Pr(e) = 0.25. The following table shows the Kappa statistic for each emotion.

Emotion      Meaning           Kappa Statistic
Bhakthi      Devotion          -0.1809
Sringara     Love               0.1365
Hasya        Comedy/laughter    0.9492
Raudra       Anger              0.9873
Karuna       Sadness            0.3015
Bhibhatsa    Disgust            1
Bhayanaka    Fear               0.9873
Vira         Heroism            0.2888
Adhbhuta     Wonder             0.6063
Shanta       Peace              0.1365
Table 2: Kappa Statistic for each Rasa

The statistic is very high for Hasya, Raudra, Bhibhatsa and Bhayanaka. This is because those emotions are very rarely present in Carnatic music, and hence both labelers typically labeled the samples as 0. The remaining emotions show significant disagreement: each rater seems to perceive different levels of the emotion in the samples. Our reasoning behind this is that different labelers have different perceptions of emotions. For instance, Sringara can refer to romantic love, but often also refers to parental love, friendship or beauty. As a result of these numerous interpretations, the consistency between raters decreases. Since only some of these emotions are commonly used in songs to convey emotions, we restricted our experiments to the following: Bhakthi, Sringara, Karuna, Vira, Adhbhuta and Shanta. The remaining Rasas are usually used in conjunction with classical dance to convey emotions. Table 3 shows the number of disagreements in valence for the rasas as seen in each parent raaga.

Melakartha           Adhbhuta  Sringara  Bhakthi  Karuna  Shanta
Thodi                   3         3        1        1       4
Natakapriya             4         4        2        4       3
Mayamalavagowla         4         2        1        3       1
Suryakantam             1         2        1        1       0
Natabhairavi            5         4        1        2       5
Karaharapriya          11         7        8       12      13
Harikhambhoji          11         1       11       12       9
Shankarabharanam       13         4       18       20      20
Chalanta                2         0        2        3       0
Varali                  2         3        2        2       3
Shubhapanthuvarali      1         1        0        1       0
Pantuvarali             1         2        1        2       2
Gamanashrama            2         1        0        1       2
Dharmavathi             1         0        0        1       0
Kalyani                 6         3        6        8       8
Chitarambari            0         0        1        1       0
Total                  67        37       55       74      70
Table 3: Number of disagreements by parent raga

5. Experiments and Analysis

5.1. Classification Task

Once we had the annotated corpus, the next step was to train classifiers for predicting the ratings for each of the dimensions or rasas. We chose to treat each of the dimensions as independent of the others, and hence we had to train different classifiers for each dimension. Further, looking at the results of the expert annotation, we decided to filter out some dimensions, since they were rated as absent for most of the instances. We trained classifiers for four dimensions: Bhakthi, Sringara, Karuna and Shantha. The task was a multi-class (4) classification problem, since each of the Rasas could have discrete ratings of 0, 1, 2, or 3.
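The Kappa computation from Section 4.1 can be sketched in a few lines. The ratings below are hypothetical, and note that the chance-agreement term follows the paper's simplification Pr(e) = 1/4 for four possible ratings, rather than the marginal-based estimate often used for Cohen's Kappa:

```python
def kappa(rater1, rater2, n_ratings=4):
    """Kappa = (Pr(a) - Pr(e)) / (1 - Pr(e)), with Pr(e) fixed at
    1/n_ratings (0.25 for the four possible ratings 0..3)."""
    assert len(rater1) == len(rater2) and rater1
    pr_a = sum(a == b for a, b in zip(rater1, rater2)) / len(rater1)
    pr_e = 1.0 / n_ratings
    return (pr_a - pr_e) / (1.0 - pr_e)

# Hypothetical ratings for one rasa over eight song segments:
r1 = [0, 0, 3, 2, 1, 0, 2, 3]
r2 = [0, 1, 3, 2, 0, 0, 2, 3]
print(round(kappa(r1, r2), 4))  # 6/8 observed agreement -> 0.6667
```

With Pr(e) fixed at 0.25, perfect agreement yields 1 and chance-level agreement yields 0, matching the ranges seen in Table 2.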
5.1.1 Classification Methodology

We used WEKA (Hall et al., 2009) for our experiments. For each of the dimensions, we experimented with SVM and Ripper classifiers. For choosing the SVM model, we varied the "C" parameter from 1x10^-4 to 1x10^4.

5.1.2 Features Used

We used MIRToolBox within the Matlab environment (Lartillot et al., 2007) for extracting the features from the audio signals. Keeping in line with previous research in this field, we experimented with the following set of features.
• Dynamics – RMS Energy.
• Rhythm – Fluctuation, Tempo, Attack Time.
• Pitch – Peak, Centroid calculated from Chromagram.
• Timbre – Spectral Centroid, Skewness, Spread, Brightness, Flatness, Roughness, Irregularity, MFCCs, Zero crossing, Spectral flux.
• Tonal – Key clarity, Key strengths (12).

5.1.3 Classification Results

The results reported below are the accuracy value averages obtained using the 10-fold cross validation technique.

Features                          SVM Accuracy (%)  Ripper Accuracy (%)
Dynamics                          71                67
Rhythm – Attack                   71                68
Rhythm – Tempo                    71                71
Rhythm – All                      71                71
Pitch                             71                68
Timbre – MFCCs                    71                64
Timbre – Mel Spectrum             71                63
Timbre – All                      70                64.5
Tonal                             71                62
All                               70                60
Table 4: Dimension – Bhakthi. Baseline: 71% (Major Value = 3)

Features                          SVM Accuracy (%)  Ripper Accuracy (%)
Dynamics                          36.4              34.5
Rhythm – Attack                   37.4              30
Rhythm – Tempo                    34.6              34.5
Rhythm – All                      33.6              34.5
Pitch                             43                29
Timbre – MFCCs (Mean, SD)         37.4              35.5
Timbre – Mel Spectrum             37.3              40
Timbre – Others (Mean, SD)        45.8              34
Timbre – All (Mean, SD)           40                36.5
Tonal (Mean, SD)                  41                43
All                               37.4              40
Table 5: Dimension – Sringara. Baseline: 34.6% (Major Value = 2)

Features                          SVM Accuracy (%)  Ripper Accuracy (%)
Dynamics                          60                57
Rhythm – Attack                   57.9              57.9
Rhythm – Tempo                    57.9              55
Rhythm – All                      57.9              57
Pitch                             57.9              57
Timbre – MFCCs                    57.9              56
Timbre – Mel Spectrum             57.9              53
Timbre – All                      57.9              50.5
Tonal (Mean, SD)                  57.9              59
All                               57.9              51
Table 6: Dimension – Karuna. Baseline: 57.9% (Major Value = 0)

Features                          SVM Accuracy (%)  Ripper Accuracy (%)
Dynamics                          42.9              36.4
Rhythm – Attack                   42.9              37.4
Rhythm – Tempo                    49.6              43
Rhythm – All                      42                54.2
Pitch (Mean, SD)                  47                50.5
Timbre – MFCCs                    43                38
Timbre – Mel Spectrum (Mean, SD)  43                44
Timbre – All                      44                39
Tonal                             43                40
All                               44                39
Table 7: Dimension – Shanta. Baseline: 42.9% (Major Value = 1)
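The "Major Value" baselines quoted under each table are majority-class accuracies: the accuracy obtained by always predicting the most frequent rating. A minimal sketch (with made-up ratings, not the corpus data):

```python
from collections import Counter

def majority_baseline(ratings):
    """Return the most frequent rating and the accuracy obtained by
    always predicting it -- the baseline each classifier must beat."""
    value, freq = Counter(ratings).most_common(1)[0]
    return value, freq / len(ratings)

# Made-up Bhakthi ratings in which 71 of 100 segments are rated 3,
# mirroring the 71% baseline (Major Value = 3) under Table 4:
bhakthi = [3] * 71 + [2] * 15 + [1] * 8 + [0] * 6
print(majority_baseline(bhakthi))  # (3, 0.71)
```

A heavily skewed dimension such as Bhakthi therefore starts with a high baseline, which makes it hard for any classifier to show an improvement.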
As we can see from the evaluation results, the importance of each feature set and classifier varies by the emotional dimension under consideration. The best feature sets for each of the emotional dimensions are highlighted in green. The best performing feature sets for each of the dimensions are in logical agreement with those defined by Carnatic music theory.

5.1.4 Analysis of results

As we can see, the results are equal to or better than the baseline for all the emotional dimensions considered, but for Bhakthi the results were not any better than the baseline. Two of the important factors responsible for the low accuracies might be:
• Sparse data in terms of the number of annotated samples that were available (109 in total).
• Limitations of the song segmentation technique used during feature extraction. This was basically a 30 second segment after the initial 30 seconds.

5.2. Clustering

Once we have the values or ratings for each of the Rasas or dimensions, we can try clustering songs together based on their emotional similarity. The motivation for this arises from our hypothesis that the emotional state of a listener can be visualized as a point in 10-dimensional Euclidean space. Therefore, emotionally similar songs would occur close to each other in this 10-dimensional space. The clustering task would basically help us with two things:
• If the songs clustered together are indeed similar with respect to the emotions induced on the listener (cluster purity is high), it would verify our hypothesis that the 10 dimensions are indeed good for capturing the emotions corresponding to the song.
• The clustering gives us a soft or fuzzy approach for representing the emotions conveyed by a song. Using these clusters, we can retrieve songs that are emotionally similar to a given song. Without this approach, emotional similarity would be restricted to single labels like Happy, Sad, etc.; given a happy song, we could only retrieve other Happy songs. With this approach, we can locate emotionally similar songs in the 10-dimensional space using any similarity metric, such as Cosine similarity, Euclidean distance or Manhattan distance.

5.2.1 Clustering and Evaluation

In order to evaluate the purity of each cluster, we need a metric that tells us how emotionally similar the given songs are to each other. We explored a qualitative approach for evaluating the goodness or purity of the clusters. This approach consists of associating each song with the Melakartha Raaga it corresponds to. Once we have this, we can obtain information about the emotional similarity of Raagas from Carnatic music theory. Thus, by analyzing the Raaga distribution within each cluster, we can comment qualitatively on the cluster's purity or goodness.

For the clustering task, we represented each input instance (song) as a 10-dimensional vector and ran the EM algorithm in WEKA with the following parameters: "weka.clusterers.EM -I 100 -N -1 -M 1.0E-6 -S 100". We analyzed each person's labels separately because of the low rater agreement. We had initially tried to correlate the clusters to the raga labels, but due to the lack of enough samples for many of the ragas (many had only 1 sample present in the corpus), we decided to use each raga's parent raga instead.
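Retrieval by emotional similarity then reduces to nearest-neighbour search over these 10-dimensional vectors. A minimal sketch using cosine similarity, one of the metrics mentioned above (the song labels and vectors are invented for illustration):

```python
import math

def cosine(u, v):
    """Cosine similarity between two 10-dimensional rasa vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))

def rank_by_similarity(query, songs):
    """Songs most emotionally similar to the query vector, best first."""
    return sorted(songs, key=lambda s: cosine(query, s[1]), reverse=True)

# Invented corpus entries: (label, [ratings for the 10 rasas]).
songs = [
    ("sad song",        [1, 0, 0, 0, 3, 0, 0, 0, 0, 1]),
    ("romantic song",   [0, 3, 0, 0, 0, 0, 0, 0, 0, 2]),
    ("devotional song", [3, 0, 0, 0, 0, 0, 0, 0, 0, 2]),
]
query = [3, 0, 0, 0, 0, 0, 0, 0, 0, 3]  # strongly devotional and peaceful
print(rank_by_similarity(query, songs)[0][0])  # devotional song
```

Euclidean or Manhattan distance would work the same way, with `sorted(..., reverse=False)` since smaller distances mean greater similarity.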
Every raga in Carnatic music is either a parent raga or a derived raga. Each derived raga is derived from one of 72 main ragas, the parent ragas. Derived ragas share the same notes as their parent raga, as well as certain melodic structures. Ragas derived from the same parent raga are more similar to each other than any two random ragas.

[Figure 3: Raaga distribution across clusters — number of samples per parent raga in Cluster 0 and Cluster 1 (Rater 1)]

5.2.2 Analysis of clustering results

We notice (Figure 3) that the distribution of parent raga instances for each cluster is different. Since the number of clusters is fixed at two, we can hypothesize that the two clusters might logically correlate to roughly Happy and Sad emotions. This is possible only if the 10-dimensional representation chosen is indeed a good way to capture the emotions conveyed by the song. Looking at the distribution of the Raagas across clusters (Figure 3), we notice that the Raagas corresponding to Happy or positive emotions tend to co-occur predominantly in the same cluster, and those corresponding to negative emotions tend to co-occur predominantly in the other cluster. More specifically, Harikhambhoji and Shankarabharanam have a large fraction of their samples in Cluster 1. In general, those ragas tend to be associated with more positive emotions such as happiness and joy. The ragas found in Cluster 0 tend to be associated with sadness and peace. We can see that Karaharapriya is split between the two clusters. Several ragas in the corpus were derived from Karaharapriya. One such raga is Kannada, which is associated with joyous emotions. Others, such as Abheri, are much sadder and more melancholic. Hence we see mixed clusters for Karaharapriya.

In Figure 4, we can see how the different emotions contribute towards each cluster. Cluster 1 (in red) shows an absence of Karuna and much higher levels of Sringara, while Cluster 0 (in blue) is sadder.
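The qualitative purity check described above amounts to tallying parent ragas within each cluster, as plotted in Figure 3. A sketch with invented cluster assignments (not the actual clustering output):

```python
from collections import Counter

def raga_distribution(assignments):
    """Count parent ragas within each cluster; `assignments` maps a
    song id to a (cluster id, parent raga) pair."""
    dist = {}
    for cluster, raga in assignments.values():
        dist.setdefault(cluster, Counter())[raga] += 1
    return dist

# Invented assignments echoing the trend reported for Rater 1:
assignments = {
    "song1": (1, "Harikhambhoji"),
    "song2": (1, "Shankarabharanam"),
    "song3": (1, "Karaharapriya"),
    "song4": (0, "Karaharapriya"),
    "song5": (0, "Thodi"),
}
dist = raga_distribution(assignments)
print(dist[1]["Harikhambhoji"], dist[0]["Thodi"])  # 1 1
```

A cluster dominated by ragas that theory associates with the same valence counts as pure; a raga split across clusters (like Karaharapriya here) signals mixed emotional content.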
[Figure 4: Distribution of Clusters across Rasas]

We can thus qualitatively argue that the clusters induced are indeed good/pure.

6. Conclusion

In our work, we explored a novel method for representing the emotions conveyed by Carnatic music. We were also able to verify the appropriateness of this method through the cluster analysis procedure. Next, we trained classifiers to predict the values for each of the emotional dimensions, the results of which were better than the baseline. Therefore, we have a framework that could be used for:
• Predicting and locating songs according to the emotions they induce on the listener.
• Retrieving songs that are emotionally similar to a given song.
We can attribute the relatively low improvements over the baseline in the classification task to:
• Sparsity of annotated samples.
• The naive segmentation technique used during feature extraction (30 sec after the initial 30 sec).
• Conflicts in expert labeling (low Kappa scores).

6.1. Future Work

If we were to improve on our work, we could start by trying to obtain more annotated samples from the experts. The definition of each dimension as a Rasa should be made more explicit in order to decrease conflicts in labeling between experts and increase the Kappa scores. We would also need to collect annotation ratings from more than one expert in order to ensure the statistical and logical consistency of the annotations. Also, we could explore using a more fine-grained rating scale for each dimension and see if it leads to improvements in clustering and classification accuracies. We also need to work on better segmentation strategies for extracting features. For example, we could consider the initial 30 sec, middle 30 sec and final 30 sec segments of a given song, rather than just one 30 sec segment.

6.2. Acknowledgements

We would like to take this opportunity to express our heartfelt gratitude to everyone who helped us in our work. Specifically, we
would like to thank the following people for their invaluable suggestions and contributions:
• Sapthagiri Iyengar has been practicing Carnatic music for more than a decade and has performed in various Carnatic music concerts. We consulted him for clarifications related to Carnatic theory, and he was also one of the experts who volunteered to annotate the corpus. Many of his invaluable suggestions have been incorporated into our present work.
• Meena Ramachandran is a connoisseur of Carnatic music. She was one of the experts who volunteered to annotate the corpus.
• Prof. Julia Hirschberg, Bob Coyne, Fadi Biadsy and all other members of the Speech Lab and the CS6998 course, for their invaluable suggestions.

7. References

1. Y.-H. Yang, Y.-C. Lin, Y.-F. Su, and H.-H. Chen, "A regression approach to music emotion recognition," IEEE Transactions on Audio, Speech and Language Processing (TASLP), vol. 16, no. 2, pp. 448-457, Feb. 2008.

2. Byeong-jun Han, Seungmin Rho, Roger B. Dannenberg, and Eenjun Hwang, "SMERS: Music Emotion Recognition using Support Vector Regression," in Proceedings of the International Society for Music Information Retrieval Conference (ISMIR'09), Kobe, Japan, pp. 651-656, Oct. 2009.

3. Arefin Huq, Juan Pablo Bello, Andy Sarroff, Jeff Berger, and Robert Rowe, "Sourcetone: An Automated Music Emotion Recognition System," poster presentation, 10th International Society for Music Information Retrieval Conference, Kobe, Japan, Oct. 2009.

4. K. Trohidis, G. Tsoumakas, G. Kalliris, and I. Vlahavas, "Multilabel classification of music into emotions," in Proceedings of the 9th International Conference on Music Information Retrieval (ISMIR), 2008.

5. A. Wieczorkowska, P. Synak, and Z. W. Ras, "Multi-label classification of emotions in music," in Proceedings of the 2006 International Conference on Intelligent Information Processing and Web Mining (IIPWM'06), pp. 307-315, 2006.

6. P. Chordia and A. Rae, "Understanding Emotion in Raag: An Empirical Study of Listener Responses," in Computer Music Modeling and Retrieval. Sense of Sounds: 4th International Symposium, CMMR 2007, Copenhagen, Denmark, Aug. 27-31, 2007.

7. How does a raga make you feel? (http://www.karnatik.com/rasas.shtml)

8. O. Lartillot and P. Toiviainen, "MIR in Matlab (II): A Toolbox for Musical Feature Extraction From Audio," International Conference on Music Information Retrieval, Vienna, 2007.

9. Mark Hall, Eibe Frank, Geoffrey Holmes, Bernhard Pfahringer, Peter Reutemann, and Ian H. Witten, "The WEKA Data Mining Software: An Update," SIGKDD Explorations, vol. 11, issue 1, 2009.