A Unified Music Recommender System Using
Users’ Listening Habits and Semantics of Tags
Hyon Hee Kim
Department of Statistics and Information Science,
Dongduk Women’s University
Outline
• Motivation & Objectives
• Overview of the System
• Generation of User Profiles
• A Unified Music Recommendation
• Performance Evaluation
• Related Work
• Conclusions and Future Work
Motivation (1/3)
• In a Social Music Site
– Music recommendation is essential.
– Music recommendation is different from other product recommendation
• Explicit information: rating system
• Implicit information: the number of plays
• Listening habits-based User Profiling
– Cold Start Problem
• New users with little information
• New items with only a few ratings
– Data Sparsity Problem
• The available data are very small compared to the number of music items
Motivation (2/3)
• Collaborative Tagging
– A tool for users to represent their preferences about web resources
– Users add freely chosen keywords to web resources
– Using tag data for user profiling in personalized recommender systems
• Tag-based User Profiling
– Tags are easily added, even without listening to the music
– Tags are semantically meaningful
[Figure: example tags: classic rock, british, pop, rock]
Motivation (3/3)
• In the case of last.fm
• Factual Tags
– 85% of tags
– genre, region, instrumentation
• Emotional Tags
– 10% of tags
– opinion, sentiment, mood
• Personal Tags
– 5% of tags
– to organize, to browse, etc.
Objectives
• A Novel Approach to Music Recommendation
– Combining listening habits and semantics of tags
• Using a Tag Ontology and an Emotion Ontology
– UniTag: Resolving semantic ambiguity of tags
– UniEmotion: Assigning weighted values to the emotional tags
→ Semantically Enhanced Music Recommendation
Overview of the System
Outline
• Motivation & Objectives
• Overview of the System
• Tag-based User Profiling
– Preprocessing of tags
– Algorithms for generating user profiles
– Preliminary experimental results
• A Unified Music Recommendation
• Performance Evaluation
• Related Work
• Conclusions and Future Work
Preprocessing of Tags (1/3)
• A tag does not have any pre-defined term or hierarchies of a term
• Problems of tag data
– Synonymy
• Different words represent the same meaning
• E.g., hiphop, hip-hop, hip hop/ R & B, Rhythm and Blues, Blues
– Polysemy
• A single word contains multiple meanings
• E.g., French => French rock, French pop, French artist
– Spelling variants
• Misspellings
• Foreign-language spellings
Preprocessing of Tags (2/3)
• Tag Ontology
– Tags, users, items
• UniTag Ontology
– uniTag:Users
• uniTag:userID, uniTag:hasAdded, uniTag:hasAddedTo
– uniTag:Items
• uniTag:itemID
– uniTag:Tags
• uniTag:tagID, uniTag:tagName, uniTag:RTag, uniTag:subTag,
• uniTag:Rtags {rock, hiphop, electronic, metal, jazz, rap, funk, folk, blues, reggae}
• uniTag:classifiedAs, uniTag:isKindOf, uniTag:istheSameAs, uniTag:tagVariation
Preprocessing of Tags (3/3)
• Rules for reasoning about prefixes
– French rock, progressive rock, post rock => rock
Tag(?t) ^ tagPrefix(?t, ?p) ^ Prefix(?p) ^ subTag(?t, ?s) ^ Rtags(?s) -> classifiedAs(?t, ?s)
• Rules for reasoning with expert knowledge
– Soul => rhythm and blues, and rhythm and blues => blues; therefore Soul => blues
Tag(?t) ^ isKindOf(?t, ?A) ^ isKindOf(?A, ?B) -> isKindOf(?t, ?B)
• Rules for reasoning about synonyms
– Hip-hop, hiphop => hip hop
Tag(?t) ^ tagVariation(?t, ?R) ^ istheSameAs(?t, ?s) -> tagVariation(?s, ?R)
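The effect of the three rule families can be illustrated with a minimal Python sketch. It is not the ontology reasoner used in the system: the lookup tables SAME_AS and IS_KIND_OF and the function names are hypothetical stand-ins for ontology facts, and the prefix rule is approximated by looking at the last word of a tag.

```python
import re

# The ten representative tags (RTags) listed on the slides.
RTAGS = {"rock", "hiphop", "electronic", "metal", "jazz",
         "rap", "funk", "folk", "blues", "reggae"}

# Hypothetical lookup tables standing in for ontology facts.
SAME_AS = {"hip-hop": "hiphop", "hip hop": "hiphop",
           "rhythm and blues": "r&b", "r & b": "r&b"}
IS_KIND_OF = {"soul": "r&b", "r&b": "blues"}


def canonical(tag):
    """Synonym rule: lower-case, collapse whitespace, map variants to one spelling."""
    t = re.sub(r"\s+", " ", tag.strip().lower())
    return SAME_AS.get(t, t)


def find_rtag(tag):
    """Return the representative tag for `tag`, or None if no rule applies."""
    t = canonical(tag)
    seen = set()
    # Expert-knowledge rule: follow isKindOf links transitively.
    while t not in RTAGS and t in IS_KIND_OF and t not in seen:
        seen.add(t)
        t = canonical(IS_KIND_OF[t])
    if t in RTAGS:
        return t
    # Prefix rule (approximation): "<prefix> <sub-tag>" is classified as the sub-tag.
    last = canonical(t.split(" ")[-1])
    return last if last in RTAGS else None
```

With these assumed tables, "French rock" and "post rock" resolve to rock, "Hip-Hop" resolves to hiphop, and "Soul" resolves to blues through two isKindOf steps.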
Algorithm for Generating User Profiles (1/2)
Algorithm 1. Generation of a Tag-based Profile
Input: set of representative tags Tr, set of a user's tags Tu
Output: frequency of each representative tag for the user, FTr
var RTags[] = {rock, hiphop, electronic, metal, jazz, rap, funk, folk, blues, reggae}
var tagFrequency[] = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0}
var RTag = null
while ∃ next tag t in Tu do
    RTag = FindRTag(t)
    if RTag == RTags[i] for some index i then
        tagFrequency[i] = tagFrequency[i] + 1
endwhile
return tagFrequency

       rock  hiphop  electronic  metal  jazz  rap  funk  folk  blues  reggae
user1     6       2           2      3     2    4     3     1      1       1
user2     5       0           0      0     0    0     0     0      1       0
user3     2       2           1      1     1    1     2     0      0       1
user4    10       1           0      1     2    0     2     3      3       1
user5     1       4           0      0     0    4     1     0      0       0
Table 1. An example of tag-based profiles
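Algorithm 1 amounts to counting, per user, how many tags map to each representative tag. A minimal Python sketch follows; the helper last_word_rtag is a hypothetical stand-in for FindRTag (the slides do not show its implementation), and the example tags are invented for illustration.

```python
from collections import Counter

RTAGS = ["rock", "hiphop", "electronic", "metal", "jazz",
         "rap", "funk", "folk", "blues", "reggae"]


def tag_profile(user_tags, find_rtag):
    """Count how many of a user's tags map to each representative tag."""
    counts = Counter()
    for tag in user_tags:
        rtag = find_rtag(tag)
        if rtag in RTAGS:          # tags with no representative tag are skipped
            counts[rtag] += 1
    return [counts[r] for r in RTAGS]


def last_word_rtag(tag):
    """Hypothetical stand-in for FindRTag: use the last word of the tag."""
    words = tag.lower().split()
    word = words[-1] if words else ""
    return word if word in RTAGS else None


# Invented example tags, not taken from the data set.
profile = tag_profile(
    ["French rock", "post rock", "jazz", "chillout", "Blues"], last_word_rtag
)
```

The returned list follows the column order of Table 1, so each user's list corresponds to one row of the table. The track-based profile of Algorithm 2 has the same shape, with tracks mapped to genres instead of tags mapped to representative tags.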
Algorithm for Generating User Profiles (2/2)
Algorithm 2. Generation of a Track-based Profile
Input: set of tracks of a user TRu, set of representative tags Tr
Output: number of the user's tracks for each representative musical genre, Tn
var RTags[] = {rock, hiphop, electronic, metal, jazz, rap, funk, folk, blues, reggae}
var numTrack[] = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0}
var RTrack = null
while ∃ next track t in TRu do
    RTrack = FindGenre(t)
    if RTrack == RTags[i] for some index i then
        numTrack[i] = numTrack[i] + 1
endwhile
return numTrack

       rock  hiphop  electronic  metal  jazz  rap  funk  folk  blues  reggae
user1    65     176           5      4     0  168     0     3      0       0
user2   411       8          11    109     3    5     8     1      0       0
user3   157       7          11     10     6    2     1    39      4       2
user4   257      20           9     18     2    5     0     9      0       0
user5   110     277          15      8     6   85    10     3      2       7
Table 2. An example of track-based profiles
Preliminary Experimental Results (1/3)
• 1,000 user data set from Last.fm
– Users, tags, music items
• Standardization
– To remove the effect of differing preference magnitudes
• K-Means clustering algorithm
– Canopy Clustering
– 6 centroid points and 6 clusters
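The standardization step can be sketched as a column-wise z-score. This is an assumption: the slides do not state which standardization was applied, whether it was per column or per user, or whether the population or sample standard deviation was used. The clustering step itself (canopy clustering followed by K-means) is not shown here.

```python
from statistics import mean, pstdev


def standardize_columns(rows):
    """Z-score each column: subtract the column mean, divide by its std dev."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sigmas = [pstdev(c) for c in cols]   # population std dev (assumed)
    return [
        [(x - m) / s if s > 0 else 0.0 for x, m, s in zip(row, mus, sigmas)]
        for row in rows
    ]


# First three columns (rock, hiphop, electronic) of the five users in Table 1.
profiles = [[6, 2, 2], [5, 0, 0], [2, 2, 1], [10, 1, 0], [1, 4, 0]]
z = standardize_columns(profiles)
```

After this transformation every column has mean 0 and unit variance, so a genre with large raw counts no longer dominates the distance computation used by the clustering.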
Preliminary Experimental Results (2/3)
              X1      X2      X3      X4      X5      X6      X7      X8      X9     X10
Cluster1   0.241   1.472   0.626   0.130   1.267   1.621   2.168   0.274   1.078   0.381
Cluster2   2.171   0.032   0.517   3.052   0.011  -0.030   0.328   1.533   1.245   0.162
Cluster3  -0.206  -0.273  -0.517  -0.178  -0.180  -0.294  -0.233  -0.171  -0.204  -0.136
Cluster4  -0.341   0.660  -0.459  -0.284  -0.208   1.178  -0.179  -0.321  -0.166   0.273
Cluster5  -0.074  -0.155   1.320  -0.230  -0.115  -0.261  -0.209  -0.070  -0.172  -0.071
Cluster6   2.815   7.640   5.168  -0.136   9.254   6.135   7.000   4.286   4.421   5.254
Table 3. Values of Centers of Tag-based Profiles

              X1      X2      X3      X4      X5      X6      X7      X8      X9     X10
Cluster1  -0.411   0.495   0.406  -0.338   1.565   0.131   1.632  -0.135   0.147   0.812
Cluster2   0.200  -0.444   0.007  -0.341   0.907  -0.468  -0.288   2.617   1.097   0.020
Cluster3  -0.897   1.651  -0.539  -0.442  -0.213   1.836   0.059  -0.507  -0.415   0.034
Cluster4   1.925  -0.590  -0.404   0.852  -0.264  -0.491   0.655  -0.002   2.850  -0.108
Cluster5   0.914  -0.557  -0.216   0.794  -0.296  -0.511  -0.297   0.014  -0.157  -0.147
Cluster6  -0.472  -0.327   0.380  -0.373  -0.184  -0.371  -0.241  -0.205  -0.300  -0.093
Table 4. Values of Centers of Track-based Profiles
Preliminary Experimental Results (3/3)
• Clustering Validity
– Inter-cluster distances
– Distances between all pairs of centroids, using the cosine distance measure
– T-test comparing
• Mean of inter-cluster distances of tag-based profiles
• Mean of inter-cluster distances of track-based profiles

                       N   Mean    Std Dev  t     p-value
Tag-based profiles     15  0.8325  0.6834   2.55  0.0165
Track-based profiles   15  0.3785  0.0885
Table 5. T-test result for the means of inter-cluster distances
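The validity check can be sketched in Python as follows. Six centroids give 15 centroid pairs, which matches N = 15 in Table 5. The function names are illustrative, cosine distance is taken as 1 minus cosine similarity (an assumption about the exact definition used), and the t statistic is computed from the summary values in Table 5 rather than from the raw distances.

```python
import math
from itertools import combinations


def cosine_distance(a, b):
    """1 - cosine similarity; assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (na * nb)


def inter_cluster_distances(centroids):
    """Cosine distance for every unordered pair of centroids."""
    return [cosine_distance(a, b) for a, b in combinations(centroids, 2)]


def t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic from means, standard deviations and sizes."""
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / se


# Summary values from Table 5.
t = t_from_summary(0.8325, 0.6834, 15, 0.3785, 0.0885, 15)
```

Rounded to two decimals, this reproduces the reported t = 2.55. Because both groups have the same size, the pooled and unpooled forms of the statistic coincide here.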
Outline
• Motivation & Objectives
• Overview of the System
• Generation of User Profiles
• A Unified Music Recommendation
– UniEmotion Ontology
– Generation of User Profiles
– Music Recommendation Algorithm
• Performance Evaluation
• Related Work
• Conclusions and Future Work
UniEmotion Ontology (1/5)
[Plutchik’s model]
UniEmotion Ontology (2/5)
• Definition of the intensity of emotional tags
• SentiWordNet, http://sentiwordnet.isti.cnr.it/
[Figure: example SentiWordNet entries with positive (P), objective (O), and negative (N) scores]
P: 0.625, O: 0.25, N: 0.125
P: 0.375, O: 0.625, N: 0
P: 1.0, O: 0, N: 0
UniEmotion Ontology (3/5)
• Intensity of emotional tags
– Strong
• Positive value >= 0.75 or Negative value >= 0.75
– Middle
• 0.25 <= Positive value < 0.75 or
• 0.25 <= Negative value < 0.75
– Weak
• Positive value < 0.25 and Negative value < 0.25
UniEmotion Ontology (4/5)
• Assigning the weights to the tags
– Factual tags: 1
– Positive tags
• Strong: 2.5
• Middle: 2
• Weak: 1.5
– Negative tags
• Strong: -2.5
• Middle: -2
• Weak: -1.5
• Final score of an item => sum of the weights
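The intensity thresholds and the weighting scheme can be combined in a short Python sketch. The function names and the tuple layout (kind, positive score, negative score) are illustrative assumptions; a score of exactly 0.75 is treated as Strong, and the tag's class (factual, positive, negative) is assumed to be known in advance, for example from the ontology.

```python
WEIGHTS = {"strong": 2.5, "middle": 2.0, "weak": 1.5}


def intensity(pos, neg):
    """Map SentiWordNet-style positive/negative scores to an intensity level."""
    if pos >= 0.75 or neg >= 0.75:
        return "strong"
    if pos < 0.25 and neg < 0.25:
        return "weak"
    return "middle"


def tag_weight(kind, pos=0.0, neg=0.0):
    """kind is 'factual', 'positive' or 'negative' (the tag's class)."""
    if kind == "factual":
        return 1.0
    w = WEIGHTS[intensity(pos, neg)]
    return w if kind == "positive" else -w


def item_score(tags):
    """Final score of an item: the sum of the weights of its tags."""
    return sum(tag_weight(*t) for t in tags)


# Hypothetical item with one factual, two positive and one negative tag.
score = item_score([
    ("factual",),
    ("positive", 0.625, 0.125),   # middle  -> +2.0
    ("positive", 1.0, 0.0),       # strong  -> +2.5
    ("negative", 0.0, 0.8),       # strong  -> -2.5
])
```

For this invented item the weights are 1 + 2 + 2.5 - 2.5, so the final score is 3.0.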
UniEmotion Ontology (5/5)
• Two classes
– UniEmotion:Positive
• Emotional tags belonging to the positive emotional categories
• trust, surprise, anticipation, and happiness
– UniEmotion:Negative
• Emotional tags belonging to the negative emotional categories
• disgust, anger, fear, and sadness
• Two properties
– UniEmotion:Intensity
• Specifying the intensity of tags
– UniEmotion:Weight
• Specifying the weight of tags
Generation of User Profiles (1/2)
1. Listening habits-based User Profiles
– U1 = {u1, u2, …, um}, I1 = {i1, i2, …, in}
– <u, i, n>
• n: number of plays of item i by user u
2. Tag score-based User Profiles
– U2 = {u1, u2, …, um}, I2 = {i1, i2, …, in}
– <u, i, s>
• s: score of the tags assigned by the UniEmotion ontology
3. Hybrid User Profiles
– U3 = {u1, u2, …, um}, I3 = I1 ∩ I2
– <u, i, m>
• m = α * n + (1 - α) * s, with α = 0.5
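The hybrid profile can be sketched for a single user as a weighted sum over the items present in both profiles. The item names and values below are hypothetical, and the sketch combines the raw values directly: the slides do not say whether play counts and tag scores are rescaled to a common range first.

```python
def hybrid_profile(plays, tag_scores, alpha=0.5):
    """Combine two {item: value} maps on their common items.

    m = alpha * n + (1 - alpha) * s, where n is the play count and
    s is the tag score of the same item.
    """
    common = plays.keys() & tag_scores.keys()      # I3 = I1 ∩ I2
    return {i: alpha * plays[i] + (1 - alpha) * tag_scores[i] for i in common}


# Hypothetical data for one user.
plays = {"track_a": 12, "track_b": 3, "track_c": 7}            # n
tag_scores = {"track_a": 4.5, "track_c": -1.5, "track_d": 2}   # s
m = hybrid_profile(plays, tag_scores)
```

Only track_a and track_c appear in both inputs, so the hybrid profile contains exactly those two items.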
Generation of User Profiles (2/2)
1. Listening habits-based user profiles
2. Tag score-based user profiles
3. Hybrid user profiles
Music Recommendation Algorithm (1/2)
• Finding Similar Users
– Pearson Correlation Similarity
• Calculating scores of items
– Considering the similar users' ratings
• Recommending top n items
Music Recommendation Algorithm (2/2)
Input: a set of user profiles UP
Output: a set of recommended items RI
1. For all yi ∈ U
     Compute a similarity s between x and yi.
2. Sort by similarity
3. Select top n neighbors
4. [step lost in extraction]
5. For all [symbol lost in extraction]
     Compute a similarity t between x and [symbol lost in extraction]
     For all [symbol lost in extraction]
       preference += t * pref
6. Rank by preference
7. Select top n items
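The overall flow (Pearson similarity, top-n neighbors, similarity-weighted preference sums, top-n items) can be sketched as a generic user-based collaborative filter in Python. This is a sketch under assumptions, not the Apache Mahout implementation used in the system: the parts of the algorithm lost from the slide are filled with common choices (only unseen items are scored, neighbors with non-positive similarity are ignored), and all profile data are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation over the items two users have in common."""
    common = sorted(a.keys() & b.keys())
    if len(common) < 2:
        return 0.0
    xs = [a[i] for i in common]
    ys = [b[i] for i in common]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def recommend(target, profiles, n_neighbors=2, n_items=3):
    """Recommend items the target user has not seen yet."""
    x = profiles[target]
    sims = [(pearson(x, p), u) for u, p in profiles.items() if u != target]
    neighbors = sorted(sims, reverse=True)[:n_neighbors]
    scores = {}
    for t, u in neighbors:
        if t <= 0:                      # assumed: skip dissimilar users
            continue
        for item, pref in profiles[u].items():
            if item not in x:           # assumed: score unseen items only
                scores[item] = scores.get(item, 0.0) + t * pref
    ranked = sorted(scores, key=lambda i: (-scores[i], i))
    return ranked[:n_items]


# Hypothetical user profiles {user: {item: preference}}.
profiles = {
    "u1": {"a": 5, "b": 3, "c": 1},
    "u2": {"a": 4, "b": 3, "c": 1, "d": 5},
    "u3": {"a": 1, "b": 3, "c": 5, "e": 4},
    "u4": {"a": 5, "b": 2, "c": 1, "d": 3, "f": 4},
}
top = recommend("u1", profiles)
```

In this toy data u2 and u4 are the two nearest neighbors of u1, while u3 is perfectly anti-correlated, so item "d" (liked by both neighbors) ranks ahead of "f" and item "e" is never suggested.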
Performance Evaluation
• Implementation Environment: Apache Web Server
– User database : MySQL 5.0
– Listening habits collector, tag score generator: PHP
– Recommendation Engine: Apache Mahout
– UniTag and UniEmotion Ontology: JDK6.0
• Experimental Data
– Data for 1,000 users from last.fm [http://mir.dcs.gla.ac.uk/]
– Containing 18,700 artists and 12,600 tags
– 70% training data, 30% test data
Performance Evaluation
• Evaluation Model
– Recommended items
• Items which users are interested in (True Positive, TP)
• Items which users are not interested in (False Positive, FP)
– Items which are not recommended
• Items which users are interested in (False Negative, FN)
• Items which users are not interested in (True Negative, TN)
– Precision P = TP / (TP + FP)
• # of correct recommendations / # of all recommended items
– Recall R = TP / (TP + FN)
• # of correct recommendations / # of preferred items
– F-measure F = 2 * P * R / (P + R)
• Harmonic mean of precision and recall
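The three measures can be computed from a recommendation list and the set of items a user is actually interested in. The following Python sketch uses hypothetical item identifiers; returning 0.0 when a denominator is zero is a convention chosen here, not something stated on the slides.

```python
def evaluate(recommended, relevant):
    """Return (precision, recall, f_measure) for one user."""
    recommended, relevant = set(recommended), set(relevant)
    tp = len(recommended & relevant)     # recommended and of interest
    fp = len(recommended - relevant)     # recommended but not of interest
    fn = len(relevant - recommended)     # of interest but not recommended
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical example: 4 recommended items, 3 items of interest, 2 in common.
p, r, f = evaluate(["a", "b", "c", "d"], ["b", "d", "e"])
```

Here TP = 2, FP = 2 and FN = 1, giving a precision of 2/4, a recall of 2/3 and an F-measure of 4/7.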
Experimental Results (1/3)
• Precision
[Figures: precision vs. number of similar users; precision vs. number of recommended items]
A: Listening habits-based approach
B: Tag-based approach
C: Hybrid approach
Experimental Results (2/3)
• Recall
[Figures: recall vs. number of similar users; recall vs. number of recommended items]
A: Listening habits-based approach
B: Tag-based approach
C: Hybrid approach
Experimental Results (3/3)
• F-measure
[Figures: F-measure vs. number of similar users; F-measure vs. number of recommended items]
A: Listening habits-based approach
B: Tag-based approach
C: Hybrid approach
Statistical Validation
• One-way ANOVA across three groups
– Method 1: listening habits-based approach
– Method 2: tag-based approach
– Method 3: hybrid approach
• Tukey Multiple Comparison Test
– Asymmetric distributions
• Log transformation
– Different letters (A, B, C) mark groups that differ significantly

Method                   1              2              3              F
Mean of log(precision)   -3.962 B       -4.036 B       -2.879 A       34.27***
Mean precision (SD)      0.020 (0.006)  0.020 (0.009)  0.068 (0.040)
N                        24             24             24
<Table 1. Test for precision> ***: p < 0.001

Method                   1              2              3              F
Mean of log(recall)      -3.285 B       -4.099 C       -2.635 A       26.80***
Mean recall (SD)         0.044 (0.023)  0.019 (0.010)  0.093 (0.056)
N                        24             24             24
<Table 2. Test for recall> ***: p < 0.001

Method                   1              2              3              F
Mean of log(F-measure)   -3.748 B       -4.117 C       -2.894 A       41.31***
Mean F-measure (SD)      0.024 (0.006)  0.018 (0.008)  0.06 (0.034)
N                        24             24             24
<Table 3. Test for F-measure> ***: p < 0.001
Related Work
• MusicBox
– A personalized music recommender system based on social tags
– Third-order tensor model
– The method improves the recommendation quality
• Foafing the Music
– Collecting music information in a semantic web environment
– User information, music information, concert information
– Recommendation of similar music items
• OntoEmotions
– An ontology of emotional categories covering the basic emotions
– Armeteo art portal
– New relations can be inferred by reasoning on the ontology of emotions
Conclusions
• Solution to Cold Start Problem
– It takes time to collect users’ listening habits.
– Adding tags is easy
– Tags act like word-of-mouth
• Performance Enhancement
– Precision, Recall, F-measure
– Hybrid approach > listening habits-based approach, tag-based approach
Future Work
• Elaborating UniEmotion Ontology
– Emerging Internet slang
• Item Selection
– Product Network Analysis Considering Tags
– Analyzing short descriptions

TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 

Data science-2013-heekim

  • 1. A Unified Music Recommender System Using Users’ Listening Habits and Semantics of Tags Hyon Hee Kim Department of Statistics and Information Science, Dongduk Women’s University
  • 2. Outline • Motivation & Objectives • Overview of the System • Generation of User Profiles • A Unified Music Recommendation • Performance Evaluation • Related Work • Conclusions and Future Work
  • 3. Motivation (1/3) • In a Social Music Site – Music recommendation is essential. – Music recommendation differs from other product recommendation • Explicit information: rating system • Implicit information: the number of plays • Listening habits-based User Profiling – Cold Start Problem • New users with little information • New items with only a few ratings – Data Sparsity Problem • Available data is very small compared to the number of music items
  • 4. Motivation (2/3) • Collaborative Tagging – A tool for users to represent their preferences about web resources – Users add freely chosen keywords to web resources – Tag data can be used for user profiling in personalized recommender systems • Tag-based User Profiling – Tags are easy to add, even without listening to the music – Tags are semantically meaningful • Example tags: classic rock, british, pop, rock
  • 5. Motivation (3/3) • In the case of last.fm • Factual Tags – 85% of tags – genre, region, instrumentation • Emotional Tags – 10% of tags – opinion, sentiment, mood • Personal Tags – 5% of tags – to organize, to browse, etc.
  • 6. Objectives • A Novel Approach to Music Recommendation – Combining listening habits and semantics of tags • Using a Tag Ontology and an Emotion Ontology – UniTag: Resolving semantic ambiguity of tags – UniEmotion: Assigning weighted values to the emotional tags → Semantically Enhanced Music Recommendation
  • 7. Outline • Motivation & Objectives • Overview of the System • Generation of User Profiles • A Unified Music Recommendation • Performance Evaluation • Related Work • Conclusions and Future Work
  • 9. Outline • Motivation & Objectives • Overview of the System • Tag-based User Profiling – Preprocessing of tags – Algorithms for generating user profiles – Preliminary experimental results • A Unified Music Recommendation • Performance Evaluation • Related Work • Conclusions and Future Work
  • 10. Preprocessing of Tags (1/3) • Tags have no predefined vocabulary or term hierarchy • Problems of tag data – Synonymy • Different words represent the same meaning • E.g., hiphop, hip-hop, hip hop / R & B, Rhythm and Blues, Blues – Polysemy • A single word has multiple meanings • E.g., French => French rock, French pop, French artist – Spelling variants • Misspellings • Foreign languages
  • 11. Preprocessing of Tags (2/3) • Tag Ontology – Tags, users, items • UniTag Ontology – uniTag:Users • uniTag:userID, uniTag:hasAdded, uniTag:hasAddedTo – uniTag:Items • uniTag:itemID – uniTag:Tags • uniTag:tagID, uniTag:tagName, uniTag:RTag, uniTag:subTag, • uniTag:Rtags {rock, hiphop, electronic, metal, jazz, rap, funk, folk, blues, reggae} • uniTag:classifiedAs, uniTag:isKindOf, uniTag:istheSameAs, uniTag:tagVariation
  • 12. Preprocessing of Tags (3/3)
      • Rules for reasoning prefix
          – French rock, progressive rock, post rock => rock
          – Tag(?t) ^ tagPrefix(?t, ?p) ^ Prefix(?p) ^ subTag(?t, ?s) ^ Rtags(?s) -> classifiedAs(?t, ?s)
      • Rules for reasoning expert knowledge
          – Soul => rhythm and blues, rhythm and blues => blues, then Soul => blues
          – Tag(?t) ^ isKindOf(?t, ?A) ^ isKindOf(?A, ?B) -> isKindOf(?t, ?B)
      • Rules for reasoning synonym
          – Hip-hop, hiphop => hip hop
          – Tag(?t) ^ tagVariation(?t, ?R) ^ istheSameAs(?t, ?s) -> tagVariation(?s, ?R)
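The three rule types on this slide (synonym, expert knowledge, prefix) can be illustrated with a small Python sketch. This is not the UniTag ontology or its reasoner: the dictionaries `SAME_AS` and `IS_KIND_OF` and the function `classify` are hypothetical stand-ins, the vocabularies are limited to the examples named in the slides, and spelling variants are mapped to `hiphop` (rather than `hip hop`) only so that they match the representative tag list.

```python
# Representative tags as listed on slide 11.
R_TAGS = {"rock", "hiphop", "electronic", "metal", "jazz",
          "rap", "funk", "folk", "blues", "reggae"}

# Synonym rule: spelling variants mapped to one canonical form (assumed table).
SAME_AS = {"hip-hop": "hiphop", "hip hop": "hiphop",
           "r & b": "rhythm and blues"}

# Expert-knowledge rule: direct isKindOf edges, child -> parent (assumed table).
IS_KIND_OF = {"soul": "rhythm and blues", "rhythm and blues": "blues"}


def classify(tag):
    """Return the representative tag for `tag`, or None if no rule applies."""
    t = tag.strip().lower()
    t = SAME_AS.get(t, t)
    seen = set()
    # Transitive isKindOf: follow edges until a representative tag is reached.
    while t not in R_TAGS and t in IS_KIND_OF and t not in seen:
        seen.add(t)
        t = IS_KIND_OF[t]
    if t in R_TAGS:
        return t
    # Prefix rule: "french rock" -> "rock" when the last word is representative.
    words = t.split()
    last = words[-1] if words else ""
    return last if last in R_TAGS else None
```

For example, `classify("French rock")` gives `"rock"` and `classify("Soul")` gives `"blues"` via the two isKindOf steps.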
  • 13. Algorithm for Generating User Profiles (1/2)
      Algorithm 1. Generation of a Tag-based Profile
      Input: set of representative tags Tr, set of a user’s tags Tu
      Output: frequency of each representative tag for the user, FTr
        var RTags[] = {rock, hiphop, electronic, metal, jazz, rap, funk, folk, blues, reggae}
        var tagFrequency[] = {0, …, 0}
        var RTag = null
        while ∃ next tag t in Tu do
          RTag = FindRTag(t)
          for each index i of RTags do
            if RTag == RTags[i] then tagFrequency[i] = tagFrequency[i] + 1
        endwhile
        FTr = tagFrequency

      Table 1. An example of tag-based profiles
               rock  hiphop  electronic  metal  jazz  rap  funk  folk  blues  reggae
        user1     6       2           2      3     2    4     3     1      1       1
        user2     5       0           0      0     0    0     0     0      1       0
        user3     2       2           1      1     1    1     2     0      0       1
        user4    10       1           0      1     2    0     2     3      3       1
        user5     1       4           0      0     0    4     1     0      0       0
  • 14. Algorithm for Generating User Profiles (2/2)
      Algorithm 2. Generation of a Track-based Profile
      Input: set of tracks of a user TRu, set of representative tags Tr
      Output: number of the user’s tracks for each representative musical genre, Tn
        var RTags[] = {rock, hiphop, electronic, metal, jazz, rap, funk, folk, blues, reggae}
        var numTrack[] = {0, …, 0}
        var RTrack = null
        while ∃ next track t in TRu do
          RTrack = FindGenre(t)
          for each index i of RTags do
            if RTrack == RTags[i] then numTrack[i] = numTrack[i] + 1
        endwhile
        Tn = numTrack

      Table 2. An example of track-based profiles
               rock  hiphop  electronic  metal  jazz  rap  funk  folk  blues  reggae
        user1    65     176           5      4     0  168     0     3      0       0
        user2   411       8          11    109     3    5     8     1      0       0
        user3   157       7          11     10     6    2     1    39      4       2
        user4   257      20           9     18     2    5     0     9      0       0
        user5   110     277          15      8     6   85    10     3      2       7
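Algorithms 1 and 2 share one counting loop: map each tag (or track) to a representative tag and increment that tag's counter. A minimal Python sketch follows. The function name `build_profile` and the `find_rtag` parameter are hypothetical; `find_rtag` stands in for the slides' FindRTag or FindGenre, whose implementations are not shown in the deck.

```python
from collections import Counter

# Representative tags, in the column order used by Tables 1 and 2.
R_TAGS = ["rock", "hiphop", "electronic", "metal", "jazz",
          "rap", "funk", "folk", "blues", "reggae"]


def build_profile(items, find_rtag):
    """Count how many of `items` map to each representative tag.

    `items` is a user's tags (Algorithm 1) or tracks (Algorithm 2);
    `find_rtag` is a caller-supplied mapping function (assumed here).
    """
    counts = Counter()
    for item in items:
        rtag = find_rtag(item)
        if rtag in R_TAGS:          # items with no representative tag are skipped
            counts[rtag] += 1
    return [counts[r] for r in R_TAGS]


# Toy mapping for illustration only: use the last word of the tag.
def last_word(tag):
    return tag.split()[-1]


profile = build_profile(["french rock", "post rock", "jazz", "unknown"], last_word)
```

Returning a fixed-order list keeps each profile directly comparable with the rows of Tables 1 and 2.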
  • 15. Preliminary Experimental Results (1/3) • 1,000 user data set from Last.fm – Users, tags, music items • Standardization – To remove extensive preference • K-Means clustering algorithm – Canopy Clustering – 6 centroid points and 6 clusters
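The standardization step can be sketched as a column-wise z-score, so that a user with very large counts in one genre does not dominate the distance computation. Whether the study standardized per column in exactly this way is an assumption; the function name `standardize` is hypothetical, and the clustering step itself (K-Means with canopy initialization) is not reproduced here.

```python
from statistics import mean, pstdev


def standardize(profiles):
    """Column-wise z-scores for a list of equal-length numeric profiles."""
    columns = list(zip(*profiles))
    means = [mean(c) for c in columns]
    stdevs = [pstdev(c) for c in columns]
    return [
        # A constant column carries no information; map it to 0.0.
        [(v - m) / s if s > 0 else 0.0 for v, m, s in zip(row, means, stdevs)]
        for row in profiles
    ]


z = standardize([[1, 10], [3, 10]])
```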
  • 16. Preliminary Experimental Results (2/3)
      Table 3. Values of Centers of Tag-based Profiles
                     X1      X2      X3      X4      X5      X6      X7      X8      X9     X10
        Cluster1   0.241   1.472   0.626   0.130   1.267   1.621   2.168   0.274   1.078   0.381
        Cluster2   2.171   0.032   0.517   3.052   0.011  -0.030   0.328   1.533   1.245   0.162
        Cluster3  -0.206  -0.273  -0.517  -0.178  -0.180  -0.294  -0.233  -0.171  -0.204  -0.136
        Cluster4  -0.341   0.660  -0.459  -0.284  -0.208   1.178  -0.179  -0.321  -0.166   0.273
        Cluster5  -0.074  -0.155   1.320  -0.230  -0.115  -0.261  -0.209  -0.070  -0.172  -0.071
        Cluster6   2.815   7.640   5.168  -0.136   9.254   6.135   7.000   4.286   4.421   5.254

      Table 4. Values of Centers of Track-based Profiles
                     X1      X2      X3      X4      X5      X6      X7      X8      X9     X10
        Cluster1  -0.411   0.495   0.406  -0.338   1.565   0.131   1.632  -0.135   0.147   0.812
        Cluster2   0.200  -0.444   0.007  -0.341   0.907  -0.468  -0.288   2.617   1.097   0.020
        Cluster3  -0.897   1.651  -0.539  -0.442  -0.213   1.836   0.059  -0.507  -0.415   0.034
        Cluster4   1.925  -0.590  -0.404   0.852  -0.264  -0.491   0.655  -0.002   2.850  -0.108
        Cluster5   0.914  -0.557  -0.216   0.794  -0.296  -0.511  -0.297   0.014  -0.157  -0.147
        Cluster6  -0.472  -0.327   0.380  -0.373  -0.184  -0.371  -0.241  -0.205  -0.300  -0.093

      • Clustering Validity
          – Inter-cluster distances
          – Distances between all pairs of centroids using the cosine distance measure
  • 17. Preliminary Experimental Results (3/3)
      • T-test
          – Mean of inter-cluster distances of tag-based profiles
          – Mean of inter-cluster distances of track-based profiles

      Table 5. T-test result for the means of inter-cluster distances
                                N     Mean    Std Dev     t     p-value
        Tag-based profiles     15    0.8325    0.6834    2.55    0.0165
        Track-based profiles   15    0.3785    0.0885
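The clustering-validity measure, cosine distance between every pair of centroids, can be sketched in a few lines of Python. With 6 centroids there are 15 pairs, which matches the N = 15 reported in the t-test table. The function names are hypothetical, and cosine distance is taken here as 1 minus cosine similarity, which is an assumption about the exact definition used.

```python
import math
from itertools import combinations


def cosine_distance(a, b):
    """1 - cosine similarity; undefined (division by zero) for a zero vector."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def inter_cluster_distances(centroids):
    """Cosine distance for every unordered pair of centroids."""
    return [cosine_distance(a, b) for a, b in combinations(centroids, 2)]


# Six toy 2-D centroids (illustrative values, not the study's data).
toy = [[1, 0], [0, 1], [1, 1], [2, 0], [0, 3], [1, 2]]
distances = inter_cluster_distances(toy)
```

A larger mean inter-cluster distance indicates better-separated clusters, which is the quantity compared between tag-based and track-based profiles.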
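The reported t statistic can be recomputed from the summary statistics in the table. The slide does not say which t-test variant was used; the sketch below assumes a two-sample test with pooled variance (28 degrees of freedom for two groups of 15). The function name `pooled_t` is hypothetical. The p-value is not computed here because it needs the t distribution's CDF, which is not in the Python standard library.

```python
import math


def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Two-sample t statistic with pooled variance, from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_err, df


# Values taken from the t-test table above.
t_stat, df = pooled_t(15, 0.8325, 0.6834, 15, 0.3785, 0.0885)
```

With these inputs the statistic comes out at about 2.55, in agreement with the table.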
  • 18. Outline • Motivation & Objectives • Overview of the System • Generation of User Profiles • A Unified Music Recommendation – UniEmotion Ontology – Generation of User Profiles – Music Recommendation Algorithm • Performance Evaluation • Related Work • Conclusions and Future Work
  • 20. UniEmotion Ontology (2/5) • Definition of the intensity of emotional tags • SentiWordNet, http://sentiwordnet.isti.cnr.it/ • Example scores (P: positive, O: objective, N: negative): P: 0.625, O: 0.25, N: 0.125; P: 0.375, O: 0.625, N: 0; P: 1.0, O: 0, N: 0
  • 21. UniEmotion Ontology (3/5) • Intensity of emotional tags – Strong • Positive value >= 0.75 or Negative value>= 0.75 – Middle • 0.25 <= Positive value <= 0.75 or • 0.25 <= Negative value <= 0.75 – Weak • Positive value < 0.25 and Negative value < 0.25
  • 22. UniEmotion Ontology (4/5) • Assigning the weights to the tags – Factual tags: 1 – Positive tags • Strong: 2.5 • Middle: 2 • Weak: 1.5 – Negative tags • Strong: -2.5 • Middle: -2 • Weak: -1.5 • Final score of an item => sum of the weights
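The intensity thresholds and tag weights from the two slides above can be combined into a small scoring sketch. Several points are assumptions rather than the author's method: the Strong and Middle ranges overlap at exactly 0.75, and Strong is given precedence here; polarity is taken from whichever SentiWordNet score is larger, whereas the slides assign polarity through the ontology's Positive and Negative classes; and a tag with no sentiment scores is treated as factual. All function names are hypothetical.

```python
# Weights per intensity level, as listed on slide 22.
WEIGHTS = {"strong": 2.5, "middle": 2.0, "weak": 1.5}


def intensity(pos, neg):
    """Map SentiWordNet-style positive/negative scores to an intensity level."""
    strongest = max(pos, neg)
    if strongest >= 0.75:      # assumption: Strong wins at the 0.75 boundary
        return "strong"
    if strongest >= 0.25:
        return "middle"
    return "weak"              # both scores are below 0.25


def tag_weight(pos=None, neg=None):
    """Weight of one tag; tags without sentiment scores count as factual (1)."""
    if pos is None or neg is None:
        return 1.0
    weight = WEIGHTS[intensity(pos, neg)]
    # Assumption: the larger score decides the polarity.
    return weight if pos >= neg else -weight


def item_score(tags):
    """Final score of an item: the sum of the weights of its tags."""
    return sum(tag_weight(pos, neg) for pos, neg in tags)


# One factual tag, one strong positive tag, one middle negative tag.
score = item_score([(None, None), (1.0, 0.0), (0.0, 0.3)])
```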
  • 23. UniEmotion Ontology (5/5) • Two classes – UniEmotion:Positive • Emotional tags belonging to the positive emotional categories • trust, surprise, anticipation, and happiness – UniEmotion:Negative • Emotional tags belonging to the negative emotional categories • disgust, anger, fear, and sadness • Two properties – UniEmotion:Intensity • Specifying the intensity of tags – UniEmotion:Weight • Specifying the weight of tags
  • 24. Generation of User Profiles (1/2) 1. Listening habits-based User Profiles – U1 = {u1, u2, …, um}, I1 = {i1, i2, …, in} – <u, i, n> • n: number of plays 2. Tag score-based User Profiles – U2 = {u1, u2, …, um}, I2 = {i1, i2, …, in} – <u, i, s> • s: score of tags assigned by the UniEmotion ontology 3. Hybrid User Profiles – U3 = {u1, u2, …, um}, I3 = I1 ∩ I2 – <u, i, m> • m = α * n + (1 − α) * s; α = 0.5
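The hybrid profile can be sketched for a single user as a weighted combination over the items present in both profiles. The function name `hybrid_profile` and the dictionary layout (item mapped to value) are assumptions; the slide does not say whether play counts and tag scores are rescaled before mixing, so the raw values are combined as written.

```python
def hybrid_profile(plays, tag_scores, alpha=0.5):
    """Combine one user's play counts and tag scores over their common items.

    plays:      {item: number of plays}
    tag_scores: {item: tag score}
    Returns {item: alpha * plays + (1 - alpha) * tag score}.
    """
    common_items = plays.keys() & tag_scores.keys()   # I3 = I1 ∩ I2
    return {
        item: alpha * plays[item] + (1 - alpha) * tag_scores[item]
        for item in common_items
    }


hybrid = hybrid_profile({"a": 10, "b": 4}, {"b": 2.0, "c": 5.0})
```

Only item `b` appears in both inputs, so the result holds a single entry with value 0.5 * 4 + 0.5 * 2.0 = 3.0.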
  • 25. Generation of User Profiles (2/2) 1. Listening habits-based User profiles 2. Tag score-based User profiles 3. Hybrid User profiles
  • 26. Music Recommendation Algorithm (1/2) • Finding Similar Users – Pearson Correlation Similarity • Calculating scores of items – Considering the similar users’ ratings • Recommending top n items
  • 27. Music Recommendation Algorithm (2/2)
      Input: a set of user profiles UP
      Output: a set of recommended items RI
      1. For all yi ∈ U: compute a similarity s between x and yi.
      2. Sort by similarity.
      3. Select top n neighbors.
      4.–5. For all [...]: compute a similarity t between x and [...]; for all [...]: preference += t * pref.
      6. Rank by preference.
      7. Select top n items.
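The overall flow (Pearson similarity, top-n neighbors, similarity-weighted preference sum, top-n items) can be sketched in plain Python. This is a generic user-based collaborative filtering sketch, not the Apache Mahout implementation used in the study. Steps 4 and 5 are partly lost in the slide text, so the choices below are assumptions: only items the target user has not yet seen are scored, and neighbors with non-positive similarity are ignored. All names are hypothetical.

```python
import math


def pearson(p, q):
    """Pearson correlation over the items two profiles have in common."""
    common = sorted(p.keys() & q.keys())
    if len(common) < 2:
        return 0.0
    xs = [p[i] for i in common]
    ys = [q[i] for i in common]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def recommend(profiles, user, n_neighbors=2, n_items=3):
    """Return up to n_items unseen items, ranked by weighted neighbor preference."""
    target = profiles[user]
    sims = sorted(
        ((pearson(target, prof), other)
         for other, prof in profiles.items() if other != user),
        reverse=True,
    )
    neighbors = [(s, other) for s, other in sims[:n_neighbors] if s > 0]
    scores = {}
    for sim, other in neighbors:
        for item, pref in profiles[other].items():
            if item not in target:
                scores[item] = scores.get(item, 0.0) + sim * pref
    return sorted(scores, key=lambda i: (-scores[i], i))[:n_items]


# Toy profiles: user -> {item: preference value}.
profiles = {
    "x": {"a": 5, "b": 3, "c": 1},
    "y": {"a": 4, "b": 3, "c": 2, "d": 5},
    "z": {"a": 1, "b": 3, "c": 5, "e": 4},
}
result = recommend(profiles, "x")
```

In the toy data, `y` correlates perfectly with `x` while `z` is anti-correlated, so only `y` contributes and its unseen item `d` is recommended.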
  • 28. Outline • Motivation & Objectives • Overview of the System • Generation of User Profiles • A Unified Music Recommendation • Performance Evaluation • Related Work • Conclusions and Future Work
  • 29. Performance Evaluation • Implementation Environment: Apache Web Server – User database: MySQL 5.0 – Listening habits collector, tag score generator: PHP – Recommendation engine: Apache Mahout – UniTag and UniEmotion ontologies: JDK 6.0 • Experimental Data – Data for 1,000 users from last.fm [http://mir.dcs.gla.ac.uk/] – Containing 18,700 artists and 12,600 tags – 70% training data, 30% test data
  • 30. Performance Evaluation • Evaluation Model – Recommended items • Items which users are interested in (True Positive, TP) • Items which users are not interested in (False Positive, FP) – Items which are not recommended • Items which users are interested in (False Negative, FN) • Items which users are not interested in (True Negative, TN) – Precision P = TP / (TP + FP) • # of correct recommendations / # of all recommended items – Recall R = TP / (TP + FN) • # of correct recommendations / # of preferred items – F-measure F = 2 * P * R / (P + R) • Harmonic mean of precision and recall
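The three metrics can be computed directly from the set of recommended items and the set of items the user actually prefers. A minimal sketch follows; the function name `evaluate` is hypothetical, and returning 0.0 when a denominator is zero is a convention chosen here, not something stated in the slides.

```python
def evaluate(recommended, relevant):
    """Return (precision, recall, f_measure) for one user's recommendations."""
    rec, rel = set(recommended), set(relevant)
    tp = len(rec & rel)                       # recommended and preferred
    precision = tp / len(rec) if rec else 0.0   # TP / (TP + FP)
    recall = tp / len(rel) if rel else 0.0      # TP / (TP + FN)
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


p, r, f = evaluate(["a", "b", "c", "d"], ["a", "c", "e"])
```

Here two of the four recommended items are preferred, so precision is 2/4, recall is 2/3, and the F-measure is their harmonic mean, 4/7.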
  • 31. Experimental Results (1/3) • Precisions [Number of similar users] [Number of recommended items] A: Listening habits-based approach B: Tag-based approach C: Hybrid approach
  • 32. Experimental Results (2/3) • Recalls [Number of similar users] [Number of recommended items] A: Listening habits-based approach B: Tag-based approach C: Hybrid approach
  • 33. Experimental Results (3/3) • F-measure [Number of similar users] [Number of recommended items] A: Listening habits-based approach B: Tag-based approach C: Hybrid approach
  • 34. Statistical Validation • One-way ANOVA on three groups – Method 1: listening habits-based approach – Method 2: tag-based approach – Method 3: hybrid approach • Tukey Multiple Comparison Test – Asymmetric distributions • Log transformation – Different letters mark groups that differ significantly
  • 35. ANOVA results (letters A, B, C: Tukey grouping; ***: p < 0.001)
      Table 1. Test for precision
        Method                     1               2               3               F
        Mean of log(prec)         -3.962 B        -4.036 B        -2.879 A        34.27***
        Mean Precision (SD)        0.020 (0.006)   0.020 (0.009)   0.068 (0.040)
        N                          24              24              24

      Table 2. Test for recall
        Method                     1               2               3               F
        Mean of log(recall)       -3.285 B        -4.099 C        -2.635 A        26.80***
        Mean Recall (SD)           0.044 (0.023)   0.019 (0.010)   0.093 (0.056)
        N                          24              24              24

      Table 3. Test for F-measure
        Method                     1               2               3               F
        Mean of log(F-measure)    -3.748 B        -4.117 C        -2.894 A        41.31***
        Mean F-measure (SD)        0.024 (0.006)   0.018 (0.008)   0.06 (0.034)
        N                          24              24              24
  • 36. Related Work • MusicBox – A personalized music recommender system based on social tags – 3-order tensors model – The method improves the recommendation quality • Foafing the music – Collecting music information in a semantic web environment – User information, music information, concert information – Recommendation of similar music items • OntoEmotions – An ontology of emotional categories covering the basic emotions – Armeteo art portal – New relations can be inferred by reasoning on the ontology of emotions
  • 37. Conclusions • Solution to Cold Start Problem – It takes time to collect users’ listening habits. – Adding tags is easily done – Tags look like word-of-mouth • Performance Enhancement – Precision, Recall, F-measure – Hybrid approach > listening habits-based approach, tag-based approach
  • 38. Future Work • Elaborating UniEmotion Ontology – Emerging Internet slang • Item Selection – Product network analysis considering tags – Analyzing short descriptions