#nowplaying Music Dataset: Extracting Listening Behavior from Twitter


  1. #nowplaying Music Dataset: Extracting Listening Behavior from Twitter
     Eva Zangerle, Martin Pichl, Wolfgang Gassler, Günther Specht
     The University of Innsbruck was founded in 1669 and is one of Austria’s oldest universities. Today, with over 28,000 students and 4,000 staff, it is western Austria’s largest institution of higher education and research. For further information visit: www.uibk.ac.at.
  2. Motivation
     • Evaluation of music recommender systems and music information retrieval systems
       • User study (qualitative)
       • Automatic evaluation (quantitative)
     • Evaluation dataset requirements
       • Up-to-date
       • Large size (sparsity!)
       • Publicly available
     • Social media data has hardly been used for such evaluations [Schedl et al., Bertin-Mahieux et al.]
  3. Comparison to other Datasets
     Name         Type          Entries      #Artists  #Tracks    #Users     Upd.
     Celma 1K     User Streams  19,150,819   174,090   1,084,865  992
     Celma 360K   User Streams  17,559,530   292,557   ---        359,349
     MMTD         User Streams  1,086,808    25,060    133,968    15,735
     MSD          Audio         1,000,000    44,745    1,000,000  ---
     MusicMicro   User Streams  594,306      19,529    71,400     136,866
     HetRec       Ratings       92,384       17,632    ---        1,892
     Yahoo!       Ratings       717,872,016  9,441     136,735    1,800,000
  4. Why not use Twitter?
  5. #nowplaying on Twitter
  6. Crawling Data from Twitter
     • Crawl public API for #nowplaying, #np, #listeningto
     • Twitter Spritzer
     • Crawling since 2011/07/11
     • 140 million raw tweets
     Quality? Reference Dataset. Link with other sources?
  7. Cleaning the Dataset
     • Examples:
       Sheryl Crow – If It Makes You #nowplaying
       Sheryl Crow – If it Makes You Happy http://t.co/qNr8zeoQTj
       <-- LIKE THE FACEBOOK PAGE!!! #teamfollowback #follow #instagram #nowplaying @NickiMinaj #PinkFridayTour #Setlist NICKI MINAJ - Pink Friday Tour http://t.co/ifGX8BJ11D #NowPlaying
     Solution: match with MusicBrainz
  8. Server
     • Icinga monitoring
     • D2R mapping
  9. #nowplaying-Dataset
  10. Dataset – Extracted Elements
      • ListeningEvents (Tweets)
      • Geo-information
      • Tracks
      • Artists
      • MusicBrainz
      • User (Hash)
      • Timestamp
      • Source
  11. Dataset Overview (as of 2014/10/31)
      Name                              Number
      ListeningEvents                   57,963,410
      Tracks distinct                   1,429,627
      Artists distinct                  149,765
      Users distinct                    4,809,337
      Avg. LE per user                  12 (SD=680.13, M=1)
      Avg. LE per track                 40 (SD=606, M=3)
      Avg. LE per artist                388 (SD=3844, M=8)
      Avg. new listeningEvents per day  64,278 (SD=70302, M=15831)
  12. Longtail Distributions
  13. Longtail Distributions
  14. Artists and Sources
      Top-10 Artists: Rihanna, Coldplay, Taylor Swift, Bruno Mars, One Direction, Maroon 5, Adele, Drake, Katy Perry, Eminem
      Top-10 Sources: Securenet Systems Radio Playlist Update, Spotify, Web, Twitter for iPhone, SAM Broadcaster Song Info, Twitter for Android, iOS, BigURL, Twitter for Blackberry, Now Playing
  15. Last.fm Tags & Genres
  16. Comparison to other Datasets
      Name         Type          Entries      #Artists  #Tracks    #Users     Upd.
      #nowplaying  User Streams  57,963,410   149,765   1,429,627  4,809,337
      Celma 1K     User Streams  19,150,819   174,090   1,084,865  992
      Celma 360K   User Streams  17,559,530   292,557   ---        359,349
      MMTD         User Streams  1,086,808    25,060    133,968    15,735
      MSD          Audio         1,000,000    44,745    1,000,000  ---
      MusicMicro   User Streams  594,306      19,529    71,400     136,866
      HetRec       Ratings       92,384       17,632    ---        1,892
      Yahoo!       Ratings       717,872,016  9,441     136,735    1,800,000
  17. Accessing the Dataset
  18. Access to the Dataset
      • dbis-nowplaying.uibk.ac.at
      • HTML View
      • SPARQL Endpoint
      • RDF Dump
      • RDF Online Browser
      • Online SPARQL Query Interface
  19. Access to the Dataset
  20. Conclusion
      • dbis-nowplaying.uibk.ac.at
      • Steadily growing dataset
      • Available freely via API
      • Open problems:
        • Only 30% resolvable against MusicBrainz
        • No rating involved
      • Which further information do you need?
      • Further interfaces?
  21. Interested in working with us? Questions?
      Contact and Social Media
      • @eva_zangerle
      • eva.zangerle@uibk.ac.at
      • http://www.evazangerle.at
      • http://dbis-informatik.uibk.ac.at
      • @dbisibk
      • https://www.facebook.com/dbisibk
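
The cleaning step from the "Cleaning the Dataset" slide (strip the tweet noise, then match what remains against MusicBrainz) could look roughly like the sketch below. It is an illustration only, not the authors' pipeline: REFERENCE is a hypothetical in-memory stand-in for a MusicBrainz lookup, the function names are invented for this example, and exact matching on lowercased "artist - title" strings is an assumed simplification. The input strings are adapted from the slide's examples.

```python
import re

# Hypothetical stand-in for a MusicBrainz lookup: known (artist, title) pairs.
REFERENCE = {("sheryl crow", "if it makes you happy")}

# URLs, hashtags and @mentions are treated as noise (an assumption).
NOISE = re.compile(r"https?://\S+|[#@]\w+")
# Artist and title are assumed to be separated by a spaced dash or en dash.
SEPARATOR = re.compile(r"\s+[–-]\s+")


def normalize(text):
    """Collapse whitespace and lowercase for comparison."""
    return re.sub(r"\s+", " ", text).strip().lower()


def extract_candidate(tweet):
    """Strip noise, then split the rest into an (artist, title) candidate."""
    cleaned = NOISE.sub(" ", tweet).strip()
    parts = SEPARATOR.split(cleaned, maxsplit=1)
    if len(parts) != 2:
        return None
    artist, title = normalize(parts[0]), normalize(parts[1])
    if not artist or not title:
        return None
    return artist, title


def resolve(tweet):
    """Return the candidate only if it is found in the reference set."""
    candidate = extract_candidate(tweet)
    return candidate if candidate in REFERENCE else None


if __name__ == "__main__":
    print(resolve("#nowplaying Sheryl Crow – If it Makes You Happy http://t.co/qNr8zeoQTj"))
    print(resolve("Sheryl Crow – If It Makes You #nowplaying"))
```

With exact matching, the truncated title in the second call does not resolve. Incomplete or noisy tweets of this kind are one plausible reason a sizeable share of tweets stays unmatched.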
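
The averages on the "Dataset Overview" slide can be sanity-checked against the totals on the same slide, and the gap between the averages and the much smaller M values is what a long-tailed distribution would produce. The sketch below assumes that M denotes the median (the slide does not say so). The toy_counts list is synthetic data made up for this illustration, not taken from the dataset.

```python
from statistics import mean, median, pstdev

# Totals copied from the overview slide (as of 2014/10/31).
LISTENING_EVENTS = 57_963_410
DISTINCT = {"user": 4_809_337, "track": 1_429_627, "artist": 149_765}


def events_per(entity):
    """Plain ratio of listening events to distinct users, tracks or artists."""
    return LISTENING_EVENTS / DISTINCT[entity]


# Synthetic per-user counts: most users tweet once, a few tweet very often.
toy_counts = [1] * 95 + [2, 3, 5, 40, 1000]


def summarize(counts):
    """Mean, median and population standard deviation of a list of counts."""
    return {"mean": mean(counts), "median": median(counts), "sd": pstdev(counts)}


if __name__ == "__main__":
    for entity in DISTINCT:
        print(entity, round(events_per(entity), 2))
    print(summarize(toy_counts))
```

The plain ratios land close to the rounded averages on the slide. In the toy data a handful of heavy users pulls the mean far above the median and makes the standard deviation larger than the mean, the same pattern as the slide's per-user figures.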
