Predicting the future of music
Machine Learning
Data Analysis
Music
Fabian Jetzinger Florian Huemer
Content
›› Team
›› Big Data
›› Machine Learning
›› Project goals and solution
›› Project realization
›› Evaluation
›› Project Team
•• Florian Huemer - HTBLA Grieskirchen
•• Fabian Jetzinger - HTBLA Grieskirchen
›› Tutor
•• Dipl.-Inf. Torsten Welsch - HTBLA Grieskirchen
›› Cooperation partner
•• Assoc. Univ.-Prof. Dr. Markus Schedl
•• Johannes Kepler University Linz - Department of Computational Perception
Team
›› Large amounts of data are being collected
›› Much of what is done on the internet is recorded
›› Raw data is virtually useless without analysis and processing
›› Manual analysis is impossible or too expensive
›› Conventional methods of data analysis are not suitable
Big data - situation
›› Machine Learning algorithms
•• Analyse data more efficiently
•• Recognise recurring patterns automatically
•• Allow prediction of future developments
›› Already widely in use
•• Prediction of stock prices
•• Improvement of medical diagnoses
•• Self-driving cars
•• Etc.
›› Could lead to human-like artificial intelligence in the future
Big data - solution
›› Supervised Learning (used in this project)
•• Input and output variables are known for training data
•• Output can be predicted for any newly given input
›› Unsupervised learning
•• Only input is used for training
•• Algorithm looks for patterns or clusters
›› Reinforcement learning
•• Algorithm can perform several actions
•• Behaviour is rated by a function to determine a score
•• Algorithm changes its behaviour to maximise score
Methods of Machine Learning
›› Develop a system that predicts a musician's popularity
›› Based on data collected from last.fm
›› Create predictions using different Machine Learning technologies
›› Continually improve predictions based on analysis
›› Compare the outcome of implemented technologies
›› Analyse and visualise results
Project goals
›› LFM-1b dataset provided by Markus Schedl
•• http://www.cp.jku.at/datasets/LFM-1b/
›› Creation of a Python application
•• Libraries scikit-learn and matplotlib
›› Machine Learning algorithms
•• Linear Regression
•• Support Vector Machines
•• Neural Networks
Project solution
›› Listening Event
•• One user
•• Listened to one song
•• By one musician
•• At a specific time
›› Popularity of a musician
•• The number of Listening Events a musician accumulates per day
Definitions
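The two definitions above can be sketched in a few lines of Python. This is only an illustration: the tuple layout, the sample events, and the function name `daily_popularity` are assumptions, not the LFM-1b file format or the project's database structure.

```python
from collections import Counter
from datetime import datetime, timezone

# Hypothetical listening events: (user_id, track, artist, unix_timestamp).
# The values are made up for illustration.
events = [
    (1, "Hey Jude", "The Beatles", 1293840000),    # 2011-01-01 00:00 UTC
    (2, "Let It Be", "The Beatles", 1293883200),   # 2011-01-01 12:00 UTC
    (1, "Someone Like You", "Adele", 1293886800),  # 2011-01-01 13:00 UTC
    (3, "Hey Jude", "The Beatles", 1293926400),    # 2011-01-02 00:00 UTC
]


def daily_popularity(events):
    """Count listening events per (artist, UTC calendar day)."""
    counts = Counter()
    for _user, _track, artist, timestamp in events:
        day = datetime.fromtimestamp(timestamp, tz=timezone.utc).date()
        counts[(artist, day)] += 1
    return counts


popularity = daily_popularity(events)
for (artist, day), count in sorted(popularity.items()):
    print(artist, day, count)
```

Grouping by UTC day is one possible choice; a real pipeline would have to decide which time zone defines a "day".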
›› Aggregation of data and transfer into custom database structure
›› Development of Machine Learning components
•• Linear Regression
•• Support Vector Machines (Epsilon-Support Vector Regression)
•• Neural Networks
›› Analysis and visualisation of results
›› Creation of a written diploma thesis
Project realization
›› Fairly simple technique based on the method of least squares
›› First step towards creating accurate predictions
›› Prediction of The Beatles' popularity
•• Red – actual data
•• Blue – prediction
Linear Regression
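As a rough illustration of the method of least squares, the sketch below fits a straight line to a short popularity series and extrapolates one day ahead. The data and the helper `fit_line` are made up for this example; the project itself used scikit-learn, whose `LinearRegression` estimator solves the same problem.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Closed-form solution that minimises the sum of squared residuals.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: day index and listening events per day.
days = [0, 1, 2, 3, 4]
plays = [100, 120, 140, 160, 180]

slope, intercept = fit_line(days, plays)
forecast = slope * 5 + intercept  # extrapolate to day 5
print(slope, intercept, forecast)
```

A straight line can only capture a steady trend, which is why it serves here as a first step rather than a final model.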
›› Usually used for classification (Support Vector Classification)
›› Different kernels and parameters
•• Fine-tuning to optimize results
›› Kernels are based on mathematical functions
•• Linear function
•• Polynomial function
•• Radial basis function
•• Sigmoid function
Support Vector Machines
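The four kernel types listed above can be written as plain functions of two feature vectors. The sketch below follows the parameterisation used in scikit-learn's documentation (`gamma`, `coef0`, `degree`); the default values chosen here are arbitrary assumptions for illustration.

```python
import math


def linear_kernel(x, y):
    """Dot product of the two vectors."""
    return sum(a * b for a, b in zip(x, y))


def polynomial_kernel(x, y, degree=2, gamma=1.0, coef0=0.0):
    """(gamma * <x, y> + coef0) ** degree."""
    return (gamma * linear_kernel(x, y) + coef0) ** degree


def rbf_kernel(x, y, gamma=1.0):
    """exp(-gamma * ||x - y||^2): close points give values near 1."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def sigmoid_kernel(x, y, gamma=1.0, coef0=0.0):
    """tanh(gamma * <x, y> + coef0)."""
    return math.tanh(gamma * linear_kernel(x, y) + coef0)


a, b = [1.0, 2.0], [3.0, 4.0]
print(linear_kernel(a, b), polynomial_kernel(a, b), rbf_kernel(a, b), sigmoid_kernel(a, b))
```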
›› Adaptation of Support Vector Machine to handle regression tasks
›› Parameter Epsilon determines acceptable margin of error
›› Parameter C determines penalty for deviations
›› Parameter gamma controls influence of individual data points
Epsilon-Support Vector Regression
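A minimal sketch of how these three parameters appear in scikit-learn's `SVR` estimator. The synthetic series and the parameter values are assumptions chosen only so the example runs; they are not the settings used in the project.

```python
import numpy as np
from sklearn.svm import SVR

# Synthetic stand-in for a popularity series: day index -> listening events.
days = np.arange(30, dtype=float).reshape(-1, 1)
plays = 100.0 + 5.0 * days.ravel() + 10.0 * np.sin(days.ravel() / 3.0)

model = SVR(
    kernel="rbf",
    C=100.0,      # penalty for deviations larger than epsilon
    epsilon=1.0,  # errors within this margin are not penalised
    gamma=0.01,   # how far the influence of a single data point reaches
)
model.fit(days, plays)
predicted = model.predict(days)
print(predicted[:5])
```

In practice these values are usually tuned, for example with a grid search over `C`, `epsilon`, and `gamma`.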
›› Radial basis function (RBF) kernel
›› Polynomial kernel (2nd degree)
Support Vector Regression - results
›› Popular in various areas of data science
›› Basic idea: mimic the human brain
›› Network of several interconnected layers
›› Each layer has numerous nodes (“neurons”)
Neural Networks
›› Nodes process input based on weight and bias values
›› Result is passed on to next nodes
›› Solver function automatically adjusts weight and bias of nodes
›› Nodes are based on mathematical functions
›› Different function types
•• Linear (identity) function
•• Logistic function
•• Hyperbolic tangent function
•• Rectifier function (ReLU)
Neural Networks - mechanics
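The mechanics above can be sketched as a forward pass through a tiny network: each node computes a weighted sum of its inputs plus a bias and applies an activation function, and the result is passed on to the next layer. The weights, biases, and helper names below are made up for illustration; the activation names follow those used by scikit-learn's multi-layer perceptron (`identity`, `logistic`, `tanh`, `relu`). Training, that is the solver adjusting weights and biases, is not shown.

```python
import math

# The four function types from the slide, keyed by their scikit-learn names.
ACTIVATIONS = {
    "identity": lambda z: z,                           # linear
    "logistic": lambda z: 1.0 / (1.0 + math.exp(-z)),  # logistic sigmoid
    "tanh": math.tanh,                                 # hyperbolic tangent
    "relu": lambda z: max(0.0, z),                     # rectifier
}


def node_output(inputs, weights, bias, activation="identity"):
    """Weighted sum of the inputs plus bias, passed through the activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return ACTIVATIONS[activation](z)


def layer_output(inputs, weight_rows, biases, activation="identity"):
    """Evaluate every node of a layer on the same inputs."""
    return [
        node_output(inputs, weights, bias, activation)
        for weights, bias in zip(weight_rows, biases)
    ]


# Hypothetical network: 2 inputs -> 2 hidden nodes (relu) -> 1 output node (identity).
hidden = layer_output([1.0, 2.0], [[0.5, -1.0], [1.0, 1.0]], [0.0, -1.0], "relu")
output = node_output(hidden, [2.0, 3.0], 0.5, "identity")
print(hidden, output)
```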
›› The linear activation function produces the best results
›› Example: prediction of Adele's popularity in 2011
Neural Networks - results
›› Amount of available data limits accuracy of predictions
›› Accuracy drops severely for predictions over longer timespans
›› All three methods generate fairly plausible results
›› Possible further improvements
•• Combine data from several different sources (last.fm, Spotify, YouTube, ...)
•• Observe social media presence (e.g. recent Twitter posts using #bandname)
•• Take new song releases into account
Evaluation
If you have any questions, feel free to contact us at
fjetzingerSCI@gmx.at
Thank you for your support!
Predicting the future of music #scichallenge2017
