The task of our diploma project was the creation of a system capable of predicting the future development of a musician's popularity. To achieve this goal, we used different machine learning technologies, namely linear regression, support vector machines and neural networks, to process the vast amount of data needed to successfully generate accurate predictions. Our end goal was the analysis and visualisation of the accomplished results, as illustrated in this presentation.
#scichallenge2017
1. Predicting the future of music
Machine Learning
Data Analysis
Music
Fabian Jetzinger, Florian Huemer
2.
Content
›› Team
›› Big Data
›› Machine Learning
›› Project goals and solution
›› Project realization
›› Evaluation
3.
›› Project Team
•• Florian Huemer - HTBLA Grieskirchen
•• Fabian Jetzinger - HTBLA Grieskirchen
›› Tutor
•• Dipl.-Inf. Torsten Welsch - HTBLA Grieskirchen
›› Cooperation partner
•• Assoc. Univ.-Prof. Dr. Markus Schedl
•• Johannes Kepler University Linz - Department of Computational Perception
Team
4.
›› Large amounts of data are being collected
›› Everything that is done on the internet is recorded
›› Raw data is virtually useless without analysis and processing
›› Manual analysis is impossible or too expensive
›› Conventional methods of data analysis are not suitable
Big data - situation
5.
›› Machine Learning algorithms
•• Analyse data more efficiently
•• Recognise reoccurring patterns automatically
•• Allow prediction of future developments
›› Already widely in use
•• Prediction of stock prices
•• Improvement of medical diagnoses
•• Self-driving cars
•• Etc.
›› Could lead to human-like artificial intelligence in the future
Big data - solution
6.
›› Supervised Learning (used in this project)
•• Input and output variables are known for the training data
•• Output can be predicted for any newly given input
›› Unsupervised learning
•• Only input is used for training
•• Algorithm looks for patterns or clusters
›› Reinforcement learning
•• Algorithm can perform several actions
•• Behaviour is rated by a function to determine a score
•• Algorithm changes its behaviour to maximise the score
Methods of Machine Learning
7.
›› Develop a system that predicts a musician's popularity
›› Based on data collected from last.fm
›› Create predictions using different Machine Learning technologies
›› Continually improve predictions based on analysis
›› Compare the outcome of implemented technologies
›› Analyse and visualise results
Project goals
8.
›› LFM-1b dataset provided by Markus Schedl
•• http://www.cp.jku.at/datasets/LFM-1b/
›› Creation of a Python application
•• Libraries scikit-learn and matplotlib
›› Machine Learning algorithms
•• Linear Regression
•• Support Vector Machines
•• Neural Networks
Project solution
9.
›› Listening Event
•• One user
•• Listened to one song
•• By one musician
•• At a specific time
›› Popularity of a musician
•• The number of Listening Events a musician accumulates per day
Definitions
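The popularity definition above can be sketched in a few lines of Python. This is only an illustration: the tuple layout `(user_id, artist, track, unix_timestamp)`, the function name `daily_popularity` and the sample events are assumptions made for this example, not the structure of the LFM-1b dataset or of the project's database.

```python
from collections import Counter
from datetime import datetime, timezone

def daily_popularity(events, artist):
    """Count listening events per UTC day for one artist.

    `events` is an iterable of (user_id, artist, track, unix_timestamp)
    tuples; this layout is a hypothetical stand-in for the real data.
    """
    counts = Counter()
    for _user, event_artist, _track, timestamp in events:
        if event_artist == artist:
            day = datetime.fromtimestamp(timestamp, tz=timezone.utc).date()
            counts[day.isoformat()] += 1
    # ISO dates sort chronologically, so sorting the keys orders the days.
    return dict(sorted(counts.items()))

# Made-up listening events: one user, one song, one musician, one time each.
events = [
    (1, "The Beatles", "Help!", 1293840000),        # 2011-01-01 00:00 UTC
    (2, "The Beatles", "Yesterday", 1293843600),    # 2011-01-01 01:00 UTC
    (1, "Adele", "Someone Like You", 1293847200),   # 2011-01-01 02:00 UTC
    (3, "The Beatles", "Help!", 1293926400),        # 2011-01-02 00:00 UTC
]
popularity = daily_popularity(events, "The Beatles")
print(popularity)
```

The resulting day-to-count mapping is the kind of time series the three learning methods would be trained on.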
10.
›› Aggregation of data and transfer into custom database structure
›› Development of Machine Learning components
•• Linear Regression
•• Support Vector Machines (Epsilon-Support Vector Regression)
•• Neural Networks
›› Analysis and visualisation of results
›› Creation of a written diploma thesis
Project realization
11.
›› Fairly simple technique based on the method of least squares
›› First step towards creating accurate predictions
›› Prediction of The Beatles' popularity
•• Red – actual data
•• Blue – prediction
Linear Regression
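As a rough illustration of the method of least squares, the sketch below fits a straight line to daily play counts and extrapolates it. The numbers are invented, and the helper `fit_line` is a hypothetical name; the project itself used scikit-learn, whose linear regression performs the same kind of fit.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x); the 1/n factors cancel.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

days = [0, 1, 2, 3, 4]              # day index
plays = [100, 110, 120, 130, 140]   # made-up listening events per day
slope, intercept = fit_line(days, plays)

# Extrapolate the fitted line to a future day.
forecast_day_7 = slope * 7 + intercept
print(slope, intercept, forecast_day_7)
```

A straight line can only capture a steady upward or downward trend, which is why it serves as a first step rather than a final model.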
12.
›› Usually used for classification (Support Vector Classification)
›› Different kernels and parameters
•• Fine-tuning to optimize results
›› Kernels are based on mathematical functions
•• Linear function
•• Polynomial function
•• Radial basis function
•• Sigmoid function
Support Vector Machines
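The four kernel types listed above can be written out as plain functions. This sketch follows the parameterisation scikit-learn uses for its kernels (`gamma`, `coef0`, `degree`); the default values chosen here are arbitrary, and the function names are only for illustration.

```python
import math

def dot(a, b):
    """Inner product of two equally long vectors."""
    return sum(x * y for x, y in zip(a, b))

def linear_kernel(a, b):
    return dot(a, b)

def polynomial_kernel(a, b, degree=3, gamma=1.0, coef0=0.0):
    return (gamma * dot(a, b) + coef0) ** degree

def rbf_kernel(a, b, gamma=1.0):
    # Depends only on the squared distance between the two points.
    squared_distance = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-gamma * squared_distance)

def sigmoid_kernel(a, b, gamma=1.0, coef0=0.0):
    return math.tanh(gamma * dot(a, b) + coef0)

a, b = [1.0, 2.0], [3.0, 4.0]
for kernel in (linear_kernel, polynomial_kernel, rbf_kernel, sigmoid_kernel):
    print(kernel.__name__, kernel(a, b))
```

Each kernel measures the similarity of two data points in a different way, which is why the choice of kernel and its parameters has to be tuned per task.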
13.
›› Adaptation of Support Vector Machine to handle regression tasks
›› Parameter epsilon determines the acceptable margin of error
›› Parameter C determines the penalty for deviations
›› Parameter gamma controls the influence of individual data points
Epsilon-Support Vector Regression
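The roles of the three parameters can be made concrete with a small sketch. It shows only the error term of epsilon-SVR (the epsilon-insensitive loss scaled by C) and the distance weighting of an RBF kernel, not a full training procedure; the function names and numbers are assumptions for illustration. In scikit-learn, the corresponding settings are the `epsilon`, `C` and `gamma` arguments of `sklearn.svm.SVR`.

```python
import math

def epsilon_insensitive_loss(actual, predicted, epsilon):
    """Errors up to epsilon cost nothing; larger errors cost the excess."""
    return max(0.0, abs(actual - predicted) - epsilon)

def total_penalty(actuals, predictions, epsilon, C):
    """C scales how strongly deviations outside the margin are penalised."""
    return C * sum(
        epsilon_insensitive_loss(a, p, epsilon)
        for a, p in zip(actuals, predictions)
    )

def rbf_influence(distance, gamma):
    """Weight of a training point at a given distance under an RBF kernel."""
    return math.exp(-gamma * distance ** 2)

actuals = [100.0, 120.0, 90.0]        # made-up daily play counts
predictions = [104.0, 110.0, 90.0]    # made-up model output
penalty = total_penalty(actuals, predictions, epsilon=5.0, C=2.0)
print(penalty)

# A small gamma lets a data point influence distant inputs;
# a large gamma restricts its influence to its close neighbourhood.
print(rbf_influence(1.0, gamma=0.1), rbf_influence(1.0, gamma=10.0))
```

Here only the second prediction misses by more than epsilon, so it alone contributes to the penalty.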
15.
›› Popular in various areas of data science
›› Basic idea: mimic the human brain
›› Network of several interconnected layers
›› Each layer has numerous nodes (“neurons”)
Neural Networks
16.
›› Nodes process input based on weight and bias values
›› Result is passed on to next nodes
›› Solver function automatically adjusts weight and bias of nodes
›› Nodes are based on mathematical functions
›› Different function types
•• Linear function
•• Logistic function
•• Hyperbolic tangent function (tanh)
•• Rectified linear function (ReLU)
Neural Networks - mechanics
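The mechanics described above can be sketched as a forward pass through a tiny network. Each node computes a weighted sum of its inputs plus a bias and applies one of the four function types. The weights, biases and names below are made up for illustration; the solver that adjusts weights and biases during training is not shown.

```python
import math

# The four node function types, keyed by hypothetical short names.
ACTIVATIONS = {
    "linear": lambda z: z,
    "logistic": lambda z: 1.0 / (1.0 + math.exp(-z)),
    "tanh": math.tanh,
    "relu": lambda z: max(0.0, z),
}

def neuron(inputs, weights, bias, activation):
    """Weighted sum of the inputs plus bias, passed through the activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return ACTIVATIONS[activation](z)

def layer(inputs, weight_rows, biases, activation):
    """Apply one neuron per row of weights; the outputs feed the next layer."""
    return [
        neuron(inputs, weights, bias, activation)
        for weights, bias in zip(weight_rows, biases)
    ]

inputs = [1.0, 2.0]
hidden = layer(inputs, [[0.5, -0.25], [1.0, 1.0]], [0.0, -1.0], "relu")
output = neuron(hidden, [1.0, 0.5], 0.25, "linear")
print(hidden, output)
```

Swapping the activation name changes how each node transforms its weighted sum, which is the choice compared in the results that follow.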
17.
›› Linear function produces best results
›› Example: prediction of Adele's popularity in 2011
Neural Networks - results
18.
›› Amount of available data limits accuracy of predictions
›› Accuracy drops severely for predictions over longer timespans
›› All three methods generate fairly plausible results
›› Possible further improvements
•• Combine data from several different sources (last.fm, Spotify, YouTube, ...)
•• Observe social media presence (e.g. recent Twitter posts using #bandname)
•• Take new song releases into account
Evaluation
19.
If you have any questions, feel free to contact us at
fjetzingerSCI@gmx.at
Thank you for your support!