An explanation of machine learning for business

MK99 – Big Data
Big data & cross-platform analytics
MOOC lectures Pr. Clement Levallois

A short note on machine learning for business

Machine Learning
• Family of techniques for making predictions from data
• Why is it called machine learning?
– Machine: it is about algorithms running on computers, not equations solved with pen and paper
– Learning: the algorithms start out with little or no accuracy. They become more accurate as they are fed data: the algorithm refines its parameters, it “learns”.
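
To make the idea of “learning” concrete, here is a minimal sketch in Python. The data, the one-parameter model (prediction = w * x) and the learning rate are assumptions chosen for illustration; they are not part of the course material. The parameter w starts at 0 and is nudged after each pass over the data so that the prediction error shrinks.

```python
# Hypothetical toy data: y is roughly 2 * x (values made up for illustration).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]


def mean_squared_error(w):
    """Average squared gap between the predictions w * x and the actual values."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train(steps, learning_rate=0.01):
    """Start from w = 0 and adjust w a little after each pass over the data."""
    w = 0.0
    errors = [mean_squared_error(w)]
    for _ in range(steps):
        # Slope of the error with respect to w: tells us which way to move w.
        gradient = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= learning_rate * gradient
        errors.append(mean_squared_error(w))
    return w, errors


w, errors = train(steps=200)
print(f"learned w = {w:.3f}")
print(f"error before training = {errors[0]:.3f}, after training = {errors[-1]:.3f}")
```

The error recorded before the first pass is much larger than the error after the last one: this drop is what “learning” refers to here.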

Typical setup
1. We start with a training set
-> Data already collected: we know the actual values to be found
-> Ex: a list of consumers, their characteristics and their associated credit scores
2. The algorithms are trained on this set
-> A series of algorithms run on the training set. Their parameters are adjusted step by step so that their predictions get as close as possible to the actual values.
3. A test set (“fresh data”) is brought in
-> A list of consumer characteristics. Their credit scores are known but hidden from the algorithms.
4. The trained algorithms are run on the test set
-> Predict the credit score of each consumer in the test set, using the algorithms trained in step 2
5. A measure of accuracy is computed
-> Given the actual values in the test set, how accurate were the algorithms?
-> Were the credit scores accurately predicted?
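
The five steps above can be sketched in a few lines of Python. Everything specific here is an assumption made for illustration: the consumers and their scores are invented, a single characteristic (income) stands in for “their characteristics”, the model is a straight line fitted by least squares, and accuracy is measured as the mean absolute error.

```python
# Step 1: training set. Hypothetical consumers: (income in thousands, credit score).
training_set = [(20, 520), (35, 610), (50, 680), (65, 740), (80, 790)]


# Step 2: train. Here: fit score = slope * income + intercept by least squares.
def fit_line(rows):
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in rows)
             / sum((x - mean_x) ** 2 for x, _ in rows))
    return slope, mean_y - slope * mean_x


slope, intercept = fit_line(training_set)

# Step 3: test set. The scores are known but kept aside, hidden from the model.
test_incomes = [30, 55, 70]
hidden_scores = [590, 700, 760]

# Step 4: run the trained model on the test set.
predictions = [slope * income + intercept for income in test_incomes]

# Step 5: measure accuracy by comparing the predictions with the hidden scores.
mean_absolute_error = (sum(abs(p - a) for p, a in zip(predictions, hidden_scores))
                       / len(hidden_scores))

for income, predicted, actual in zip(test_incomes, predictions, hidden_scores):
    print(f"income {income}: predicted {predicted:.0f}, actual {actual}")
print(f"mean absolute error on the test set: {mean_absolute_error:.1f}")
```

The hidden scores are only used in step 5. If they were used earlier, the accuracy measure would no longer say anything about fresh data.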

Vocabulary
• Data scientists “train” their model and then test it
• They are concerned with “out-of-sample” prediction
– That a model accurately predicts data points in the training set (the “sample”) is trivial: it has already seen the actual values
– It is the accuracy on the test set that matters!
– Predicting data points outside the training set is called “out-of-sample” prediction
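
A small Python sketch shows why accuracy on the training set proves little. The data and both models are hypothetical: a “memorizer” that stores every training example, and a simple rule that learns a single cut-off (this rule assumes the two groups do not overlap in the training set). Both are perfect in-sample; only the test set tells them apart.

```python
# Hypothetical data: (hours of product use per week, did the customer renew?)
training_set = [(1, False), (2, False), (3, False), (6, True), (7, True), (9, True)]
test_set = [(0, False), (4, False), (5, True), (8, True)]


def train_memorizer(rows):
    """A model that simply stores the training set and looks answers up."""
    table = dict(rows)
    # Inputs never seen during training get a default guess of False.
    return lambda hours: table.get(hours, False)


def train_cutoff(rows):
    """A model that learns one cut-off, halfway between the two groups."""
    highest_no = max(hours for hours, renewed in rows if not renewed)
    lowest_yes = min(hours for hours, renewed in rows if renewed)
    cutoff = (highest_no + lowest_yes) / 2
    return lambda hours: hours > cutoff


def accuracy(model, rows):
    """Share of rows for which the model's prediction matches the actual value."""
    return sum(model(hours) == renewed for hours, renewed in rows) / len(rows)


memorizer = train_memorizer(training_set)
cutoff_model = train_cutoff(training_set)

for name, model in [("memorizer", memorizer), ("cut-off", cutoff_model)]:
    print(f"{name}: in-sample {accuracy(model, training_set):.2f}, "
          f"out-of-sample {accuracy(model, test_set):.2f}")
```

The memorizer is a caricature of over-fitting: it reproduces the training set exactly but has learned nothing it can apply to new customers.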

Why is machine learning (ML) so different from statistics?
• ML does not focus on causality, just prediction!
– Note: for this reason, a purely predictive ML model cannot tell what an intervention would change: it has no causal model.
• ML has a special concern for out-of-sample prediction
– It is especially careful about over-fitting (a model that fits the training set closely but predicts poorly on new data)
• ML picks its algorithms from different academic disciplines
– Text analysis, network relations, clustering, not just traditional statistics
• Coming from computer science, ML has affinities with big data
– Procedures optimized for speed and scale
But the best data scientists often started as statisticians / econometricians:
See Hal Varian, Chief Economist at Google

• Kaggle is a website hosting ML competitions; anybody can join
• Goal: make the best prediction on a dataset, with cash prizes for the winners
• Competitions range from predicting clicks on ads to predicting epileptic seizures
• Always the same setup: a training set, a test set, and a score based on accuracy

This slide presentation is part of a course offered by EMLYON Business School (www.em-lyon.com)
Contact Clement Levallois (levallois [at] em-lyon.com) for more information.
