1st edition | July 8-11, 2019
BigML, Inc #DutchMLSchool 2
ML: Business Perspective
A Gentle Introduction to Machine Learning
Charles Parker
VP, Machine Learning Algorithms
In This Talk
• A simple introduction to supervised machine learning
• An introduction to some of the core concepts of the BigML
platform
• A tiny peek behind the curtain to see what really happens when
ML algorithms learn a model
• Ways to evaluate and interpret your model’s predictions
A Churn Problem
• You are the CEO of a mobile phone
company (congratulations!)
• Some percentage of your customers
leave the service (or “churn”) every
month
• You have a budget to reach out to
some customers each month to try to
persuade them to stay with the service
(with, for example, incentives)
• But to do that, you need to find out
who those customers are
Begin with the End In Mind
• Currently, you have a simple targeting strategy designed by
hand that identifies the 10,000 customers most likely to churn
• For every five people you call, two are actually thinking about
leaving (4,000 of the 10,000, for a precision of 40%)
• Of these customers, your operators can convince half to stay (so
about 2,000)
• Each of these saved customers has a net value of $500
• What if you could increase the precision of your targeting to
50%?
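The arithmetic behind that question can be sketched in a few lines. This is only an illustration using the figures on this slide; the function name and the use of whole-number percentages are assumptions made for the example.

```python
def campaign_value(calls, precision_pct, save_pct, value_per_save):
    """Net value of one outreach campaign (hypothetical helper).

    precision_pct: percent of called customers who were really about to churn
    save_pct: percent of those customers the operators convince to stay
    """
    churners_reached = calls * precision_pct // 100
    customers_saved = churners_reached * save_pct // 100
    return customers_saved * value_per_save


current = campaign_value(10_000, 40, 50, 500)   # hand-designed targeting
improved = campaign_value(10_000, 50, 50, 500)  # targeting at 50% precision

print("current value: ", current)
print("improved value:", improved)
print("difference:    ", improved - current)
```

Under these numbers, raising precision from 40% to 50% means 500 more saved customers per campaign, which is the quantity a better targeting model would have to deliver.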
You Have The Data!
Minutes Used   Last Month’s Bill   Calls To Support   Website Visits   Churn?
104            103,60              0                  0                No
124            56,33               1                  0                No
56             214,60              2                  0                Yes
2410           305,60              0                  5                No
536            145,70              0                  0                No
234            122,09              0                  1                No
201            185,76              1                  7                Yes
111            83,60               3                  2                No
Now . . . Magic!
• Can we use this data to create a
better targeting strategy?
(Spoiler: Yes!)
• Can we use the very same data
to measure the effectiveness of
that strategy? (Spoiler: Yes!)
• And how do we do that? (Spoiler:
MACHINE LEARNING)
Aside: BigML Resources
• We can now upload that data to BigML
• Everything created on BigML is a resource
• Resources are:
  • Mostly immutable: You can’t “screw them up”
  • Assigned a unique ID
  • Always available via both the API and the UI
• Working with BigML is a process of creating resources
Data Sources @ BigML
• A data source is a raw data file that you upload to the BigML
platform
• We make some initial guesses about the number and type of
columns in the file, and a bit about their content (such as the
language for text fields)
• Data can come from uploaded CSVs, Google Drive, Dropbox, an
arbitrary URL, and so on
Datasets @ BigML
• A BigML dataset represents processed row-column data
• We’ve made a final determination of the number and type of
columns in the source
• Some summary stats have been calculated for each column
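As a rough local illustration of the step from a raw source to a dataset, the sketch below guesses a type for each column of a small CSV and computes a few summary statistics. It is not BigML's implementation or API; the column names, the helper functions, and the simple "parses as a number" rule are assumptions, and the decimal commas of the slide's table are written as decimal points.

```python
import csv
import io
import statistics

# The churn table from the slides, as raw CSV text (column names are assumed).
RAW_CSV = """minutes_used,last_bill,support_calls,website_visits,churn
104,103.60,0,0,No
124,56.33,1,0,No
56,214.60,2,0,Yes
2410,305.60,0,5,No
536,145.70,0,0,No
234,122.09,0,1,No
201,185.76,1,7,Yes
111,83.60,3,2,No
"""


def guess_type(values):
    """Guess 'numeric' if every value parses as a float, else 'categorical'."""
    try:
        for value in values:
            float(value)
        return "numeric"
    except ValueError:
        return "categorical"


def summarize(raw_csv):
    """Return a per-column type guess plus simple summary statistics."""
    rows = list(csv.DictReader(io.StringIO(raw_csv)))
    summary = {}
    for name in rows[0]:
        values = [row[name] for row in rows]
        kind = guess_type(values)
        if kind == "numeric":
            numbers = [float(v) for v in values]
            stats = {
                "min": min(numbers),
                "max": max(numbers),
                "mean": statistics.mean(numbers),
            }
        else:
            stats = {"counts": {v: values.count(v) for v in set(values)}}
        summary[name] = {"type": kind, **stats}
    return summary


summary = summarize(RAW_CSV)
for name, info in summary.items():
    print(name, info)
```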
Supervised Machine Learning
• Collect training data from the past about
your prediction problem, including the
right answer (e.g., statistics for each
customer for each month, and whether or
not the customer churned at the end of
that month)
• Feed that data to a machine learning
algorithm
• The algorithm creates a program (that
we typically call a model, or classifier or
predictor) which can make that
prediction for you on future data
Traditional: Expert and Programmer
• Machine learning breaks the expert system
paradigm
• To make an expert software system before
machine learning, you used an expert and
a programmer
• The expert’s job was to know how the system
should work and be able to communicate that
knowledge
• The programmer’s job was to convert the expert’s
knowledge into a running computer program
• These could be the same person, but you
needed both roles
Now: Data and Algorithm
• Instead of an expert we have data
  • Data can be easier to get (and is in some cases already there)
  • You can get a volume of data much larger than any expert
    could possibly see
  • Humans are notoriously bad at being good at things:
    https://www.newscientist.com/article/mg21628930-400-specialist-knowledge-is-useless-and-unhelpful
• Instead of a programmer we have a learning algorithm
  • Once you have the data in the proper format, learning
    algorithms work much faster (enabling iteration)
  • Learning algorithms are modular
The Goal: A Program that Predicts
• The goal of learning is to take this sort of training data and
create a program (a model or classifier or predictor)
• This model takes as input a single row with a value for each of
the columns given in the training data
• The model will output its predicted value for the objective based
on the given column values
• Importantly, this row can contain any values for the given
columns, not just the ones seen in the training data
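To make "a program that predicts" concrete, here is a minimal sketch of what such a model could look like once learned. The rule itself (predict churn when last month's bill is above 180) is a hand-written stand-in chosen for illustration, not a model produced by any particular algorithm, and the field names are assumptions.

```python
def predict_churn(row):
    """Hypothetical learned model: a single threshold on a single column.

    `row` is a dict with one value per training column; the output is the
    predicted value of the objective ("Churn?").
    """
    return "Yes" if row["last_bill"] > 180 else "No"


# The input may hold values never seen in the training data.
new_customer = {
    "minutes_used": 300,
    "last_bill": 199.99,
    "support_calls": 4,
    "website_visits": 1,
}
print(predict_churn(new_customer))
```

The point of the sketch is the interface: one row in, one predicted objective value out, for any combination of column values.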
Just a Little Peek Under the Curtain
Behind The Scenes
• A learning algorithm is:
  • A space of models that can be learned (a hypothesis space)
  • A clever way of searching through that space to find a “good”
    model
• A good model is one that, for example, makes accurate
predictions on the training data
• So “machine learning” is finding a model amongst all possible
models that has a good “fit” with the training data
A Simple Hypothesis Space
• Suppose we tell our machine to split the data into two parts
based on some threshold of some feature
• If a data point is on one side of the threshold, we’ll predict the
majority class of all the training points on that side
• We can measure how many points in the training data would be
correctly predicted using this method
• This is how good our “fit” is to the training data
• The best threshold is the one with the best fit (and we will try
them all)
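The search described above can be sketched as an exhaustive loop over every feature and every observed value as a threshold. This is a minimal illustration on the eight-row churn table, assuming splits of the form "feature > threshold" and decimal points in place of the table's decimal commas; the names are made up for the example.

```python
from collections import Counter

FEATURES = ["minutes_used", "last_bill", "support_calls", "website_visits"]

# Churn table from the slides; the last element of each row is the label.
ROWS = [
    (104, 103.60, 0, 0, "No"),
    (124, 56.33, 1, 0, "No"),
    (56, 214.60, 2, 0, "Yes"),
    (2410, 305.60, 0, 5, "No"),
    (536, 145.70, 0, 0, "No"),
    (234, 122.09, 0, 1, "No"),
    (201, 185.76, 1, 7, "Yes"),
    (111, 83.60, 3, 2, "No"),
]


def count_correct(rows, feature_index, threshold):
    """Training points predicted correctly when each side of the split
    predicts the majority class of the training points on that side."""
    above = [r[-1] for r in rows if r[feature_index] > threshold]
    below = [r[-1] for r in rows if r[feature_index] <= threshold]
    correct = 0
    for side in (above, below):
        if side:
            correct += Counter(side).most_common(1)[0][1]
    return correct


def best_split(rows):
    """Try every observed value of every feature as a threshold."""
    best = None
    for index, name in enumerate(FEATURES):
        for threshold in sorted({r[index] for r in rows}):
            score = count_correct(rows, index, threshold)
            if best is None or score > best[0]:
                best = (score, name, threshold)
    return best


score, feature, threshold = best_split(ROWS)
print(f"best fit: {feature} > {threshold} gets {score} of {len(ROWS)} correct")
```

On a table this small several splits tie for the best fit, and the sketch simply reports the first one it encounters. A split that isolates a single training point can score well here, which is one reason fit on the training data alone is not the whole story.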
Three Candidate Splits
• Minutes Used > 200
• Website Visits > 0
• Last Bill > $180
So Far, So Good!
• This is basically what machine
learning algorithms do
• Try a solution and see how well it fits
the training data
• If “not well”, take some steps to
“improve” it
• There are many, many different
ways of doing it, but this is
usually what it boils down to
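"Try a solution and see how well it fits" can be illustrated by scoring the three splits named earlier in the deck (Minutes Used > 200, Website Visits > 0, Last Bill > $180) against the churn table. The sketch below is only an illustration; the column names and the helper are assumptions, and the table's decimal commas are written as decimal points.

```python
from collections import Counter

COLUMNS = {"minutes_used": 0, "last_bill": 1, "support_calls": 2, "website_visits": 3}

# Churn table from the slides; the last element of each row is the label.
ROWS = [
    (104, 103.60, 0, 0, "No"),
    (124, 56.33, 1, 0, "No"),
    (56, 214.60, 2, 0, "Yes"),
    (2410, 305.60, 0, 5, "No"),
    (536, 145.70, 0, 0, "No"),
    (234, 122.09, 0, 1, "No"),
    (201, 185.76, 1, 7, "Yes"),
    (111, 83.60, 3, 2, "No"),
]


def evaluate_split(rows, column, threshold):
    """Predict the majority class on each side of `column > threshold`
    and count how many training points that gets right."""
    index = COLUMNS[column]
    sides = {True: [], False: []}
    for row in rows:
        sides[row[index] > threshold].append(row[-1])
    result = {}
    correct = 0
    for is_above, labels in sides.items():
        if labels:
            label, count = Counter(labels).most_common(1)[0]
            result["above" if is_above else "below"] = label
            correct += count
    result["correct"] = correct
    return result


for column, threshold in [("minutes_used", 200), ("website_visits", 0), ("last_bill", 180)]:
    print(column, ">", threshold, evaluate_split(ROWS, column, threshold))
```

Of these three candidates, the split on last month's bill fits the training table best (7 of 8 rows), while the other two do no better than always predicting "No" (6 of 8).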
Evaluating and Improving
Now What?
• The next step is to test the model on data it has not seen during training
  • Split the data into training and test sets (machine learning is very good at
    memorizing the data, so a score on the training data alone is overly optimistic)
  • Train a model on the training set
  • Evaluate it using the test set (or, the “held out data”)
• We’ll get to the evaluation tool more fully later on
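Why a held-out test set matters can be shown with a deliberately extreme "model" that only memorizes. The sketch below is an illustration, not the platform's evaluation tool: the memorizer, the fixed split (first six rows for training, last two for testing, chosen only to keep the example reproducible), and the names are all assumptions. In practice the split is usually random.

```python
from collections import Counter

# Churn table from the slides; the last element of each row is the label.
ROWS = [
    (104, 103.60, 0, 0, "No"),
    (124, 56.33, 1, 0, "No"),
    (56, 214.60, 2, 0, "Yes"),
    (2410, 305.60, 0, 5, "No"),
    (536, 145.70, 0, 0, "No"),
    (234, 122.09, 0, 1, "No"),
    (201, 185.76, 1, 7, "Yes"),
    (111, 83.60, 3, 2, "No"),
]


def train_memorizer(train_rows):
    """Memorize every training row; fall back to the majority class
    for any row that was not seen during training."""
    table = {row[:-1]: row[-1] for row in train_rows}
    fallback = Counter(row[-1] for row in train_rows).most_common(1)[0][0]
    return lambda features: table.get(features, fallback)


def accuracy(model, rows):
    """Fraction of rows whose label the model predicts correctly."""
    return sum(model(row[:-1]) == row[-1] for row in rows) / len(rows)


train_rows, test_rows = ROWS[:6], ROWS[6:]
model = train_memorizer(train_rows)

print("training accuracy:", accuracy(model, train_rows))
print("held-out accuracy:", accuracy(model, test_rows))
```

The memorizer is perfect on the rows it was trained on and gets only one of the two held-out rows right, so the training score says little about how it behaves on future data.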
And Now?
• Is the model good enough?
• If not, try:
  • Different modeling approaches (model types, parameter
    tuning)
  • Better features (more information, transformations of the
    information you already have)
• The more you fiddle with things, the more you contaminate
your results (through overfitting)
• Thus, if it’s “good enough”, it’s often best to leave it alone
Explanations
Field Importance
• While our model is good, we don’t really have a good high-level
overview of why it thinks what it thinks
• BigML supervised models provide this in the form of field
importance under the model summary report
Individual Explanations
• Individual predictions can be explained as well (as the model’s
reasoning for a particular point can be different from the model
at large)
• Use the magnifying glass in the prediction form
Two Takeaways
• When beginning a machine learning project, the more concrete
the goal, the better. Numbers are the lifeblood of analytics, so if
you can’t quantify your objective(s), success is unlikely
• Machine Learning isn’t the right solution for every problem! Be
wary of your algorithm being replaced by a human!
• “Before embarking on an ambitious project, try to kill it.” - Edsger
Dijkstra
DutchMLSchool. ML Business Perspective

  • 1. 1st edition | July 8-11, 2019
  • 2. BigML, Inc #DutchMLSchool 2 ML: Business Perspective A Gentle Introduction to Machine Learning Charles Parker VP, Machine Learning Algorithms
  • 3. BigML, Inc #DutchMLSchool 3 In This Talk • A simple introduction to supervised machine learning • An introduction to some of the core concepts of the BigML platform • A tiny peek behind the curtain to see what really happens when ML algorithms learn a model • Ways to evaluate and interpret your model’s predictions
  • 4. BigML, Inc #DutchMLSchool 4 A Churn Problem • You are the CEO of a mobile phone company (congratulations!) • Some percentage of your customers leave the service (or “churn” every month) • You have a budget to reach out to some customers each month to try to persuade them to stay with the service (with, for example, incentives) • But to do that, you need to find out who those customers are
  • 5. BigML, Inc #DutchMLSchool 5 Begin with the End In Mind • Currently, you have a simple targeting strategy designed by hand that identifies the 10,000 most likely customers to churn • For every five people you call, two are actually thinking about leaving (4,000 for a precision of 40%) • Of these customers, your operators can convince half to stay (so about 2000) • Each of these saved customers has a net value of $500 • What if you could increase the precision of your targeting to 50%?
  • 6. BigML, Inc #DutchMLSchool 6 You Have The Data! Minutes Used Last Month’s Bill Calls To Support Website Visits Churn? 104 103,60 0 0 No 124 56,33 1 0 No 56 214,60 2 0 Yes 2410 305,60 0 5 No 536 145,70 0 0 No 234 122,09 0 1 No 201 185,76 1 7 Yes 111 83,60 3 2 No
  • 7. BigML, Inc #DutchMLSchool 7 Now . . . Magic! • Can we use this data to create a better targeting strategy? (Spoiler: Yes!) • Can we use the very same data to measure the effectiveness of that strategy? (Spoiler: Yes!) • And how do we do that? (Spoiler: MACHINE LEARNING)
  • 8. BigML, Inc #DutchMLSchool 8 Aside: BigML Resources • We can now upload that data to BigML • Everything created on BigML is a resource • Resources are: • Mostly immutable: You can’t “screw them up” • Assigned a unique ID • Always available via both the API and the UI • Working with BigML is a process of creating resources
  • 9. BigML, Inc #DutchMLSchool 9 Data Sources @ BigML • A data source is a raw data file that you upload to the BigML platform • We make some initial guesses about the number and type of columns in the file, and a bit about their content (such as the language for text fields) • Data can come from uploaded CSVs, Google drive, dropbox, a random URL, and so on
  • 10. BigML, Inc #DutchMLSchool 10 Datasets @ BigML • A BigML dataset represents processed row-column data • We’ve made a final determination of the number and type of columns in the source • Some summary stats have been calculated for each column
  • 11. BigML, Inc #DutchMLSchool 11 Supervised Machine Learning • Collect training data from the past about your prediction problem, including the right answer (e.g., statistics for each customer month and whether or not the customer churned at the end of that month) • Feed that data to a machine learning algorithm • The algorithm creates a program (that we typically call a model, or classifier or predictor) which can make that prediction for you on future data
  • 12. BigML, Inc #DutchMLSchool 12 Traditional: Expert and Programmer • Machine learning breaks the expert system paradigm • To make an expert software system before machine learning, you used an expert and a programmer • The expert’s job was to know how the system should work and be able to communicate that knowledge • The programmer’s job was to convert the expert’s knowledge into a running computer program • These could be the same person, but you must have both of them
  • 13. BigML, Inc #DutchMLSchool 13 Now: Data and Algorithm • Instead of an expert we have data • Data can be easier to get (and is in some cases already there) • You can get a volume of data much larger than any expert could possibly see • Humans are notoriously bad at being good at things: • https://www.newscientist.com/article/mg21628930-400-specialist-knowledge-is-useless-and-unhelpful • Instead of a programmer we have a learning algorithm • Once you have the data in the proper format, learning algorithms work much faster (enabling iteration) • Learning algorithms are modular
  • 14. Back to the Data
    Minutes Used | Last Month’s Bill | Calls To Support | Website Visits | Churn?
    104 | 103,60 | 0 | 0 | No
    124 | 56,33 | 1 | 0 | No
    56 | 214,60 | 2 | 0 | Yes
    2410 | 305,60 | 0 | 5 | No
    536 | 145,70 | 0 | 0 | No
    234 | 122,09 | 0 | 1 | No
    201 | 185,76 | 1 | 7 | Yes
    111 | 83,60 | 3 | 2 | No
  • 15. The Goal: A Program that Predicts • The goal of learning is to take this sort of training data and create a program (a model or classifier or predictor) • This model takes as input a single row with a value for each of the columns given in the training data • The model will output its predicted value for the objective based on the given column values • Importantly, this row can contain any values for the given columns, not just the ones seen in the training data
  • 16. Just a Little Peek Under the Curtain
  • 17. Behind The Scenes • A learning algorithm is: • A space of models that can be learned (a hypothesis space) • A clever way of searching through that space to find a “good” model • A good model is one that, for example, makes accurate predictions on the training data • So “machine learning” is finding a model amongst all possible models that has a good “fit” with the training data
  • 18. A Simple Hypothesis Space • Suppose we tell our machine to split the data into two parts based on some threshold of some feature • If a data point is on one side of the threshold, we’ll predict the majority class of all the training points on that side • We can measure how many points in the training data would be correctly predicted using this method • This is how good our “fit” is to the training data • The best threshold is the one with the best fit (and we will try them all)
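The single-threshold hypothesis space described on this slide can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not BigML's implementation: the field names, the helper names (`majority_hits`, `fit_of_split`, `best_split`), and the choice to try only the midpoints between consecutive distinct values are all made up for the example. The rows are the eight customers from the churn table, with bills written using a decimal point.

```python
from collections import Counter

FIELDS = ["minutes_used", "last_bill", "support_calls", "website_visits"]

# (minutes_used, last_bill, support_calls, website_visits, churned)
ROWS = [
    (104, 103.60, 0, 0, "No"),
    (124, 56.33, 1, 0, "No"),
    (56, 214.60, 2, 0, "Yes"),
    (2410, 305.60, 0, 5, "No"),
    (536, 145.70, 0, 0, "No"),
    (234, 122.09, 0, 1, "No"),
    (201, 185.76, 1, 7, "Yes"),
    (111, 83.60, 3, 2, "No"),
]

def majority_hits(labels):
    """How many of these labels are right if we always guess the majority class."""
    return max(Counter(labels).values()) if labels else 0

def fit_of_split(rows, column, threshold):
    """Training rows predicted correctly by the rule `value > threshold`."""
    above = [row[-1] for row in rows if row[column] > threshold]
    below = [row[-1] for row in rows if row[column] <= threshold]
    return majority_hits(above) + majority_hits(below)

def best_split(rows):
    """Try every feature and every candidate threshold; keep the best fit."""
    best = None
    for column, name in enumerate(FIELDS):
        values = sorted({row[column] for row in rows})
        # Candidate thresholds: midpoints between consecutive distinct values.
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            correct = fit_of_split(rows, column, threshold)
            if best is None or correct > best[2]:
                best = (name, threshold, correct)
    return best

name, threshold, correct = best_split(ROWS)
print(f"{name} > {threshold:g}: {correct} of {len(ROWS)} training rows correct")
```

Several thresholds tie for the best fit on this tiny table; the sketch simply keeps the first one it finds. No single threshold classifies all eight rows correctly, which is one reason real learners use richer hypothesis spaces.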
  • 19. Back to the Data
    Minutes Used | Last Month’s Bill | Calls To Support | Website Visits | Churn?
    104 | 103,60 | 0 | 0 | No
    124 | 56,33 | 1 | 0 | No
    56 | 214,60 | 2 | 0 | Yes
    2410 | 305,60 | 0 | 5 | No
    536 | 145,70 | 0 | 0 | No
    234 | 122,09 | 0 | 1 | No
    201 | 185,76 | 1 | 7 | Yes
    111 | 83,60 | 3 | 2 | No
  • 20. Minutes Used > 200
    Minutes Used | Last Month’s Bill | Calls To Support | Website Visits | Churn?
    104 | 103,60 | 0 | 0 | No
    124 | 56,33 | 1 | 0 | No
    56 | 214,60 | 2 | 0 | Yes
    2410 | 305,60 | 0 | 5 | No
    536 | 145,70 | 0 | 0 | No
    234 | 122,09 | 0 | 1 | No
    201 | 185,76 | 1 | 7 | Yes
    111 | 83,60 | 3 | 2 | No
  • 21. Website Visits > 0
    Minutes Used | Last Month’s Bill | Calls To Support | Website Visits | Churn?
    104 | 103,60 | 0 | 0 | No
    124 | 56,33 | 1 | 0 | No
    56 | 214,60 | 2 | 0 | Yes
    2410 | 305,60 | 0 | 5 | No
    536 | 145,70 | 0 | 0 | No
    234 | 122,09 | 0 | 1 | No
    201 | 185,76 | 1 | 7 | Yes
    111 | 83,60 | 3 | 2 | No
  • 22. Last Bill > $180
    Minutes Used | Last Month’s Bill | Calls To Support | Website Visits | Churn?
    104 | 103,60 | 0 | 0 | No
    124 | 56,33 | 1 | 0 | No
    56 | 214,60 | 2 | 0 | Yes
    2410 | 305,60 | 0 | 5 | No
    536 | 145,70 | 0 | 0 | No
    234 | 122,09 | 0 | 1 | No
    201 | 185,76 | 1 | 7 | Yes
    111 | 83,60 | 3 | 2 | No
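To make the three candidate rules from slides 20 to 22 concrete, the following Python sketch scores each one against the eight rows of the table, predicting the majority class on each side of the threshold. The field names and the `score_rule` helper are hypothetical, and bills are written with a decimal point.

```python
from collections import Counter

FIELDS = ("minutes_used", "last_bill", "support_calls", "website_visits", "churn")
TABLE = [
    (104, 103.60, 0, 0, "No"),
    (124, 56.33, 1, 0, "No"),
    (56, 214.60, 2, 0, "Yes"),
    (2410, 305.60, 0, 5, "No"),
    (536, 145.70, 0, 0, "No"),
    (234, 122.09, 0, 1, "No"),
    (201, 185.76, 1, 7, "Yes"),
    (111, 83.60, 3, 2, "No"),
]
ROWS = [dict(zip(FIELDS, values)) for values in TABLE]

def score_rule(rows, field, threshold):
    """Split on `field > threshold` and predict each side's majority class.

    Returns (prediction_above, prediction_below, rows_predicted_correctly).
    """
    predictions = []
    correct = 0
    for is_above in (True, False):
        labels = [r["churn"] for r in rows if (r[field] > threshold) == is_above]
        if labels:
            label, hits = Counter(labels).most_common(1)[0]
            predictions.append(label)
            correct += hits
        else:
            predictions.append(None)  # no training rows on this side
    return predictions[0], predictions[1], correct

# The three rules shown on the slides.
RULES = [("minutes_used", 200), ("website_visits", 0), ("last_bill", 180)]

for field, threshold in RULES:
    above, below, correct = score_rule(ROWS, field, threshold)
    print(f"{field} > {threshold}: predict {above} above, {below} below, "
          f"{correct}/{len(ROWS)} correct")
```

On these eight rows the first two rules get 6 of 8 right, which is no better than always predicting "No", while the bill rule gets 7 of 8 right and is the only one of the three whose two sides predict different classes.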
  • 23. So Far, So Good! • This is basically what machine learning algorithms do • Try a solution and see how well it fits the training data • If “not well”, take some steps to “improve” it • There are many, many different ways of doing it, but this is usually what it boils down to
  • 24. Evaluating and Improving
  • 25. Now What? • The next thing is to use the data to test the model • Split the data into training and test sets (machine learning is very good at memorizing the data) • Train a model on the training set • Evaluate it using the test set (or, the “held out data”) • We’ll get to the evaluation tool more fully later on
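The split, train, and evaluate steps on this slide can be sketched as follows. This is an illustrative Python example, not the BigML evaluation tool: it reuses a single-threshold model as the learner, and the helper names (`split`, `train_stump`, `predict`, `accuracy`), the 25% test fraction, and the fixed random seed are all assumptions.

```python
import random
from collections import Counter

# (minutes_used, last_bill, support_calls, website_visits, churned)
ROWS = [
    (104, 103.60, 0, 0, "No"),
    (124, 56.33, 1, 0, "No"),
    (56, 214.60, 2, 0, "Yes"),
    (2410, 305.60, 0, 5, "No"),
    (536, 145.70, 0, 0, "No"),
    (234, 122.09, 0, 1, "No"),
    (201, 185.76, 1, 7, "Yes"),
    (111, 83.60, 3, 2, "No"),
]

def split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and hold out a fraction as the test set."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def majority(labels):
    return Counter(labels).most_common(1)[0][0]

def train_stump(rows):
    """Find the (column, threshold, label_above, label_below) with the best training fit."""
    best, best_correct = None, -1
    for column in range(4):
        values = sorted({row[column] for row in rows})
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            above = [row[-1] for row in rows if row[column] > threshold]
            below = [row[-1] for row in rows if row[column] <= threshold]
            label_above, label_below = majority(above), majority(below)
            correct = above.count(label_above) + below.count(label_below)
            if correct > best_correct:
                best, best_correct = (column, threshold, label_above, label_below), correct
    return best

def predict(model, row):
    column, threshold, label_above, label_below = model
    return label_above if row[column] > threshold else label_below

def accuracy(model, rows):
    return sum(predict(model, row) == row[-1] for row in rows) / len(rows)

train, test = split(ROWS)
model = train_stump(train)  # the model never sees the test rows
print(f"training accuracy: {accuracy(model, train):.2f}")
print(f"held-out accuracy: {accuracy(model, test):.2f}")
```

With only eight rows the held-out estimate is extremely noisy, so treat the printed numbers as a demonstration of the procedure rather than a meaningful measurement. The point is that the second number comes from rows the model could not have memorized.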
  • 26. And Now? • Is the model good enough? • If not: • Different modeling approaches (model types, parameter tuning) • Better features (more information, transformations of the information you already have) • The more you fiddle with things, the more you contaminate your results (through overfitting) • Thus, if it’s “good enough”, it’s often best to leave it alone
  • 27. Explanations
  • 28. Field Importance • While our model is good, we don’t really have a good high level overview of why it thinks what it thinks • BigML supervised models provide this in the form of field importance under the model summary report
  • 29. Individual Explanations • Individual predictions can be explained as well (as the model’s reasoning for a particular point can be different from the model at large) • Use the magnifying glass in the prediction form
  • 30. Two Takeaways • When beginning a machine learning project, the more concrete the goal, the better. Numbers are the lifeblood of analytics, so if you can’t quantify your objective(s), success is unlikely • Machine Learning isn’t the right solution for every problem! Be wary of your algorithm being replaced by a human! • “Before embarking on an ambitious project, try to kill it.” - Edsger Dijkstra