December 8-9, 2016
BigML, Inc 2
Poul Petersen
CIO, BigML, Inc.
Ensembles
Making Trees Unstoppable
Ensemble Idea
• Rather than build a single model…
• Combine the output of several “weaker” models
into a powerful ensemble…
• Q1: Why would this work?
• Q2: How do we build “weaker” models?
• Q3: How do we “combine” models?
Why Ensembles
1. Every “model” is an approximation of the “real” function
and there may be several good approximations.
2. ML Algorithms use random processes to solve NP-hard
problems and may arrive at different “models” depending
on the starting conditions, local optima, etc.
3. A given ML algorithm may not be able to exactly “model”
the real characteristics of a particular dataset.
4. Anomalies in the data may cause over-fitting, that is, trying
to model behavior that should be ignored. By using several
models, the outliers may be averaged out.
In any case, if we find several accurate “models”, the
combination may be closer to the real “model”.
Ensemble Demo #1
Weaker Models?
1. Bootstrap Aggregating - aka “Bagging”. If there are “n”
instances, each tree is trained with “n” instances, but they
are sampled with replacement.
2. Random Decision Forest - In addition to sampling with
replacement, the tree randomly selects a subset of
features to consider when making each split. This
introduces a new parameter, the random candidates,
which is the number of features to randomly select before
making the split.
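The two sources of randomness described above can be sketched in a few lines of Python. This is a minimal illustration, not BigML's implementation: the function names `bootstrap_sample` and `random_candidates`, the toy row list, and the feature names are all assumptions made for the example.

```python
import random

def bootstrap_sample(instances, rng):
    """Bagging: draw len(instances) rows with replacement."""
    n = len(instances)
    return [instances[rng.randrange(n)] for _ in range(n)]

def random_candidates(features, k, rng):
    """Random decision forest: pick k distinct features to consider for one split."""
    return rng.sample(features, k)

rng = random.Random(0)                        # seeded only to make runs repeatable
rows = list(range(10))                        # stand-in for 10 training instances
features = ["diameter", "color", "shape"]     # hypothetical feature names

sample = bootstrap_sample(rows, rng)          # same size as rows, duplicates allowed
candidates = random_candidates(features, 2, rng)
print(sample)
print(candidates)
```

Because sampling is with replacement, some rows typically appear more than once in `sample` and others not at all, which is what makes each tree slightly different.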
Over-Fitting Example
Diameter  Color  Shape  Fruit
4         red    round  plum
5         red    round  apple
5         red    round  apple
6         red    round  plum
7         red    round  apple
What is a round, red 6cm fruit?
All Data: “plum”
Sample 1: “plum”
Sample 2: “apple”
Sample 3: “apple”
Samples 1-3 combined: “apple”
Bagging!
Random Decision Forest!
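The effect in this example can be reproduced with a small Python sketch. It rests on two assumptions that are not in the slides: the "model" is a simple stand-in that copies the label of the closest diameter (not a real decision tree), and the three bootstrap samples are hand-picked row indices so the result is repeatable.

```python
from collections import Counter

# (diameter_cm, fruit) rows from the table; color and shape are constant.
DATA = [(4, "plum"), (5, "apple"), (5, "apple"), (6, "plum"), (7, "apple")]

def nearest_label(train, diameter):
    """Stand-in model: most common label among the rows closest in diameter."""
    best = min(abs(d - diameter) for d, _ in train)
    labels = [fruit for d, fruit in train if abs(d - diameter) == best]
    return Counter(labels).most_common(1)[0][0]

# Hypothetical bootstrap samples: row indices drawn with replacement.
SAMPLES = [
    [0, 1, 3, 3, 4],  # contains the 6 cm plum
    [0, 1, 1, 2, 4],  # misses it
    [1, 2, 2, 4, 4],  # misses it
]

single = nearest_label(DATA, 6)                       # model trained on all data
votes = [nearest_label([DATA[i] for i in idx], 6) for idx in SAMPLES]
bagged = Counter(votes).most_common(1)[0][0]          # plurality vote of the samples
print(single, votes, bagged)
```

The single model memorizes the lone 6 cm plum, while two of the three samples never see it and vote with the surrounding apples.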
Voting Methods
1. Plurality - the class with the most votes wins.
2. Confidence Weighted - the class with the most votes wins, but
each vote is weighted by the confidence.
3. Probability Weighted - each tree votes the distribution at
its leaf node.
4. K Threshold - only votes the specified class if the required
number of trees is met. For example, allowing a “True” vote if
and only if at least 9 out of 10 trees vote “True”.
5. Confidence Threshold - only votes the specified class if
the minimum confidence is met.
Linear and non-linear combinations of votes using stacking
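A rough Python sketch of the first four voting methods follows. The per-tree votes, confidences, and leaf distributions are invented numbers, and the function names and the `fallback` argument are assumptions for illustration rather than any particular library's API.

```python
from collections import defaultdict

# Each hypothetical tree reports (predicted class, confidence, leaf distribution).
TREE_VOTES = [
    ("True", 0.60, {"True": 0.60, "False": 0.40}),
    ("True", 0.55, {"True": 0.55, "False": 0.45}),
    ("True", 0.52, {"True": 0.52, "False": 0.48}),
    ("False", 0.95, {"True": 0.05, "False": 0.95}),
    ("False", 0.90, {"True": 0.10, "False": 0.90}),
]

def _winner(scores):
    return max(scores, key=scores.get)

def plurality(votes):
    scores = defaultdict(float)
    for label, _, _ in votes:
        scores[label] += 1                 # one vote per tree
    return _winner(scores)

def confidence_weighted(votes):
    scores = defaultdict(float)
    for label, conf, _ in votes:
        scores[label] += conf              # vote counts as much as its confidence
    return _winner(scores)

def probability_weighted(votes):
    scores = defaultdict(float)
    for _, _, dist in votes:
        for label, p in dist.items():
            scores[label] += p             # each tree votes its whole distribution
    return _winner(scores)

def k_threshold(votes, label, k, fallback):
    count = sum(1 for voted, _, _ in votes if voted == label)
    return label if count >= k else fallback

print(plurality(TREE_VOTES), confidence_weighted(TREE_VOTES),
      probability_weighted(TREE_VOTES), k_threshold(TREE_VOTES, "True", 4, "False"))
```

With these numbers the methods disagree: three lukewarm "True" votes win the plurality, but two very confident "False" votes win once confidence or probability is taken into account.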
Ensemble Demo #2
Model vs Bagging vs RF
Model → Bagging → Random Forest
• Increasing Performance
• Decreasing Interpretability
• Increasing Stochasticity
• Increasing Complexity
Ensemble Demo #3
SMACdown
• How many trees?
• How many nodes?
• Missing splits?
• Random candidates?
• Too many parameters?
Poul Petersen
CIO, BigML, Inc.
Logistic Regression
Modeling Probabilities
Logistic Regression
• Classification implies a discrete objective. How
can this be a regression?
• Why do we need another classification
algorithm?
• more questions….
Logistic Regression is a classification algorithm
Linear Regression
Polynomial Regression
Regression
• What function can we fit to discrete data?
Key Take-Away: Fitting a function to the data
Discrete Data Function?
????
Logistic Function
• x→-∞ : f(x)→0
• x→∞ : f(x)→1
• Looks promising, but still not "discrete"
Probabilities
P≈0, 0<P<1, P≈1
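The limits of the logistic function, and its reading as a probability between 0 and 1, can be checked numerically. The sketch below uses the standard form 1 / (1 + e^-x); the function name `logistic` and the sample points are choices made for this example.

```python
import math

def logistic(x):
    """Standard logistic function 1 / (1 + e^-x), written to avoid overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

# Far left the value approaches 0, far right it approaches 1,
# and in between it moves smoothly through 0.5 at x = 0.
for x in (-20, -2, 0, 2, 20):
    print(x, logistic(x))
```

The output is never exactly 0 or 1, which is why it is read as a probability rather than as a discrete class.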
Logistic Regression
• Assumes that output is linearly related to "predictors"
… but we can "fix" this with feature engineering
• How do we "fit" the logistic function to real data?
LR is a classification algorithm … that models
the probability of the output class.
Logistic Regression
β₀ is the "intercept"
β₁ is the "coefficient"
The inverse of the logistic function is called the "logit":
In which case solving is now a linear regression
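For reference, the relationship described here is commonly written as follows, assuming a single predictor $x$ and writing $P(x)$ for the modeled probability (this notation is chosen for the example and is not taken from the slide):

```latex
P(x) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x)}}
\qquad\Longleftrightarrow\qquad
\operatorname{logit}(P) = \ln\frac{P}{1 - P} = \beta_0 + \beta_1 x
```

The right-hand side is linear in $x$, which is the sense in which fitting becomes a linear problem in the logit.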
Logistic Regression
If we have multiple dimensions, add more coefficients:
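With several predictors, the usual form adds one coefficient per feature inside the logistic function. A minimal Python sketch, assuming made-up coefficient values and a hypothetical helper named `predict_probability`:

```python
import math

def predict_probability(intercept, coefficients, features):
    """P = logistic(b0 + b1*x1 + ... + bn*xn)."""
    if len(coefficients) != len(features):
        raise ValueError("one coefficient per feature is required")
    z = intercept + sum(b * x for b, x in zip(coefficients, features))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical intercept and coefficients for two features.
B0, COEFS = -1.0, [0.5, 2.0]
p_boundary = predict_probability(B0, COEFS, [2.0, 0.0])   # linear term is 0
p_positive = predict_probability(B0, COEFS, [2.0, 1.0])   # second feature pushes it up
print(p_boundary, p_positive)
```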
Logistic Regression
Demo #1
LR Parameters
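1. Bias: Allows an intercept term.
Important if P(x=0) != 0.5, since without an intercept the
logistic function gives exactly 0.5 at x=0.
2. Regularization:
• L1: prefers zeroing individual coefficients
• L2: prefers pushing all coefficients towards zero
3. EPS: The minimum change in error between steps; fitting
stops once the improvement falls below it.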
4. Auto-scaling: Ensures that all features contribute
equally.
• Unless there is a specific need to not auto-scale,
it is recommended.
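To make the bias, regularization, and EPS parameters concrete, here is a small gradient-descent sketch in Python. It is an illustration under stated assumptions, not how BigML fits its models: the function `fit_logistic`, the learning rate, the toy data, and the reading of EPS as "stop when the loss changes by less than eps" are all choices made for this example, and only an L2 penalty is shown.

```python
import math

def fit_logistic(xs, ys, bias=True, l2=0.0, eps=1e-6, lr=0.1, max_steps=10000):
    """Fit P(y=1|x) = logistic(b0 + b1*x) by gradient descent on mean log-loss."""
    b0, b1 = 0.0, 0.0
    prev = float("inf")
    n = len(xs)
    for _ in range(max_steps):
        ps = [1.0 / (1.0 + math.exp(-(b0 + b1 * x))) for x in xs]
        loss = -sum(y * math.log(p) + (1 - y) * math.log(1 - p)
                    for p, y in zip(ps, ys)) / n
        loss += l2 * b1 * b1                 # L2 penalty on the coefficient only
        if abs(prev - loss) < eps:           # EPS-style stopping rule (assumed)
            break
        prev = loss
        g1 = sum((p - y) * x for p, y, x in zip(ps, ys, xs)) / n + 2 * l2 * b1
        b1 -= lr * g1
        if bias:                             # without bias, b0 stays at 0
            g0 = sum(p - y for p, y in zip(ps, ys)) / n
            b0 -= lr * g0
    return b0, b1

# Toy, non-separable data: larger x makes class 1 more likely.
XS = [0, 1, 2, 3, 4, 5, 6, 7]
YS = [0, 0, 0, 1, 0, 1, 1, 1]

b0_plain, b1_plain = fit_logistic(XS, YS)
b0_reg, b1_reg = fit_logistic(XS, YS, l2=1.0)
b0_nobias, b1_nobias = fit_logistic(XS, YS, bias=False)
print(b1_plain, b1_reg, b0_nobias)
```

The L2 run ends with a coefficient closer to zero than the unregularized run, and the run without a bias term is forced to predict exactly 0.5 at x=0.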
Logistic Regression
Questions:
• How do we handle multiple classes?
• What about non-numeric inputs?
LR - Multi Class
• Instead of a binary class, ex: [ true, false ], we have
multi-class, ex: [ red, green, blue, … ]
• k classes
• solve one-vs-rest LR
• coefficients βᵢ for each class
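One-vs-rest prediction can be sketched as follows: each class has its own intercept and coefficients, each per-class model scores the instance, and the highest score wins. The coefficient values, the function name `one_vs_rest_predict`, and the final normalization step are assumptions made for illustration.

```python
import math

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def one_vs_rest_predict(models, features):
    """models maps class -> (intercept, coefficients); returns (class, probabilities)."""
    raw = {
        label: logistic(b0 + sum(b * x for b, x in zip(coefs, features)))
        for label, (b0, coefs) in models.items()
    }
    total = sum(raw.values())
    probs = {label: p / total for label, p in raw.items()}   # normalize to sum to 1
    return max(probs, key=probs.get), probs

# Hypothetical coefficients for k = 3 classes and two numeric features.
MODELS = {
    "red": (0.0, [2.0, -1.0]),
    "green": (0.0, [-1.0, 2.0]),
    "blue": (-1.0, [0.0, 0.0]),
}
label, probs = one_vs_rest_predict(MODELS, [1.0, 0.0])
print(label, probs)
```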
LR - Field Codings
• LR is expecting numeric values to perform regression.
• How do we handle categorical values, or text?
One-hot encoding
Only one feature is "hot" for each class
Class  color=red  color=blue  color=green  color=NULL
red    1          0           0            0
blue   0          1           0            0
green  0          0           1            0
NULL   0          0           0            1
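A one-hot encoder for the color field can be written in a few lines. In this sketch the function name `one_hot` is hypothetical and Python's `None` stands in for NULL.

```python
def one_hot(value, categories):
    """Return one 0/1 indicator per category; exactly one of them is 1."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1 if value == c else 0 for c in categories]

CATEGORIES = ["red", "blue", "green", None]   # None stands in for NULL
for v in CATEGORIES:
    print(v, one_hot(v, CATEGORIES))
```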
LR - Field Codings
Dummy Encoding
Chooses a *reference class*
Requires one less degree of freedom
Class  color_1  color_2  color_3
*red*  0        0        0
blue   1        0        0
green  0        1        0
NULL   0        0        1
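Dummy encoding differs from one-hot only in dropping the column for the reference class. A minimal sketch, with a hypothetical function name `dummy_encode` and `None` standing in for NULL:

```python
def dummy_encode(value, categories, reference):
    """One 0/1 column per non-reference category; the reference maps to all zeros."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    others = [c for c in categories if c != reference]
    return [1 if value == c else 0 for c in others]

CATEGORIES = ["red", "blue", "green", None]   # None stands in for NULL
for v in CATEGORIES:
    print(v, dummy_encode(v, CATEGORIES, reference="red"))
```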
LR - Field Codings
Contrast Encoding
Field values must sum to zero
Allows comparison between classes
Class  field  "influence"
red    0.5    positive
blue   -0.25  negative
green  -0.25  negative
NULL   0      excluded
… so which one?
LR - Field Codings
Text / Items ?
• The "text" type gives us new features that have
counts of the number of times each token occurs in
the text field. "Items" can be treated the same way.
token       "hippo"  "safari"  "zebra"
instance_1  3        0         1
instance_2  0        11        4
instance_3  0        0         0
instance_4  1        0         3
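Token counts like those in the table can be produced with a simple tokenizer and a counter. The sketch below assumes a lowercase, letters-only tokenization and an invented example sentence; real text analysis usually adds stemming, stop words, and similar options.

```python
import re
from collections import Counter

def token_counts(text, vocabulary):
    """Count how often each vocabulary token occurs in the text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(tokens)
    return [counts[t] for t in vocabulary]

VOCAB = ["hippo", "safari", "zebra"]
# Hypothetical text whose counts match instance_1 in the table.
row = token_counts("Hippo, hippo and a zebra: one more hippo!", VOCAB)
print(dict(zip(VOCAB, row)))
```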
Logistic Regression
Demo #2
Curvilinear LR
Instead of
We could add a feature
Where
????
Possible to add any higher order terms or other functions to
match shape of data
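One plausible reading of this slide, assuming the added feature is the square of the original predictor, is sketched below. The helper names `with_square` and `predict_probability` and the coefficient values are made up for illustration.

```python
import math

def with_square(x):
    """Feature vector [x, x^2]: the squared term is the added, engineered feature."""
    return [x, x * x]

def predict_probability(intercept, coefficients, features):
    z = intercept + sum(b * v for b, v in zip(coefficients, features))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients: the probability peaks near x = 0 and falls off on
# both sides, a shape that a model linear in x alone cannot produce.
B0, COEFS = 2.0, [0.0, -1.0]
probs = {x: predict_probability(B0, COEFS, with_square(x)) for x in (-3, 0, 3)}
print(probs)
```

The model is still linear in its features; only the feature set has changed, which is why this counts as feature engineering rather than a different algorithm.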
Logistic Regression
Demo #3
LR versus DT
Logistic Regression
• Expects a "smooth" linear relationship with predictors.
• LR is concerned with probability of a discrete outcome.
• Lots of parameters to get wrong: regularization, scaling, codings
• Slightly less prone to over-fitting
• Because it fits a shape, might work better when less data available.
Decision Tree
• Adapts well to ragged non-linear relationships
• No concern: classification, regression, multi-class all fine.
• Virtually parameter free
• Slightly more prone to over-fitting
• Prefers surfaces parallel to parameter axes, but given enough
data will discover any shape.
Logistic Regression
Demo #4
BSSML16 L2. Ensembles and Logistic Regressions

 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
 

BSSML16 L2. Ensembles and Logistic Regressions

  • 1. December 8-9, 2016
  • 2. BigML, Inc. Poul Petersen, CIO, BigML, Inc. Ensembles: Making Trees Unstoppable
  • 3. Ensemble Idea
      • Rather than build a single model…
      • Combine the output of several “weaker” models into a powerful ensemble…
      • Q1: Why would this work?
      • Q2: How do we build “weaker” models?
      • Q3: How do we “combine” models?
  • 4. Why Ensembles
      1. Every “model” is an approximation of the “real” function, and there may be several good approximations.
      2. ML algorithms use random processes to solve NP-hard problems and may arrive at different “models” depending on the starting conditions, local optima, etc.
      3. A given ML algorithm may not be able to exactly “model” the real characteristics of a particular dataset.
      4. Anomalies in the data may cause over-fitting, that is, trying to model behavior that should be ignored. By using several models, the outliers may be averaged out.
      In any case, if we find several accurate “models”, the combination may be closer to the real “model”.
  • 6. Weaker Models?
      1. Bootstrap Aggregating, aka “Bagging”: if there are “n” instances, each tree is trained with “n” instances, but they are sampled with replacement.
      2. Random Decision Forest: in addition to sampling with replacement, the tree randomly selects a subset of features to consider when making each split. This introduces a new parameter, the random candidates, which is the number of features to randomly select before making the split.
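The two sampling steps on this slide can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names `bootstrap_sample` and `random_candidates` are hypothetical, and it shows the sampling alone, not how BigML actually grows its trees.

```python
import random

def bootstrap_sample(instances, rng):
    """Bagging step: draw len(instances) rows with replacement."""
    n = len(instances)
    return [instances[rng.randrange(n)] for _ in range(n)]

def random_candidates(feature_names, k, rng):
    """Random forest step: pick k distinct features to consider at one split."""
    return rng.sample(feature_names, k)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
rows = list(range(10))  # stand-in for 10 training instances
sample = bootstrap_sample(rows, rng)

features = ["diameter", "color", "shape"]
candidates = random_candidates(features, 2, rng)
```

Because rows are drawn with replacement, a sample usually contains repeats and omits some of the original instances, which is what makes each tree slightly different.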
  • 7. Over-Fitting Example
      Diameter  Color  Shape  Fruit
      4         red    round  plum
      5         red    round  apple
      5         red    round  apple
      6         red    round  plum
      7         red    round  apple
      What is a round, red 6cm fruit?
      All Data: “plum”
      Bagging! Sample 1: “plum”, Sample 2: “apple”, Sample 3: “apple” } “apple”
      Random Decision Forest!
  • 8. Voting Methods
      1. Plurality - majority wins.
      2. Confidence Weighted - majority wins but each vote is weighted by the confidence.
      3. Probability Weighted - each tree votes the distribution at its leaf node.
      4. K Threshold - only votes the specified class if the required number of trees is met. For example, allowing a “True” vote if and only if at least 9 out of 10 trees vote “True”.
      5. Confidence Threshold - only votes the specified class if the minimum confidence is met.
      Linear and non-linear combinations of votes using stacking.
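Three of these voting rules are simple enough to sketch directly. The code below is a minimal illustration, not BigML's implementation: the function names are hypothetical, and the `fallback` argument in `k_threshold` is an assumption, since the slide does not say what is predicted when the threshold is not met.

```python
from collections import Counter, defaultdict

def plurality(votes):
    """votes: one class label per tree; the most common label wins."""
    return Counter(votes).most_common(1)[0][0]

def confidence_weighted(votes):
    """votes: (label, confidence) pairs; sum the confidences per label."""
    totals = defaultdict(float)
    for label, confidence in votes:
        totals[label] += confidence
    return max(totals, key=totals.get)

def k_threshold(votes, target, k, fallback):
    """Predict target only if at least k trees vote for it (fallback is assumed)."""
    return target if votes.count(target) >= k else fallback

# The three bagged samples from the fruit example vote plum, apple, apple.
fruit = plurality(["plum", "apple", "apple"])
```

With confidence weighting the outcome can differ from plurality: one very confident "plum" vote can outweigh two hesitant "apple" votes.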
  • 10. Model vs Bagging vs RF
      Model, Bagging, Random Forest (in that order): Increasing Performance, Decreasing Interpretability, Increasing Stochasticity, Increasing Complexity
  • 12. SMACdown
      • How many trees?
      • How many nodes?
      • Missing splits?
      • Random candidates?
      • Too many parameters?
  • 13. Poul Petersen, CIO, BigML, Inc. Logistic Regression: Modeling Probabilities
  • 14. Logistic Regression
      • Classification implies a discrete objective. How can this be a regression?
      • Why do we need another classification algorithm?
      • more questions….
      Logistic Regression is a classification algorithm
  • 15. Linear Regression
  • 16. Linear Regression
  • 17. Polynomial Regression
  • 18. Regression
      • What function can we fit to discrete data?
      Key Take-Away: Fitting a function to the data
  • 19. Discrete Data Function?
  • 20. Discrete Data Function? ????
  • 21. Logistic Function
      • x→-∞ : f(x)→0
      • x→∞ : f(x)→1
      • Looks promising, but still not "discrete"
  • 22. Probabilities: P≈0, 0<P<1, P≈1
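The limiting behaviour of the logistic function, and why its output can be read as a probability, can be checked numerically. This is a small sketch; the two-branch form is just a common way to avoid overflow for large negative inputs and is not taken from the slides.

```python
import math

def logistic(x):
    """Standard logistic function f(x) = 1 / (1 + e^(-x)), computed stably."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)  # safe: x < 0, so exp(x) is at most 1
    return z / (1.0 + z)

low = logistic(-50)   # close to 0
mid = logistic(0)     # exactly 0.5
high = logistic(50)   # close to 1
```

Every output lies strictly between 0 and 1, so it is not "discrete", but it can be interpreted as the probability of a class.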
  • 23. Logistic Regression
      • Assumes that output is linearly related to "predictors" … but we can "fix" this with feature engineering
      • How do we "fit" the logistic function to real data?
      LR is a classification algorithm … that models the probability of the output class.
  • 24. Logistic Regression: β₀ is the "intercept", β₁ is the "coefficient". The inverse of the logistic function is called the "logit": In which case solving is now a linear regression
  • 25. Logistic Regression: If we have multiple dimensions, add more coefficients:
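The formulas on slides 24 and 25 are not part of this transcript. As a reference sketch, these are the standard textbook forms, assuming the slides use the usual notation with x the predictor, P the probability of the positive class, and d the number of predictors (d is a symbol introduced here, not taken from the slides):

```latex
% single predictor: logistic function and its inverse, the logit
P(x) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x)}}
\qquad
\operatorname{logit}(P) = \ln\frac{P}{1 - P} = \beta_0 + \beta_1 x

% d predictors x_1, \dots, x_d: one coefficient per dimension
\operatorname{logit}(P) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_d x_d
```

The right-hand side of the logit is linear in the coefficients, which is why the slide describes solving it as a linear regression.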
  • 26. Logistic Regression Demo #1
  • 27. LR Parameters
      1. Bias: Allows an intercept term. Important if P(x=0) != 0.5, since without an intercept the logit at x=0 is 0 and so P(x=0) = 0.5.
      2. Regularization:
          • L1: prefers zeroing individual coefficients
          • L2: prefers pushing all coefficients towards zero
      3. EPS: The minimum error between steps to stop.
      4. Auto-scaling: Ensures that all features contribute equally.
          • Unless there is a specific need to not auto-scale, it is recommended.
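The slide does not say how auto-scaling is computed. One common choice, assumed here purely for illustration, is standardization: shift each numeric feature to mean 0 and rescale it to standard deviation 1, so that no feature dominates because of its units. The function name `standardize` is hypothetical.

```python
import statistics

def standardize(values):
    """Rescale a numeric column to mean 0 and (population) std deviation 1."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    if std == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]

diameters = [4, 5, 5, 6, 7]  # the Diameter column from the fruit example
scaled = standardize(diameters)
```

Scaling also matters for regularization, since L1 and L2 penalties act on coefficient size and coefficient size depends on feature scale.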
  • 28. Logistic Regression. Questions:
      • How do we handle multiple classes?
      • What about non-numeric inputs?
  • 29. LR - Multi Class
      • Instead of a binary class, ex: [ true, false ], we have multi-class, ex: [ red, green, blue, … ]
      • k classes
      • solve one-vs-rest LR
      • coefficients βᵢ for each class
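A one-vs-rest prediction can be sketched as follows: each class has its own intercept and coefficients, each binary model scores the input, and the highest score wins. The coefficient values are made up for illustration, and normalizing the scores so they sum to 1 is an assumption rather than something the slide states.

```python
import math

def logistic(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def one_vs_rest_predict(x, coefficients):
    """coefficients: {label: (intercept, [beta_1, ..., beta_d])}, one entry per class."""
    scores = {}
    for label, (intercept, betas) in coefficients.items():
        z = intercept + sum(b * xi for b, xi in zip(betas, x))
        scores[label] = logistic(z)
    total = sum(scores.values())
    probabilities = {label: s / total for label, s in scores.items()}
    return max(probabilities, key=probabilities.get), probabilities

# Hypothetical coefficients for k = 3 classes and two numeric inputs.
coefficients = {
    "red": (0.0, [1.0, 0.0]),
    "green": (0.0, [0.0, 1.0]),
    "blue": (0.0, [-1.0, -1.0]),
}
label, probabilities = one_vs_rest_predict([2.0, 0.0], coefficients)
```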
  • 30. LR - Field Codings
      • LR is expecting numeric values to perform regression.
      • How do we handle categorical values, or text?
      One-hot encoding: only one feature is "hot" for each class
      Class  color=red  color=blue  color=green  color=NULL
      red    1          0           0            0
      blue   0          1           0            0
      green  0          0           1            0
      NULL   0          0           0            1
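The one-hot table above can be reproduced with a short function. This is a minimal sketch; `one_hot` is a hypothetical name, and Python's `None` stands in for the NULL (missing) value.

```python
def one_hot(value, categories):
    """Return one 0/1 feature per category; exactly one is 1 for a known value."""
    return [1 if value == category else 0 for category in categories]

categories = ["red", "blue", "green", None]  # None plays the role of NULL
encoded_blue = one_hot("blue", categories)
encoded_null = one_hot(None, categories)
```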
  • 31. LR - Field Codings
      Dummy Encoding: chooses a *reference class*, requires one less degree of freedom
      Class  color_1  color_2  color_3
      *red*  0        0        0
      blue   1        0        0
      green  0        1        0
      NULL   0        0        1
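Dummy encoding is one-hot encoding with the reference class's column dropped, so the reference class becomes the all-zero row. A minimal sketch, with the hypothetical name `dummy_encode` and `None` standing in for NULL:

```python
def dummy_encode(value, categories, reference):
    """One 0/1 feature per non-reference category; the reference maps to all zeros."""
    columns = [category for category in categories if category != reference]
    return [1 if value == column else 0 for column in columns]

categories = ["red", "blue", "green", None]  # None plays the role of NULL
encoded_red = dummy_encode("red", categories, reference="red")
encoded_blue = dummy_encode("blue", categories, reference="red")
```

With four classes this yields three features, which matches the columns color_1 to color_3 in the table.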
  • 32. LR - Field Codings
      Contrast Encoding: field values must sum to zero. Allows comparison between classes …. so which one?
      Class  field  "influence"
      red    0.5    positive
      blue   -0.25  negative
      green  -0.25  negative
      NULL   0      excluded
  • 33. LR - Field Codings
      Text / Items ?
      • The "text" type gives us new features that have counts of the number of times each token occurs in the text field. "Items" can be treated the same way.
      token       "hippo"  "safari"  "zebra"
      instance_1  3        0         1
      instance_2  0        11        4
      instance_3  0        0         0
      instance_4  1        0         3
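Token counting of this kind can be sketched with the standard library. The tokenizer here (lowercase, alphabetic runs only) is an assumption made for the example; the slide does not describe how BigML tokenizes text, and the sample sentence is invented.

```python
import re
from collections import Counter

def token_counts(text, vocabulary):
    """Count how often each vocabulary token occurs in the text field."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(tokens)
    return [counts[token] for token in vocabulary]

vocabulary = ["hippo", "safari", "zebra"]
# An invented text that yields the same counts as instance_1 in the table.
row = token_counts("Hippo, hippo and zebra; another hippo.", vocabulary)
```

Each count becomes a numeric feature, so a text field can be fed to the regression just like any other numeric input.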
  • 34. Logistic Regression Demo #2
  • 35. Curvilinear LR. Instead of We could add a feature Where ???? Possible to add any higher order terms or other functions to match the shape of the data.
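The slide's own formulas are not in the transcript, so the following is only an assumed example of a "higher order term": appending the square of an existing numeric feature as a new column, which lets the otherwise linear logit follow a curved relationship. The name `add_squared_feature` is hypothetical.

```python
def add_squared_feature(rows, index):
    """Return copies of the rows with the square of column `index` appended."""
    return [list(row) + [row[index] ** 2] for row in rows]

rows = [[-2.0], [0.0], [3.0]]  # one numeric feature per instance
expanded = add_squared_feature(rows, 0)
```

The model stays linear in its coefficients; only the inputs change, which is the feature-engineering "fix" mentioned on slide 23.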
  • 36. Logistic Regression Demo #3
  • 37. LR versus DT
      Logistic Regression
      • Expects a "smooth" linear relationship with predictors.
      • LR is concerned with probability of a discrete outcome.
      • Lots of parameters to get wrong: regularization, scaling, codings
      • Slightly less prone to over-fitting
      • Because it fits a shape, it might work better when less data is available.
      Decision Tree
      • Adapts well to ragged non-linear relationships
      • No concern: classification, regression, multi-class all fine.
      • Virtually parameter free
      • Slightly more prone to over-fitting
      • Prefers surfaces parallel to parameter axes, but given enough data will discover any shape.
  • 38. Logistic Regression Demo #4