Data Mining with Decision Trees: An Introduction to CART®
Dan Steinberg, Mikhail Golovnya
http://www.salford-systems.com
In The Beginning…
The Years of Struggle
The Final Triumph
So What is CART?
Typical CART Solution
Root: PATIENTS = 215 (SURVIVE 178, 82.8%; DEAD 37, 17.2%). Is BP<=91?
  BP <= 91: Terminal Node A (SURVIVE 6, 30.0%; DEAD 14, 70.0%). NODE = DEAD
  BP > 91: PATIENTS = 195 (SURVIVE 172, 88.2%; DEAD 23, 11.8%). Is AGE<=62.5?
    AGE <= 62.5: Terminal Node B (SURVIVE 102, 98.1%; DEAD 2, 1.9%). NODE = SURVIVE
    AGE > 62.5: PATIENTS = 91 (SURVIVE 70, 76.9%; DEAD 21, 23.1%). Is SINUS<=.5?
      SINUS > .5: Terminal Node C (SURVIVE 14, 50.0%; DEAD 14, 50.0%). NODE = DEAD
      SINUS <= .5: Terminal Node D (SURVIVE 56, 88.9%; DEAD 7, 11.1%). NODE = SURVIVE
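The tree on this slide can be read as a set of nested if/else rules. The following is a minimal Python sketch of that reading, not the CART software's own output: the function name `classify` and its argument names are illustrative, and the assignment of the two SINUS branches to Terminal Nodes C and D is inferred from the order of the branch labels on the slide.

```python
# Terminal-node counts copied from the slide: (survive, dead, assigned class).
TERMINAL_NODES = {
    "A": (6, 14, "DEAD"),
    "B": (102, 2, "SURVIVE"),
    "C": (14, 14, "DEAD"),
    "D": (56, 7, "SURVIVE"),
}


def classify(bp, age, sinus):
    """Return (terminal node, assigned class) for one patient."""
    if bp <= 91:
        node = "A"
    elif age <= 62.5:
        node = "B"
    elif sinus > 0.5:  # assumption: SINUS > .5 leads to Terminal Node C
        node = "C"
    else:
        node = "D"
    return node, TERMINAL_NODES[node][2]


# Sanity check: the terminal nodes should add up to the root node counts.
total_survive = sum(survive for survive, _, _ in TERMINAL_NODES.values())
total_dead = sum(dead for _, dead, _ in TERMINAL_NODES.values())
print(total_survive, total_dead)

print(classify(bp=85, age=70, sinus=0))
print(classify(bp=120, age=70, sinus=1))
```

The terminal-node counts sum to the 178 survivors and 37 deaths shown at the root, which is a quick way to confirm the tree was transcribed consistently.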
How to Read It
Decision Questions
The Tree is a Classifier
The Importance of Binary Splits
Accuracy of a Tree
Prediction Success Table
                          Classified As
TRUE Class                Class 1 "Survivors"   Class 2 "Early Deaths"   Total   % Correct
Class 1 "Survivors"       158                   20                       178     88%
Class 2 "Early Deaths"    9                     28                       37      75%
Total                     167                   48                       215     86.51%
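The "% Correct" column can be recomputed directly from the cell counts. The sketch below does so in plain Python; the dictionary layout and the helper name `percent_correct` are illustrative choices, not part of the original deck.

```python
# Rows are true classes, columns are predicted classes (counts from the slide).
table = {
    "Survivors": {"Survivors": 158, "Early Deaths": 20},
    "Early Deaths": {"Survivors": 9, "Early Deaths": 28},
}


def percent_correct(table):
    """Return (per-class % correct, overall % correct) for a confusion table."""
    per_class = {}
    correct = total = 0
    for true_class, row in table.items():
        n = sum(row.values())
        per_class[true_class] = 100.0 * row[true_class] / n
        correct += row[true_class]
        total += n
    return per_class, 100.0 * correct / total


per_class, overall = percent_correct(table)
for name, pct in per_class.items():
    print(f"{name}: {pct:.2f}%")
print(f"Overall: {overall:.2f}%")
```

The per-class figures come out slightly above the whole-number percentages printed on the slide, which appear to have been truncated rather than rounded; the overall figure matches the slide's 86.51%.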
Tree Interpretation and Use
Binary Recursive Partitioning
Three Major Stages
General Workflow
Historical Data is partitioned into Learn, Test, and Validate.
Stage 1 (Learn): Build a Sequence of Nested Trees
Stage 2 (Test): Monitor Performance, select the Best tree
Stage 3 (Validate): Confirm Findings
Large Trees and Nearest Neighbor Classifier
Derivation
Illustration
Impact on the Relative Error
Searching All Possible Splits
Split Tables
[Tables: Sorted by Age; Sorted by Blood Pressure]
Categorical Predictors
     Left    Right
1    A       B, C, D
2    B       A, C, D
3    C       A, B, D
4    D       A, B, C
5    A, B    C, D
6    A, C    B, D
7    A, D    B, C
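The seven rows above are every distinct way to divide four levels into two non-empty groups; in general a categorical predictor with K levels admits 2^(K-1) - 1 such binary splits. The sketch below enumerates them in the same order as the table. The function name `binary_splits` is illustrative, and this brute-force enumeration is only meant to show the counting, not how CART searches splits internally.

```python
from itertools import combinations


def binary_splits(levels):
    """Yield each distinct (left, right) partition of the levels exactly once."""
    levels = list(levels)
    k = len(levels)
    for size in range(1, k // 2 + 1):
        for left in combinations(levels, size):
            # When both sides have the same size, (left, right) and
            # (right, left) describe the same split; keep only the one
            # whose left side contains the first level.
            if 2 * size == k and levels[0] not in left:
                continue
            right = tuple(v for v in levels if v not in left)
            yield left, right


for i, (left, right) in enumerate(binary_splits("ABCD"), start=1):
    print(i, ", ".join(left), "|", ", ".join(right))
```

With five levels the same function yields 15 splits, so the search space roughly doubles with every additional level.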
Binary Target Shortcut
GYM Model Example
Variable Definitions
[Legend: predictors; not used; target]
Initial Runs
Pruning Multiple Nodes
Pruning Weaker Split First
Understanding Variable Importance
Competitors and Surrogates
The Utility of Surrogates
Competitors and Surrogates Are Different
[Figure: Split X (A B C, A B C); Split Y (A B C, A C B)]
Calculating Association
[Figure: Split X; Split Y; Default]
Notes on Association
Calculating Variable Importance
Mystery Solved
Penalizing Individual Variables
Penalizing Missing and Categorical Predictors
Splitting Rules - Introduction
Priors Adjusted Probabilities
GINI Splitting Rule
Illustration
GINI Properties
Example
ENTROPY Splitting Rule
Example
TWOING Splitting Rule
Example
Best Possible Split
Example
GINI Best Split: parent (A 40, B 30, C 20, D 10) splits into (A 40) and (B 30, C 20, D 10).
TWOING Best Split: parent (A 40, B 30, C 20, D 10) splits into (A 40, D 10) and (B 30, C 20).
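The two "best splits" in this example can be checked numerically. The slide's own formulas did not survive extraction, so the sketch below assumes the standard textbook definitions: Gini impurity 1 - sum(p_j^2) with improvement measured as parent impurity minus the weighted child impurity, and the twoing criterion (p_L * p_R / 4) * (sum_j |p(j|L) - p(j|R)|)^2. It only considers splits that keep each class whole on one side, and all function names are illustrative.

```python
from itertools import combinations

counts = {"A": 40, "B": 30, "C": 20, "D": 10}  # class counts from the slide
N = sum(counts.values())


def gini(class_counts):
    """Gini impurity of a node given its per-class counts."""
    class_counts = list(class_counts)
    n = sum(class_counts)
    return 1.0 - sum((c / n) ** 2 for c in class_counts)


def gini_improvement(left, right):
    """Parent impurity minus the size-weighted impurity of the two children."""
    n_l = sum(counts[k] for k in left)
    n_r = sum(counts[k] for k in right)
    children = (n_l / N) * gini(counts[k] for k in left) \
        + (n_r / N) * gini(counts[k] for k in right)
    return gini(counts.values()) - children


def twoing(left, right):
    """Twoing criterion for a split that sends whole classes left or right."""
    n_l = sum(counts[k] for k in left)
    n_r = sum(counts[k] for k in right)
    diff = 0.0
    for k in counts:
        p_left = counts[k] / n_l if k in left else 0.0
        p_right = counts[k] / n_r if k in right else 0.0
        diff += abs(p_left - p_right)
    return (n_l / N) * (n_r / N) / 4.0 * diff ** 2


def class_partitions(classes):
    """All ways to send whole classes left/right; the first class is pinned left."""
    classes = list(classes)
    first, rest = classes[0], classes[1:]
    for size in range(len(rest)):  # never take all of rest: right must be non-empty
        for combo in combinations(rest, size):
            left = (first,) + combo
            right = tuple(k for k in classes if k not in left)
            yield left, right


partitions = list(class_partitions(counts))
best_gini = max(partitions, key=lambda p: gini_improvement(*p))
best_twoing = max(partitions, key=lambda p: twoing(*p))
print("GINI best split:  ", best_gini)
print("TWOING best split:", best_twoing)
```

Under these definitions Gini prefers peeling off the largest class (A) into a pure node, while twoing prefers the split that divides the data into two equal-sized halves (A and D versus B and C), which agrees with the two diagrams on the slide.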
Symmetric GINI
Ordered TWOING
Class Probability Rule
Example
Favor Even Splits Control
Fundamental Tree Building Mechanisms
Key Players
[Diagram: Population; Dataset; Analyst; Priors π_i; Costs C_{i→j}; DM Engine; Model]
Fraud Example
Fraud Example – PRIORS EQUAL (0.5, 0.5)
Fraud Example – PRIORS SPECIFY (0.9, 0.1)
Fraud Example – PRIORS DATA (0.98, 0.02)
Internal Class Assignment Rule
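The body of this slide was lost in extraction, so the following is only a sketch of the rule as it is usually stated for CART with unit misclassification costs: a node is assigned to the class i that maximizes prior_i * N_i(node) / N_i, where N_i is the number of class-i cases in the whole learn sample. The example reuses the counts from the "Typical CART Solution" tree; the function names and the choice of equal priors are assumptions made for illustration.

```python
# Class totals and terminal-node counts taken from the heart-attack tree slide.
CLASS_TOTALS = {"SURVIVE": 178, "DEAD": 37}
NODES = {
    "A": {"SURVIVE": 6, "DEAD": 14},
    "B": {"SURVIVE": 102, "DEAD": 2},
    "C": {"SURVIVE": 14, "DEAD": 14},
    "D": {"SURVIVE": 56, "DEAD": 7},
}


def node_scores(node_counts, priors):
    """Prior-adjusted score per class: prior * (class cases in node / class cases overall)."""
    return {c: priors[c] * node_counts[c] / CLASS_TOTALS[c] for c in CLASS_TOTALS}


def assign(node_counts, priors):
    """Assign the node to the class with the largest prior-adjusted score."""
    scores = node_scores(node_counts, priors)
    return max(scores, key=scores.get)


PRIORS_EQUAL = {"SURVIVE": 0.5, "DEAD": 0.5}
n_total = sum(CLASS_TOTALS.values())
PRIORS_DATA = {c: CLASS_TOTALS[c] / n_total for c in CLASS_TOTALS}

for name, node_counts in NODES.items():
    print(name, assign(node_counts, PRIORS_EQUAL), node_scores(node_counts, PRIORS_DATA))
```

With equal priors the rule reproduces all four node labels on the tree slide, including Terminal Node C, which holds 14 survivors and 14 deaths yet is labeled DEAD: 14 of 37 deaths is a far larger share of its class than 14 of 178 survivors. With priors taken from the data the rule reduces to a simple majority vote, and Node C becomes an exact tie.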
Putting It All Together
The Impact of Priors
[Figure panels: π_r = 0.1, π_b = 0.9; π_r = 0.9, π_b = 0.1; π_r = 0.5, π_b = 0.5]
Model Evaluation
[Diagram: Population; Dataset; Analyst; Priors π_i; Costs C_{i→j}; Model; PS Table Cells n_{i→j}; Evaluator; Expected Cost]
Estimating Expected Cost
PRIORS DATA – Relative Cost
PRIORS EQUAL – Relative Cost
Using Equivalent Costs
Automated CART Runs
More Batteries
Battery MCT
Battery MVI
Battery PRIOR
Battery SHAVING
Battery SAMPLE
Battery TARGET
Recommended Reading

Introduction to CART Decision Tree Analysis for Data Mining and Predictive Modeling
