 Data-Applied.com: Decision
Introduction
  • Decision trees let you construct decision models.
  • They can be used for forecasting, classification, or decision making.
  • At each branch, the data is split based on a particular field of the data.
  • Decision trees are constructed using divide-and-conquer techniques.
Divide-and-Conquer: Constructing Decision Trees
Steps to construct a decision tree recursively:
  • Select an attribute to place at the root node, and make one branch for each possible value.
  • Repeat the process recursively at each branch, using only those instances that reach the branch.
  • If at any time all instances at a node have the same classification, stop developing that part of the tree.
Problem: how to decide which attribute to split on.
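The recursive steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `build_tree`, the nested-dict tree representation, the `choose_attribute` callback, the majority-class fallback, and the toy data are all made up here and do not come from the slides. The question of which attribute to split on is deliberately left to the callback.

```python
from collections import Counter

def build_tree(instances, attributes, choose_attribute):
    """Recursively build a decision tree as nested dicts.

    instances: list of (features_dict, class_label) pairs
    attributes: names of attributes still available for splitting
    choose_attribute: hypothetical callback that picks the split attribute
    """
    labels = [label for _, label in instances]
    # Stop when all instances at this node have the same classification.
    if len(set(labels)) == 1:
        return labels[0]
    # Assumption (not in the slides): with no attributes left,
    # fall back to the majority class.
    if not attributes:
        return Counter(labels).most_common(1)[0][0]

    best = choose_attribute(instances, attributes)
    remaining = [a for a in attributes if a != best]
    tree = {best: {}}
    # One branch per value of the chosen attribute seen at this node.
    for value in sorted({features[best] for features, _ in instances}):
        # Only the instances that reach this branch are passed down.
        subset = [(f, c) for f, c in instances if f[best] == value]
        tree[best][value] = build_tree(subset, remaining, choose_attribute)
    return tree

# Toy data, invented purely for illustration.
toy = [
    ({"outlook": "sunny", "windy": "false"}, "no"),
    ({"outlook": "sunny", "windy": "true"}, "no"),
    ({"outlook": "overcast", "windy": "false"}, "yes"),
    ({"outlook": "rainy", "windy": "false"}, "yes"),
    ({"outlook": "rainy", "windy": "true"}, "no"),
]

def first_attribute(instances, attributes):
    # Placeholder choice: always take the first remaining attribute.
    return attributes[0]

tree = build_tree(toy, ["outlook", "windy"], first_attribute)
print(tree)
```

Passing the selection rule in as a function keeps the divide-and-conquer skeleton separate from the choice of splitting criterion, so a placeholder rule can later be swapped for one based on information gain.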
Steps to find the attribute to split on:
  • Consider every available attribute as a candidate, and branch on each of its possible values.
  • For each candidate, calculate the information of every branch, then the information gain of the split as a whole.
  • Split on the attribute that gives the maximum information gain.
  • Repeat until each branch ends in a node with information = 0, that is, a node whose instances all share one class.
Calculation of information and gain:
  • For class proportions $(p_1, p_2, \dots, p_n)$ such that $p_1 + p_2 + \dots + p_n = 1$:
    $\text{info}(p_1, p_2, \dots, p_n) = -p_1 \log_2 p_1 - p_2 \log_2 p_2 - \dots - p_n \log_2 p_n$
  • Gain = information before division - information after division, where the information after division is the average over the branches, weighted by the number of instances in each branch.
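As a minimal sketch, the two formulas can be written as small Python functions. The names `info`, `info_after_split`, and `gain` are chosen here for illustration. The functions take class counts rather than proportions and normalize them internally, and they use base-2 logarithms so that results are in bits.

```python
import math

def info(counts):
    """Information (entropy) in bits of a list of class counts."""
    total = sum(counts)
    # Zero counts are skipped, following the convention 0 * log(0) = 0.
    return sum(-(c / total) * math.log2(c / total) for c in counts if c > 0)

def info_after_split(branches):
    """Weighted average information over the branches of a split.

    branches: one list of class counts per branch.
    """
    total = sum(sum(b) for b in branches)
    return sum(sum(b) / total * info(b) for b in branches)

def gain(before, branches):
    """Information before division minus information after division."""
    return info(before) - info_after_split(branches)

print(info([1, 1]))                      # two equally likely classes: 1.0 bit
print(info([4, 0]))                      # a pure node: 0.0 bits
print(gain([2, 2], [[2, 0], [0, 2]]))    # a perfect split recovers the full 1.0 bit
```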
Example:
  • Each attribute is considered individually.
  • Each is divided into branches according to its possible values.
  • Below each branch, the number of instances of each class is marked.
Calculations:
  • Using the formula for information, initially we have:
    Number of instances with class = Yes: 9
    Number of instances with class = No: 5
  • So $p_1 = 9/14$ and $p_2 = 5/14$:
    $\text{info}(9/14, 5/14) = -\frac{9}{14}\log_2\frac{9}{14} - \frac{5}{14}\log_2\frac{5}{14} = 0.940$ bits
  • Now, for example, consider the Outlook attribute, which splits the 14 instances into three branches with (Yes, No) counts [2,3], [4,0], and [3,2].
Example, continued:
  • Gain from using Outlook for division:
    gain(outlook) = info([9,5]) - info([2,3],[4,0],[3,2]) = 0.940 - 0.693 = 0.247 bits
  • Gain for each attribute:
    gain(outlook) = 0.247 bits
    gain(temperature) = 0.029 bits
    gain(humidity) = 0.152 bits
    gain(windy) = 0.048 bits
  • Since Outlook gives the maximum gain, we use it for division.
  • We then repeat the steps for Outlook = Sunny and Outlook = Rainy, and stop for Outlook = Overcast, since its information is 0.
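The gains in this example can be checked with a short script. The Outlook counts are the ones given in the slides. The per-branch (Yes, No) counts for temperature, humidity, and windy are an assumption: the slide figure is not reproduced in this transcript, so they are taken from the standard 14-instance weather dataset that these numbers match. The function and variable names are illustrative.

```python
import math

def info(counts):
    """Information (entropy) in bits of a list of class counts."""
    total = sum(counts)
    return sum(-(c / total) * math.log2(c / total) for c in counts if c > 0)

def gain(before, branches):
    """Information before the split minus the weighted information after it."""
    total = sum(before)
    after = sum(sum(b) / total * info(b) for b in branches)
    return info(before) - after

before = [9, 5]  # 9 Yes, 5 No

# (Yes, No) counts per branch. Only the outlook counts appear in the slides;
# the other three are assumed from the standard weather dataset.
splits = {
    "outlook": [[2, 3], [4, 0], [3, 2]],
    "temperature": [[2, 2], [4, 2], [3, 1]],
    "humidity": [[3, 4], [6, 1]],
    "windy": [[6, 2], [3, 3]],
}

gains = {name: gain(before, branches) for name, branches in splits.items()}
for name, value in gains.items():
    print(f"gain({name}) = {value:.3f} bits")

best = max(gains, key=gains.get)
print("split on:", best)  # outlook has the largest gain
```

Rounded to three decimals, the script reproduces the four gains listed in the slide and selects Outlook.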
Highly branching attributes: the problem
  • The method described so far tends to favor attributes with a large number of branches.
  • In the extreme case, it favors an attribute that has a different value for each instance, such as an identification code.
Highly branching attributes: the problem
  • The information after splitting on such an attribute is 0, because every branch holds a single instance:
    info([0,1]) + info([0,1]) + info([0,1]) + ... + info([0,1]) = 0
  • It therefore has the maximum gain and is chosen for branching.
  • But such an attribute is no good for predicting the class of an unknown instance, nor does it tell us anything about the structure of the decision.
  • So we use the gain ratio to compensate for this.
Highly branching attributes: gain ratio
  • Gain ratio = gain / split info
  • To calculate the split info, consider only the number of instances covered by each attribute value, irrespective of class.
  • For the identification code, with 14 different values:
    $\text{info}([1,1,\dots,1]) = -\frac{1}{14}\log_2\frac{1}{14} \times 14 = 3.807$ bits
  • For Outlook, the split info is:
    $\text{info}([5,4,5]) = -\frac{5}{14}\log_2\frac{5}{14} - \frac{4}{14}\log_2\frac{4}{14} - \frac{5}{14}\log_2\frac{5}{14} = 1.577$ bits
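A minimal sketch of the gain ratio calculation follows. The function names are illustrative. The identification code is modelled as 14 single-instance branches, 9 of class Yes and 5 of class No, which is an assumption consistent with the class totals used earlier.

```python
import math

def info(counts):
    """Information (entropy) in bits of a list of counts."""
    total = sum(counts)
    return sum(-(c / total) * math.log2(c / total) for c in counts if c > 0)

def gain(before, branches):
    """Information before the split minus the weighted information after it."""
    total = sum(before)
    after = sum(sum(b) / total * info(b) for b in branches)
    return info(before) - after

def split_info(branches):
    """Information of the branch sizes alone, ignoring class."""
    return info([sum(b) for b in branches])

def gain_ratio(before, branches):
    # Note: split_info is 0 for a single-branch split, which is not handled here.
    return gain(before, branches) / split_info(branches)

before = [9, 5]
outlook = [[2, 3], [4, 0], [3, 2]]
# Assumed layout: one branch per instance, 9 Yes and 5 No.
id_code = [[1, 0]] * 9 + [[0, 1]] * 5

print(f"split info(id code) = {split_info(id_code):.3f}")   # 3.807
print(f"split info(outlook) = {split_info(outlook):.3f}")   # 1.577
print(f"gain ratio(id code) = {gain_ratio(before, id_code):.3f}")
print(f"gain ratio(outlook) = {gain_ratio(before, outlook):.3f}")
```

With these inputs the identification code's gain of 0.940 bits is divided by a split info of 3.807, which shrinks its score to about 0.247, while Outlook's ratio comes out at about 0.156. The large advantage that plain gain gives to the highly branching attribute is therefore much reduced.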
Decision using Data Applied’s web interface
Step 1: Selection of data
Step 2: Selecting Decision
Step 3: Result
Visit more self-help tutorials

  • The tutorials section is free and self-guided, and does not involve any additional support.
  • Visit us at www.dataminingtools.net