ID3, C4.5 Algorithms

ID3 and C4.5 are algorithms used to generate decision trees, developed by Ross Quinlan and typically used in the machine learning and natural language processing domains. An overview of these algorithms with illustrated examples.


  1. By: Abdelfattah Al Zaqqa, PSUT, Amman, Jordan
  2. Agenda: Introduction, AQ, ID3, C4.5, ILA
  3. Introduction - Machine Learning: Machine learning is a branch of artificial intelligence concerned with the construction and study of systems that can learn from data. Machine learning vs. data mining: machine learning focuses on prediction, based on known properties learned from the training data, while data mining focuses on the discovery of previously unknown properties in the data. The two fields overlap.
  4. What is a Decision Tree? A decision tree is a tree in which each branch node represents a choice between a number of alternatives, and each leaf node represents a decision. Root node: an attribute. Edges: attribute values. Leaf node: the output, class, or decision.
  5. Introduction: ID3 (Iterative Dichotomiser 3) is an algorithm invented by Ross Quinlan, used to generate a decision tree from a dataset using Shannon entropy. It is typically used in the machine learning and natural language processing domains.
  6. ID3 basics: ID3 employs top-down induction of decision trees (a greedy algorithm). Attribute selection is the fundamental step in constructing a decision tree: select which attribute becomes the next node of the tree, and so on. Two quantities, entropy and information gain, are used to perform the attribute selection.
  7. Entropy: Entropy H(S) is a measure of the amount of uncertainty in the (data) set S. The more uniform the class distribution, the higher the entropy, and the more information we can gain by splitting the set.
  8. Entropy: for a set S partitioned into positive and negative examples:
     Entropy(S) = -P(positive) log2 P(positive) - P(negative) log2 P(negative)
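
As a minimal sketch (not part of the original slides), the two-class entropy can be computed in Python from class counts; the 9-positive / 5-negative split used later on slide 13 reproduces the 0.940 value:

```python
import math

def entropy(pos: int, neg: int) -> float:
    """Two-class entropy of a set with pos positive and neg negative examples."""
    total = pos + neg
    h = 0.0
    for count in (pos, neg):
        if count:                      # treat 0 * log2(0) as 0
            p = count / total
            h -= p * math.log2(p)
    return h

print(round(entropy(9, 5), 3))         # 0.94, matching Entropy(S) on slide 13
```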
  9. Information Gain: the measure of the difference in entropy from before to after the set S is split on an attribute A:
     Gain(S, A) = Entropy(S) - Σv (|Sv| / |S|) Entropy(Sv), summed over the values v of A.
 10. Example:
     Outlook   Temperature  Humidity  Wind    Play ball
     Sunny     Hot          High      Weak    No
     Sunny     Hot          High      Strong  No
     Overcast  Hot          High      Weak    Yes
     Rain      Mild         High      Weak    Yes
     Rain      Cool         Normal    Weak    Yes
     Rain      Cool         Normal    Strong  No
     Overcast  Cool         Normal    Strong  Yes
     Sunny     Mild         High      Weak    No
     Sunny     Cool         Normal    Weak    Yes
     Rain      Mild         Normal    Weak    Yes
     Sunny     Mild         Normal    Strong  Yes
     Overcast  Mild         High      Strong  Yes
     Overcast  Hot          Normal    Weak    Yes
     Rain      Mild         High      Strong  No
     Total: 14 records (9 Yes, 5 No).
 11. Example - Dataset Elements: Collection (S): all the records in the table above, taken together, are referred to as the collection S (14 records).
 12. Example - Dataset Elements: Attributes: Outlook, Temperature, Humidity, Wind. Class (C) or classifier: Play ball. Based on Outlook, Temperature, Humidity, and Wind, we need to decide whether we can play ball or not; that is why Play ball is the classifier used to make the decision.
 13. ID3 Algorithm:
     1. Compute Entropy(S) = -(9/14) log2(9/14) - (5/14) log2(5/14) = 0.940
     2. Compute the information gain for each attribute. For Wind: Weak = 8 records (6+, 2-), Strong = 6 records (3+, 3-).
        Entropy(Sweak) = -(6/8) log2(6/8) - (2/8) log2(2/8) = 0.811
        Entropy(Sstrong) = -(3/6) log2(3/6) - (3/6) log2(3/6) = 1
        Gain(S, Wind) = Entropy(S) - (8/14) Entropy(Sweak) - (6/14) Entropy(Sstrong)
                      = 0.940 - (8/14)(0.811) - (6/14)(1) = 0.048
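
A hedged Python sketch of the same step, assuming examples are given as (attribute value, class label) pairs; the Wind counts from this slide reproduce the 0.048 gain:

```python
import math
from collections import Counter, defaultdict

def entropy_of(labels) -> float:
    """Entropy of a list of class labels."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def information_gain(pairs) -> float:
    """Entropy reduction from splitting (value, label) pairs on the attribute."""
    by_value = defaultdict(list)
    for value, label in pairs:
        by_value[value].append(label)
    remainder = sum(len(sub) / len(pairs) * entropy_of(sub)
                    for sub in by_value.values())
    return entropy_of([label for _, label in pairs]) - remainder

# Wind: Weak = 8 records (6 Yes, 2 No), Strong = 6 records (3 Yes, 3 No)
wind = [("Weak", "Yes")] * 6 + [("Weak", "No")] * 2 + \
       [("Strong", "Yes")] * 3 + [("Strong", "No")] * 3
print(round(information_gain(wind), 3))   # 0.048
```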
 14. ID3 Algorithm:
     3. Select the attribute with the maximum information gain for splitting:
        Gain(S, Wind) = 0.048
        Gain(S, Humidity) = 0.151
        Gain(S, Temperature) = 0.029
        Gain(S, Outlook) = 0.246
        Outlook has the largest gain, so it becomes the root of the tree.
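
Reusing the information_gain sketch above on the full slide-10 table reproduces all four numbers and the argmax choice (an illustration that assumes that helper is in scope):

```python
# Rows of the slide-10 table: (Outlook, Temperature, Humidity, Wind, Play ball)
DATA = [
    ("Sunny", "Hot", "High", "Weak", "No"),     ("Sunny", "Hot", "High", "Strong", "No"),
    ("Overcast", "Hot", "High", "Weak", "Yes"), ("Rain", "Mild", "High", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Weak", "Yes"),  ("Rain", "Cool", "Normal", "Strong", "No"),
    ("Overcast", "Cool", "Normal", "Strong", "Yes"), ("Sunny", "Mild", "High", "Weak", "No"),
    ("Sunny", "Cool", "Normal", "Weak", "Yes"), ("Rain", "Mild", "Normal", "Weak", "Yes"),
    ("Sunny", "Mild", "Normal", "Strong", "Yes"), ("Overcast", "Mild", "High", "Strong", "Yes"),
    ("Overcast", "Hot", "Normal", "Weak", "Yes"), ("Rain", "Mild", "High", "Strong", "No"),
]
ATTRS = ["Outlook", "Temperature", "Humidity", "Wind"]

gains = {name: information_gain([(row[i], row[-1]) for row in DATA])
         for i, name in enumerate(ATTRS)}
print({k: round(v, 3) for k, v in gains.items()})
# {'Outlook': 0.246, 'Temperature': 0.029, 'Humidity': 0.151, 'Wind': 0.048}
print(max(gains, key=gains.get))          # Outlook -- the root of the tree
```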
 15. ID3 Algorithm:
     4. Apply ID3 to each child node of this root, until a leaf node or a node with entropy = 0 is reached.
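
Putting steps 1-4 together, a minimal recursive sketch (an illustration under the assumptions above, not Quinlan's original implementation; it reuses information_gain and DATA from the previous sketches):

```python
from collections import Counter

def id3(rows, attrs):
    """rows: tuples whose last element is the class; attrs: list of (column, name)."""
    labels = [row[-1] for row in rows]
    if len(set(labels)) == 1:                 # entropy = 0: make a leaf
        return labels[0]
    if not attrs:                             # no attributes left: majority class
        return Counter(labels).most_common(1)[0][0]
    # choose the attribute with maximum information gain
    col, name = max(attrs, key=lambda a: information_gain([(r[a[0]], r[-1]) for r in rows]))
    remaining = [a for a in attrs if a[0] != col]
    branches = {value: id3([r for r in rows if r[col] == value], remaining)
                for value in {r[col] for r in rows}}   # recurse on each child
    return (name, branches)

tree = id3(DATA, list(enumerate(ATTRS)))
print(tree[0])   # Outlook
```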
 16. C4.5: C4.5 is an extension of Quinlan's earlier ID3 algorithm. It adds: handling of both continuous and discrete attributes; handling of training data with missing attribute values; pruning of trees after creation.
 17. Continuous-valued attributes: the same play-ball table, but with Humidity as a continuous value (ranging from 0.49 to 0.93) rather than High/Normal (14 records).
 18. Continuous-valued attributes: to split on a numeric attribute such as Humidity:
     1. Sort the numeric attribute values.
     2. Identify adjacent examples that differ in their target classification to pick the threshold.
     For instance, in the sorted fragment 0.68 (Yes), 0.72 (Yes), 0.87 (No), 0.9 (No), 0.91 (No), the class changes between 0.72 and 0.87, so the candidate split is Humidity > (0.72 + 0.87) / 2, i.e. Humidity > 0.795.
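
A minimal sketch of candidate-threshold selection, assuming (value, label) pairs; it recovers the 0.795 midpoint from the fragment above:

```python
def candidate_thresholds(pairs):
    """Midpoints between adjacent sorted values whose class labels differ."""
    ordered = sorted(pairs)                    # sort by the numeric attribute value
    thresholds = []
    for (v1, y1), (v2, y2) in zip(ordered, ordered[1:]):
        if y1 != y2:                           # class changes between neighbours
            thresholds.append((v1 + v2) / 2)
    return thresholds

fragment = [(0.68, "Yes"), (0.72, "Yes"), (0.87, "No"), (0.9, "No"), (0.91, "No")]
print(candidate_thresholds(fragment))          # [0.795], matching the slide
```

Each candidate would then be scored like any other binary split, by the information gain of "attribute > threshold" versus "attribute <= threshold".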
 19. Continuous-valued attributes (figure on the original slide).
 20. Overfitting: "Under fitting", "Just right", "Over fitting" (panel titles of the figure). Overfitting: if we have too many attributes (features), the learned hypothesis may fit the training set very well but fail to generalize to new examples (predict the price on new examples).
 21. Overfitting (figure on the original slide).
 22. Overfitting (figure on the original slide).
 23. Why does overfitting happen? Presence of errors in the training examples (an issue in machine learning in general). When small numbers of examples are associated with a leaf node.
 24. Reducing overfitting: stop growing the tree earlier, before it reaches the point where it perfectly classifies the training data (difficult); or allow the tree to overfit the data, and then post-prune the tree.
 25. Rule post-pruning: one rule per root-to-leaf path:
     (Outlook = Sunny ∧ Humidity = Normal) → P
     (Outlook = Sunny ∧ Humidity = High) → N
     (Outlook = Overcast) → P
     (Outlook = Rain ∧ Wind = Strong) → N
     (Outlook = Rain ∧ Wind = Weak) → P
 26. Rule post-pruning - prune preconditions:
     Outlook  Temp  Humidity  Wind    Tennis
     Rain     Low   High      Weak    No
     Rain     Hot   High      Strong  No
     (Outlook = Sunny ∧ Humidity = High) → N
     (Outlook = Sunny ∧ Humidity = Normal) → P
     (Outlook = Overcast) → P
     (Outlook = Rain ∧ Wind = Strong) → N
     (Outlook = Rain ∧ Wind = Weak) → P
 27. Rule post-pruning - prune preconditions: the Wind = Strong precondition is pruned from the fourth rule:
     Outlook  Temp  Humidity  Wind    Tennis
     Rain     Low   High      Weak    No
     Rain     Hot   High      Strong  No
     (Outlook = Sunny ∧ Humidity = High) → N
     (Outlook = Sunny ∧ Humidity = Normal) → P
     (Outlook = Overcast) → P
     (Outlook = Rain) → N
     (Outlook = Rain ∧ Wind = Weak) → P
     New instances:
     Outlook  Temp  Humidity  Wind  Tennis
     Sunny    Low   Low       Weak  Yes
     Rain     Hot   High      Weak  No
 28. Rule post-pruning:
     Validation set: save a portion of the data for validation (training set / validation set / test set).
     If s <= t, prune the subtree, where s is the validation-set performance with the subtree at the node and t is the validation-set performance with a leaf in place of the subtree.
     Rule post-pruning (Quinlan 1993): can remove smaller elements than whole subtrees; improved readability. Reduced-error pruning (Quinlan 1987) …
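
A rough sketch of the s <= t test, assuming trees in the (name, {value: subtree}) form produced by the id3 sketch above, with bare class labels as leaves:

```python
from collections import Counter

def predict(tree, row, attr_index):
    """Follow the (name, {value: subtree}) tree; leaves are plain class labels."""
    while isinstance(tree, tuple):
        name, branches = tree
        tree = branches.get(row[attr_index[name]])
        if tree is None:                  # unseen attribute value: no prediction
            return None
    return tree

def accuracy(tree, rows, attr_index):
    return sum(predict(tree, r, attr_index) == r[-1] for r in rows) / len(rows)

def maybe_prune(tree, validation, attr_index):
    """Replace the subtree with a majority-class leaf if that does not hurt validation accuracy."""
    leaf = Counter(r[-1] for r in validation).most_common(1)[0][0]
    s = accuracy(tree, validation, attr_index)   # performance with the subtree
    t = accuracy(leaf, validation, attr_index)   # performance with a leaf instead
    return leaf if s <= t else tree
```

Here attr_index maps attribute names to column positions, e.g. {name: i for i, name in enumerate(ATTRS)}; a full implementation would apply the same s <= t test bottom-up at every internal node, not just the root.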
 29. Missing information: example of missing information in mammography data ("?" marks a missing value):
     BI-RAD  Age  Shape  Margin  Density  Class
     4       48   4      5       ?        1
     5       67   3      5       3        1
     5       57   4      4       3        1
     5       60   ?      5       1        1
     4       53   ?      4       3        1
     4       28   1      1       3        0
     4       70   ?      2       3        0
     2       66   1      1       ?        0
     5       63   3      ?       3        0
     4       78   1      1       1        0
 30. Missing information - according to most common: fill in the data according to the most common value (given the class):
     BI-RAD  Age  Shape  Margin  Density  Class
     4       48   4      5       3        1
     5       67   3      5       3        1
     5       57   4      4       3        1
     5       60   4      5       1        1
     4       53   4      4       3        1
     4       28   1      1       3        0
     4       70   1      2       3        0
     2       66   1      1       3        0
     5       63   3      ?       3        0
     4       78   1      1       1        0
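
A minimal pandas sketch of per-class most-common filling (an illustration of the idea, not the slide's exact tool; only the Shape and Density columns of the slide-29 table are included):

```python
import pandas as pd

# Shape and Density from the slide-29 table; None marks a missing value
df = pd.DataFrame({
    "shape":   [4, 3, 4, None, None, 1, None, 1, 3, 1],
    "density": [None, 3, 3, 1, 3, 3, 3, None, 3, 1],
    "cls":     [1, 1, 1, 1, 1, 0, 0, 0, 0, 0],
})

filled = df.copy()
for col in ("shape", "density"):
    # most common value of this column within the same class
    filled[col] = df.groupby("cls")[col].transform(lambda s: s.fillna(s.mode().iloc[0]))
print(filled)   # shape -> 4 for class 1 and 1 for class 0; density -> 3, as on slide 30
```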
 31. Missing information - according to proportions: split each record with a missing value into fractional records, weighted by the proportions of the attribute values observed for the same class:
     Fraction  BI-RAD  Age  Shape  Margin  Density  Class
     0.75      4       48   4      5       3        1
     0.25      4       48   4      5       1        1
     1         5       67   3      5       3        1
     1         5       57   4      4       3        1
     0.66      5       60   4      5       1        1
     0.33      5       60   3      5       1        1
     0.66      4       53   4      4       3        1
     0.33      4       53   3      4       3        1
     1         4       28   1      1       3        0
     0.75      4       70   1      2       3        0
     0.25      4       70   3      2       3        0
     0.25      2       66   1      1       1        0
     0.75      2       66   1      1       3        0
     0.75      5       63   3      1       3        0
     0.25      5       63   3      2       3        0
     1         4       78   1      1       1        0
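
A hedged sketch of proportional filling for a single record, distributing its weight across the attribute values observed for the same class (the function name and dict representation are illustrative, not from the slides):

```python
from collections import Counter

def split_by_proportion(record: dict, attr: str, same_class_values) -> list:
    """Replace record[attr] = None with weighted copies, one per observed value."""
    counts = Counter(same_class_values)
    total = sum(counts.values())
    return [({**record, attr: value}, count / total)
            for value, count in counts.items()]

# Density values seen in the other class-1 records of slide 29: 3, 3, 1, 3
copies = split_by_proportion({"BI-RAD": 4, "Age": 48, "Density": None},
                             "Density", [3, 3, 1, 3])
print(copies)   # Density=3 with weight 0.75, Density=1 with weight 0.25, as on slide 31
```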
 32. Summary: ID3 and C4.5 are used to generate decision trees, were developed by Ross Quinlan, and are typically used in the machine learning and natural language processing domains. ID3 and C4.5 use the entropy of an attribute, and pick the attribute with the highest reduction in entropy (the highest information gain), to determine which attribute the data should be split on first; then, through a series of recursive calls that compute the entropy of each node, the process continues until all leaf nodes are pure.
