Machine Learning with ID3 Decision Trees in Ruby

Table of contents
 1 Introduction to ID3 algorithm
 2 Marketing target strategy example using ID3 Decision Trees in Ruby
 3 Better data loading
 4 A good tip for data evaluation
 5 More about ID3 and decision trees




                     Copyright © 2007 Sergio Fierens, Jade Ferret All rights reserved.

1. Introduction to ID3 algorithm
AI4R implements the ID3 algorithm (Quinlan) as one of its automatic classifiers. Given a set
of preclassified examples, it induces a decision tree top-down, choosing at each node the
attribute with the highest information gain (the largest reduction in entropy).
The good thing about this automatic learning method is that humans learn as well. Unlike
other AI techniques such as neural networks, this classifier can generate Ruby code made of
if / else statements. You can use the generated rules to evaluate parameters in real time,
copy and paste them into your code, or just read them to learn about your problem domain.
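
For reference, the two measures are usually defined as below. This is standard textbook notation rather than anything taken from AI4R: S is the set of training examples, p_c is the fraction of S that belongs to class c, and S_v is the subset of S whose attribute A has the value v.

```latex
H(S) = -\sum_{c \in C} p_c \log_2 p_c
\qquad
\mathrm{Gain}(S, A) = H(S) - \sum_{v \in \mathrm{values}(A)} \frac{|S_v|}{|S|} \, H(S_v)
```

A subset whose examples all share one class has entropy 0, so an attribute that splits the examples into pure subsets gets the highest possible gain.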

2. Marketing target strategy example using ID3 Decision Trees in Ruby
Let's suppose that you are writing an application that must decide whether or not a person is a
relevant marketing target. The only information that you have is a collection of examples,
provided by a marketing survey:

  DATA_LABELS = [ 'city', 'age_range', 'gender', 'marketing_target' ]
  DATA_SET = [
           ['New York',        '<30',          'M',    'Y'],
           ['Chicago',         '<30',          'M',    'Y'],
           ['Chicago',         '<30',          'F',    'Y'],
           ['New York',        '<30',          'M',    'Y'],
           ['New York',        '<30',          'M',    'Y'],
           ['Chicago',         '[30-50)',      'M',    'Y'],
           ['New York',        '[30-50)',      'F',    'N'],
           ['Chicago',         '[30-50)',      'F',    'Y'],
           ['New York',        '[30-50)',      'F',    'N'],
           ['Chicago',         '[50-80]',      'M',    'N'],
           ['New York',        '[50-80]',      'F',    'N'],
           ['New York',        '[50-80]',      'M',    'N'],
           ['Chicago',         '[50-80]',      'M',    'N'],
           ['New York',        '[50-80]',      'F',    'N'],
           ['Chicago',         '>80',          'F',    'Y']
         ]

You can create an ID3 decision tree to do the dirty work for you:

  id3 = ID3.new(DATA_SET, DATA_LABELS)

The decision tree automatically creates the "rules" to classify new data and identify new
possible marketing targets:



  id3.get_rules
    # => if age_range=='<30' then marketing_target='Y'
    #    elsif age_range=='[30-50)' and city=='Chicago' then marketing_target='Y'
    #    elsif age_range=='[30-50)' and city=='New York' then marketing_target='N'
    #    elsif age_range=='[50-80]' then marketing_target='N'
    #    elsif age_range=='>80' then marketing_target='Y'
    #    else raise 'There was not enough information during training to do a proper induction for this data element' end

  id3.eval(['New York', '<30', 'M'])
    # => 'Y'
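
To see where rules like these come from, here is a minimal sketch of top-down induction in plain Ruby. It is not AI4R's implementation: the helper names (entropy, information_gain, build_tree, classify) and the nested-array tree layout are assumptions made for illustration. At each node it splits on the attribute with the highest information gain and stops when every example in a branch shares one class.

```ruby
# Survey data from the example: city, age_range, gender, marketing_target.
DATA_SET = [
  ['New York', '<30',     'M', 'Y'], ['Chicago',  '<30',     'M', 'Y'],
  ['Chicago',  '<30',     'F', 'Y'], ['New York', '<30',     'M', 'Y'],
  ['New York', '<30',     'M', 'Y'], ['Chicago',  '[30-50)', 'M', 'Y'],
  ['New York', '[30-50)', 'F', 'N'], ['Chicago',  '[30-50)', 'F', 'Y'],
  ['New York', '[30-50)', 'F', 'N'], ['Chicago',  '[50-80]', 'M', 'N'],
  ['New York', '[50-80]', 'F', 'N'], ['New York', '[50-80]', 'M', 'N'],
  ['Chicago',  '[50-80]', 'M', 'N'], ['New York', '[50-80]', 'F', 'N'],
  ['Chicago',  '>80',     'F', 'Y']
]

# Shannon entropy (in bits) of a list of class labels.
def entropy(labels)
  labels.group_by(&:itself).values.sum do |group|
    prob = group.size.to_f / labels.size
    -prob * Math.log2(prob)
  end
end

# Entropy of the whole set minus the weighted entropy of the subsets
# obtained by splitting on the attribute at the given column index.
def information_gain(rows, index)
  remainder = rows.group_by { |row| row[index] }.values.sum do |subset|
    (subset.size.to_f / rows.size) * entropy(subset.map(&:last))
  end
  entropy(rows.map(&:last)) - remainder
end

# Returns a class label (leaf) or [attribute_index, { value => subtree }].
def build_tree(rows, attribute_indexes)
  targets = rows.map(&:last)
  return targets.first if targets.uniq.size == 1
  if attribute_indexes.empty?
    # No attributes left to split on: fall back to the most common class.
    return targets.group_by(&:itself).max_by { |_, group| group.size }.first
  end

  best = attribute_indexes.max_by { |index| information_gain(rows, index) }
  branches = rows.group_by { |row| row[best] }.transform_values do |subset|
    build_tree(subset, attribute_indexes - [best])
  end
  [best, branches]
end

# Follows the branches that match the row until a leaf label is reached.
def classify(tree, row)
  until tree.is_a?(String)
    index, branches = tree
    tree = branches.fetch(row[index]) { raise 'No branch for this value' }
  end
  tree
end

tree = build_tree(DATA_SET, [0, 1, 2])
puts classify(tree, ['New York', '<30', 'M'])
```

On this data the root split is age_range (gain of about 0.73 bits, against less than 0.1 for city or gender), and only the '[30-50)' branch needs a second split, on city. That matches the shape of the rules printed by get_rules.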


3. Better data loading
In real life you will use many more training examples, with more attributes. Consider
moving your data to an external CSV (comma-separated values) file.

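  require 'csv'

  # CSV::Reader exists only in Ruby 1.8; CSV.foreach works on Ruby 1.9 and later.
  data_set = []
  CSV.foreach("#{File.dirname(__FILE__)}/data_set.csv") do |row|
    data_set << row
  end
  # The first row of the file holds the attribute labels.
  data_labels = data_set.shift
  id3 = ID3.new(data_set, data_labels)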
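
As a rough illustration of the expected file layout, the sketch below parses CSV text held in a string instead of a file. The sample content and the load_examples helper are assumptions made for this example; the only library call used is CSV.parse from Ruby's csv library.

```ruby
require 'csv'

# Hypothetical content of data_set.csv: the first row holds the labels.
CSV_TEXT = <<~CSV_SAMPLE
  city,age_range,gender,marketing_target
  New York,<30,M,Y
  Chicago,[30-50),F,Y
  New York,[50-80],F,N
CSV_SAMPLE

# Splits parsed CSV rows into the training examples and the label row.
def load_examples(csv_text)
  rows = CSV.parse(csv_text)
  labels = rows.shift
  [rows, labels]
end

data_set, data_labels = load_examples(CSV_TEXT)
puts data_labels.inspect
puts data_set.size
```

Every parsed field is a String, so the values passed to the classifier later should be strings spelled exactly as they appear in the file.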

4. A good tip for data evaluation
The ID3 class provides a method to evaluate new data.

  id3.eval(['New York', '<30', 'M'])
    # => 'Y'

But instead of going through the tree every time, you can take advantage of the fact that the
method "get_rules" generates valid Ruby code!

  id3 = ID3.new(DATA_SET, DATA_LABELS)
  age_range = '<30'
  city = 'New York'
  gender = 'M'
  marketing_target = nil
  eval id3.get_rules
  puts marketing_target
    # prints Y
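
Note that marketing_target is assigned before the eval call: on Ruby 1.9 and later, a local variable first created inside eval is not visible to the code that follows it. One possible variation, sketched below, wraps the rule text in a lambda so that it is parsed once and can then be called for every record. The compile_rules helper is hypothetical, and RULES is a copy of the get_rules output shown earlier, used as a stand-in so the example runs without AI4R.

```ruby
# Attribute names in survey order; the last one is the target.
DATA_LABELS = %w[city age_range gender marketing_target].freeze

# Stand-in for the string returned by id3.get_rules.
RULES = <<~RULE_TEXT
  if age_range=='<30' then marketing_target='Y'
  elsif age_range=='[30-50)' and city=='Chicago' then marketing_target='Y'
  elsif age_range=='[30-50)' and city=='New York' then marketing_target='N'
  elsif age_range=='[50-80]' then marketing_target='N'
  elsif age_range=='>80' then marketing_target='Y'
  else raise 'There was not enough information during training to do a proper induction for this data element' end
RULE_TEXT

# Hypothetical helper: builds the source of a lambda that takes one argument
# per attribute, runs the rules, and returns the value of the target variable.
def compile_rules(rules, labels)
  *attributes, target = labels
  source = <<~RUBY
    lambda do |#{attributes.join(', ')}|
      #{target} = nil
      #{rules}
      #{target}
    end
  RUBY
  eval(source)
end

classifier = compile_rules(RULES, DATA_LABELS)
puts classifier.call('New York', '<30', 'M')
```

Because eval runs whatever text it is given, this only makes sense when the training data, and therefore the generated rules, come from a source you trust.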

5. More about ID3 and decision trees
Wikipedia article on Decision trees
Wikipedia article on ID3 Algorithm



