SlideShare una empresa de Scribd logo
1 de 34
Descargar para leer sin conexión
Machine Learning (CS 567)
                   Fall 2008

          Time: T-Th, 5:00pm - 6:20pm
                Location: GFS118

 Instructor: Sofus A. Macskassy (macskass@usc.edu)
                  Office: SAL 216
          Office hours: by appointment

          Teaching assistant: Cheol Han
                Office hours: TBA


                 Class web page:
     http://www-scf.usc.edu/~csci567/index.html
Course Goals

•   Introduce the background of machine learning
•   Show core ML techniques
•   Show proper experimental ML methodologies
•   Prepare you to be able to use ML tools, contribute
    to the field

Not: Find out details of the state of the art…



Fall 2008                  2       CS 567 Lecture 1 - Sofus A. Macskassy
Course Overview
•   Introduction and Basics
     – Basic problems and questions in Machine Learning. Example applications.
•   Linear Classifiers
     – Perceptron, Logistic Regression, Linear Discriminant Analysis (LDA)
•   Five Popular Algorithms
     –      Decision Trees (C4.5)
     –      Neural Networks (backpropagation)
     –      Probabilistic networks (Naïve bayes)
     –      Nearest Neighbor (k-NN)
     –      Support Vector Machines
•   Theories of Learning
     – PAC, Bayesian, Bias-Variance Analysis
•   Optimizing Test Set Performance
     – Overfitting, Penalty Methods, Holdout Methods, Ensembles
•   Evaluation and Methodologies
     – Statistical tests, comparison of models and algorithms, confidence bounds


Fall 2008                                      3          CS 567 Lecture 1 - Sofus A. Macskassy
Administrative issues
• Class materials
     – Introduction to Machine Learning, Ethem
       Alpaydin, 2004, MIT Press
     – Class notes are meant to complement readings
• Mailing List
     – Please send me (macskass@usc.edu) mail with
       the following subject line:
            cs567 student



Fall 2008                   4     CS 567 Lecture 1 - Sofus A. Macskassy
Administrative issues
• Prerequisites
     – Basic linear algebra and calculus
     – Some knowledge of probabilities and statistics
     – Some AI background is recommended but not
            required
     – The Weka machine learning toolkit will be used
       in assignments. Prior knowledge not required




Fall 2008                  5        CS 567 Lecture 1 - Sofus A. Macskassy
Administrative
• Please stop me if I talk too fast
• Please stop me if you have any questions
• Any feedback is better than no feedback
     – Sooner is better
     – End of the semester is too late to make it easier
       on you




Fall 2008                  6        CS 567 Lecture 1 - Sofus A. Macskassy
Evaluation
•   Weekly quizzes                               10%
•   5 homework assignments                       20%
•   Midterm                                      20%
•   Final Exam (not cumulative)                  20%
•   Class Project                                30%
     – Groups of 1-4 people
     – Research paper to be written
     – Presentations at the last two classes
     – Details later in the course
Fall 2008                  7        CS 567 Lecture 1 - Sofus A. Macskassy
Evaluation
• Grades
• Based on absolute scores
            •A     90.0%
            • A-   87.5%
            • B+   85.0%
            •B     80.0%
            • B-   77.5%
            • C+   75.0%
            •C     70.0%



Fall 2007                      8    CS 567 Lecture 2 - Sofus A. Macskassy
Lecture 1 Outline
•   The what and why of machine learning
•   Why study machine learning
•   Types of machine learning
•   Supervised learning defined




Fall 2008             9      CS 567 Lecture 1 - Sofus A. Macskassy
What is learning?
• H.Simon: Any process by which a system improves
  its performance
• M.Minsky: Learning is making useful changes in
  our minds
• Michalsky: Learning is constructing or modifying
  representations of what is being experiences
• Valiant: Learning is the process of knowledge
  acquisition in the absence of explicit programming

• Generally: A process by which a computer
  algorithm finds an approximate solution to a
  problem

Fall 2008               10       CS 567 Lecture 1 - Sofus A. Macskassy
What is The Learning Problem?
Learning = Improving with experience at some task
• Improve over task T,
• with respect to performance measure P,
• based on experience E.
E.g., Learn to play checkers
• T : Play checkers
• P : % of games won in world tournament
• E : opportunity to play against self



Fall 2008              11       CS 567 Lecture 1 - Sofus A. Macskassy
Compare to Human Learning
• Psychologists use ―learning‖ somewhat
  differently.
• Typically the task changes, whereas in
  machine learning, typically we assume the
  task is stationary.
• Seems like a pretty fundamental difference!




Fall 2008            12      CS 567 Lecture 1 - Sofus A. Macskassy
Why study machine learning?
• Easier to build a learning system than to hand-
  code a working program. E.g.:
     – Robot that learns a map of the environment by
       wandering around it
     – Programs that learn o play games by playing against
       themselves or others
• Improving on existing programs, e.g.
     – Instruction scheduling and register allocation in
       compilers
     – Combinatorial optimization problems
• Discover knowledge and patterns in highly
  dimensional, complex data

Fall 2008                      13         CS 567 Lecture 1 - Sofus A. Macskassy
Why study machine learning?
• Solving tasks that require a system to be adaptive,
  e.g.
     – Speech and handwriting recognition
     – ―Intelligent‖ user interfaces
• Understanding animal and human learning
     – How do we learn language?
     – How do we recognize faces?
• Creating ―real‖ AI!
     – An expert system—brilliantly designed, engineered and
       implemented—cannot learn not to repeat its mistakes.


Fall 2008                    14        CS 567 Lecture 1 - Sofus A. Macskassy
Time is right to study machine learning

•   Recent progress in algorithms and theory
•   Growing flood of online data
•   Computational power is available
•   Growing industry demand




Fall 2008              15     CS 567 Lecture 1 - Sofus A. Macskassy
Three example ML tasks
• Data mining: using historical data to
  improve decisions
     – medical records  medical knowledge
• Software applications we can't program by
  hand
     – autonomous driving
     – speech recognition
• Self customizing programs
     – Newsreader that learns user interests


Fall 2008                   16     CS 567 Lecture 1 - Sofus A. Macskassy
Typical Learning Task
Given:
• 9714 patient records, each describing a pregnancy
  and birth
• Each patient record contains 215 features
Learn to predict:
• Classes of future patients at high risk for
  Emergency Cesarean Section




Fall 2008               17      CS 567 Lecture 1 - Sofus A. Macskassy
Patient Data


Patient103 time=1            Patient103 time=2            Patient103 time=n
Age: 23                      Age: 23                 …    Age: 23
FirstPregnancy: no           FirstPregnancy: no           FirstPregnancy: no
Anemia: no                   Anemia: no                   Anemia: no
Diabetes: no                 Diabetes: YES                Diabetes: no
PreviousPrematureBirth: no   PreviousPrematureBirth: no   PreviousPrematureBirth: no
Ultrasound: ?                Ultrasound: abnormal         Ultrasound: ?
Elective C-Section: ?        Elective C-Section: no       Elective C-Section: no
Emergency C-Section: ?       Emergency C-Section: ?       Emergency C-Section: YES
…                            …                            …




Fall 2008                               18            CS 567 Lecture 1 - Sofus A. Macskassy
Datamining Result

One of 18 learned rules:
     If
        No previous vaginal delivery, and
        Abnormal 2nd Trimester Ultrasound, and
        Malpresentation at admission
     Then
        Probability of Emergency C-Section is 0.6


• Accuracy in training data: 26/41 = .63,
• Accuracy in test data: 12/20 = .60



Fall 2008                          19          CS 567 Lecture 1 - Sofus A. Macskassy
Too Difficult to Program
Problems too difficult to program by hand
ALVINN [Pomerleau] drives 70mph on highways




Fall 2008             20      CS 567 Lecture 1 - Sofus A. Macskassy
Software Customizes to User
http://www.wisewire.com
April 30, 1998
Lycos Acquires WiseWire
By internetnews.com Staff
Lycos, Inc. today announced the
   acquisition of targeted-content
   provider WiseWire Corp. for around
   $39.75 million in Lycos shares.
Under the acquisition, the three-year
   old, Pittsburgh-based WiseWire's
   automated directory will be
   incorporated into Lycos’ search
   page. Lycos said it will be the first
   online service using the technology
   which is based on user input and an
   intelligent agent product.

Fall 2008                          21      CS 567 Lecture 1 - Sofus A. Macskassy
Where Is This Headed?
Today: tip of the iceberg
• First-generation algorithms: neural nets, decision
  trees, regression ...
• Applied to well-formatted database
• Budding industry




Fall 2008                22       CS 567 Lecture 1 - Sofus A. Macskassy
Looking to Tomorrow
Opportunity for tomorrow: enormous impact
• Learn across full mixed-media data
• Learn across multiple internal databases, plus the
  web and newsfeeds
• Learn by active experimentation
• Learn decisions rather than predictions
• Cumulative, lifelong learning
• Programming languages with learning embedded?



Fall 2008               23       CS 567 Lecture 1 - Sofus A. Macskassy
Relevant Disciplines
•   Artificial intelligence
•   Cognitive Science
•   Probability theory and statistics
•   Computational complexity theory
•   Control theory
•   Information theory
•   Philosophy
•   Psychology and neurobiology
•   …
Fall 2008               24     CS 567 Lecture 1 - Sofus A. Macskassy
Important Application Areas
• Bioinformatics: sequence alignment, analyzing micro-array data,
  information integration, …
• Computer vision: object recognition, tracking, segmentation, active
  vision, …
• Robotics: state estimation, map building, decision making
• Graphics: building realistic simulations
• Speech: recognition, speaker identification
• Financial analysis: option pricing, portfolio allocation, …
• E-commerce: automated trading agents, data mining, spam, …
• Medicine: diagnosis, treatment, drug design, …
• Computer games: building adaptive opponents
• Multimedia: retrieval across diverse databases
• Web: information extraction
• Industry: record linkage, social network analysis, anomaly detection,
  tracking entities, …

Fall 2008                        25           CS 567 Lecture 1 - Sofus A. Macskassy
Kinds of Learning
• Based on information available
     – Supervised – true labels provided
     – Reinforcement – Only indirect labels provided (reward/punishment)
     – Unsupervised – No feedback & no labels
• Based on the role of the learner
     – Passive – given a set of data, produce a model
     – Online – given one data point at a time, update model
     – Active – ask for specific data points to improve model
• Based on type of output
     – Concept Learning – Binary output based on +ve/-ve examples
     – Classification – Classifying into one among many classes
     – Regression – Numeric, ordered output



Fall 2008                          26          CS 567 Lecture 1 - Sofus A. Macskassy
Type of Output: Classification


  • Example: Credit
    scoring
  • Differentiating
    between low-risk
    and high-risk
    customers from
    their income and
      savings

        Discriminant: IF income > θ1 AND savings > θ2
                             THEN low-risk ELSE high-risk

Fall 2008                       27        CS 567 Lecture 1 - Sofus A. Macskassy
Classification: Applications
• Aka Pattern recognition
• Face recognition: Pose, lighting, occlusion
  (glasses, beard), make-up, hair style
• Character recognition: Different handwriting
  styles.
• Speech recognition: Temporal dependency.
     – Use of a dictionary or the syntax of the language.
     – Sensor fusion: Combine multiple modalities; eg, visual
       (lip image) and acoustic for speech
• Medical diagnosis: From symptoms to illnesses
• ...

Fall 2008                     28        CS 567 Lecture 1 - Sofus A. Macskassy
Type of Output: Regression
• Example: Price of a
  used car
• x : car attributes
                                    y = wx+w0
  y : price
      y = g (x | θ )
  g ( ) model,
  θ parameters


Fall 2008               29   CS 567 Lecture 1 - Sofus A. Macskassy
Regression Applications
• Navigating a car: Angle of the steering
  wheel (CMU NavLab)
• Kinematics of a robot arm
        (x,y)
                           α1= g1(x,y)
                      α2   α2= g2(x,y)

                 α1

  • Response surface design

Fall 2008                        30      CS 567 Lecture 1 - Sofus A. Macskassy
Supervised Learning: Uses
• Prediction of future cases: Use the rule to
  predict the output for future inputs
• Knowledge extraction: The rule is easy to
  understand
• Compression: The rule is simpler than the
  data it explains
• Outlier detection: Exceptions that are not
  covered by the rule, e.g., fraud


Fall 2008              31     CS 567 Lecture 1 - Sofus A. Macskassy
Reinforcement Learning
•   Learning a policy: A sequence of outputs
•   No supervised output but delayed reward
•   Credit assignment problem
•   Game playing
•   Robot in a maze
•   Multiple agents, partial observability, ...



Fall 2008               32      CS 567 Lecture 1 - Sofus A. Macskassy
Unsupervised Learning
•   Learning ―what normally happens‖
•   No output
•   Clustering: Grouping similar instances
•   Example applications
     – Customer segmentation in CRM
     – Image compression: Color quantization
     – Bioinformatics: Learning motifs


Fall 2008              33     CS 567 Lecture 1 - Sofus A. Macskassy
Focus for this Course
• Based on information available
     – Supervised – true labels provided
     – Reinforcement – Only indirect labels provided (reward/punishment)
     – Unsupervised – No feedback & no labels
• Based on the role of the learner
     – Passive – given a set of data, produce a model
     – Online – given one data point at a time, update model
     – Active – ask for specific data points to improve model
• Based on type of output
     – Concept Learning – Binary output based on +ve/-ve examples
     – Classification – Classifying into one among many classes
     – Regression – Numeric, ordered output



Fall 2008                          37          CS 567 Lecture 1 - Sofus A. Macskassy

Más contenido relacionado

Similar a Machine Learning (CS 567)

Unit 01 intro to statistics - 1 per page
Unit 01    intro to statistics - 1 per pageUnit 01    intro to statistics - 1 per page
Unit 01 intro to statistics - 1 per pageMAKABANSA
 
Artificial Intelligence by B. Ravikumar
Artificial Intelligence by B. RavikumarArtificial Intelligence by B. Ravikumar
Artificial Intelligence by B. RavikumarGarry D. Lasaga
 
Week1- Introduction.pptx
Week1- Introduction.pptxWeek1- Introduction.pptx
Week1- Introduction.pptxfahmi324663
 
Fcv core sapiro
Fcv core sapiroFcv core sapiro
Fcv core sapirozukun
 
Deep Learning in Recommender Systems - RecSys Summer School 2017
Deep Learning in Recommender Systems - RecSys Summer School 2017Deep Learning in Recommender Systems - RecSys Summer School 2017
Deep Learning in Recommender Systems - RecSys Summer School 2017Balázs Hidasi
 
lecture_01.pptx - PowerPoint Presentation
lecture_01.pptx - PowerPoint Presentationlecture_01.pptx - PowerPoint Presentation
lecture_01.pptx - PowerPoint Presentationbutest
 
1. Intro DS.pptx
1. Intro DS.pptx1. Intro DS.pptx
1. Intro DS.pptxAnusuya123
 
IT 3010 Lecture 1 Introduction
IT 3010 Lecture 1 IntroductionIT 3010 Lecture 1 Introduction
IT 3010 Lecture 1 IntroductionBabakFarshchian
 
Understanding the Graduate Student Experience
Understanding the Graduate Student ExperienceUnderstanding the Graduate Student Experience
Understanding the Graduate Student ExperienceRebecca Blakiston
 

Similar a Machine Learning (CS 567) (20)

Csc446: Pattren Recognition
Csc446: Pattren RecognitionCsc446: Pattren Recognition
Csc446: Pattren Recognition
 
Csc446: Pattren Recognition (LN1)
Csc446: Pattren Recognition (LN1)Csc446: Pattren Recognition (LN1)
Csc446: Pattren Recognition (LN1)
 
Unit 01 intro to statistics - 1 per page
Unit 01    intro to statistics - 1 per pageUnit 01    intro to statistics - 1 per page
Unit 01 intro to statistics - 1 per page
 
Artificial Intelligence by B. Ravikumar
Artificial Intelligence by B. RavikumarArtificial Intelligence by B. Ravikumar
Artificial Intelligence by B. Ravikumar
 
Democratizing Data Science by Bill Howe
Democratizing Data Science by Bill HoweDemocratizing Data Science by Bill Howe
Democratizing Data Science by Bill Howe
 
lecture_1.pptx
lecture_1.pptxlecture_1.pptx
lecture_1.pptx
 
EURO Conference 2015 - Automated Timetabling
EURO Conference 2015 - Automated TimetablingEURO Conference 2015 - Automated Timetabling
EURO Conference 2015 - Automated Timetabling
 
Week1- Introduction.pptx
Week1- Introduction.pptxWeek1- Introduction.pptx
Week1- Introduction.pptx
 
Lect1 csse81
Lect1 csse81Lect1 csse81
Lect1 csse81
 
lecture1.ppt
lecture1.pptlecture1.ppt
lecture1.ppt
 
lecture1.pptx
lecture1.pptxlecture1.pptx
lecture1.pptx
 
lecture1.pdf
lecture1.pdflecture1.pdf
lecture1.pdf
 
Fcv core sapiro
Fcv core sapiroFcv core sapiro
Fcv core sapiro
 
Deep Learning in Recommender Systems - RecSys Summer School 2017
Deep Learning in Recommender Systems - RecSys Summer School 2017Deep Learning in Recommender Systems - RecSys Summer School 2017
Deep Learning in Recommender Systems - RecSys Summer School 2017
 
lecture_01.pptx - PowerPoint Presentation
lecture_01.pptx - PowerPoint Presentationlecture_01.pptx - PowerPoint Presentation
lecture_01.pptx - PowerPoint Presentation
 
1. Intro DS.pptx
1. Intro DS.pptx1. Intro DS.pptx
1. Intro DS.pptx
 
TDT39 oppstartsmøte
TDT39 oppstartsmøteTDT39 oppstartsmøte
TDT39 oppstartsmøte
 
IT 3010 Lecture 1 Introduction
IT 3010 Lecture 1 IntroductionIT 3010 Lecture 1 Introduction
IT 3010 Lecture 1 Introduction
 
ACL 2018 Recap
ACL 2018 RecapACL 2018 Recap
ACL 2018 Recap
 
Understanding the Graduate Student Experience
Understanding the Graduate Student ExperienceUnderstanding the Graduate Student Experience
Understanding the Graduate Student Experience
 

Más de butest

EL MODELO DE NEGOCIO DE YOUTUBE
EL MODELO DE NEGOCIO DE YOUTUBEEL MODELO DE NEGOCIO DE YOUTUBE
EL MODELO DE NEGOCIO DE YOUTUBEbutest
 
1. MPEG I.B.P frame之不同
1. MPEG I.B.P frame之不同1. MPEG I.B.P frame之不同
1. MPEG I.B.P frame之不同butest
 
LESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALLESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALbutest
 
Timeline: The Life of Michael Jackson
Timeline: The Life of Michael JacksonTimeline: The Life of Michael Jackson
Timeline: The Life of Michael Jacksonbutest
 
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...butest
 
LESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALLESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALbutest
 
Com 380, Summer II
Com 380, Summer IICom 380, Summer II
Com 380, Summer IIbutest
 
The MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazz
The MYnstrel Free Press Volume 2: Economic Struggles, Meet JazzThe MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazz
The MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazzbutest
 
MICHAEL JACKSON.doc
MICHAEL JACKSON.docMICHAEL JACKSON.doc
MICHAEL JACKSON.docbutest
 
Social Networks: Twitter Facebook SL - Slide 1
Social Networks: Twitter Facebook SL - Slide 1Social Networks: Twitter Facebook SL - Slide 1
Social Networks: Twitter Facebook SL - Slide 1butest
 
Facebook
Facebook Facebook
Facebook butest
 
Executive Summary Hare Chevrolet is a General Motors dealership ...
Executive Summary Hare Chevrolet is a General Motors dealership ...Executive Summary Hare Chevrolet is a General Motors dealership ...
Executive Summary Hare Chevrolet is a General Motors dealership ...butest
 
Welcome to the Dougherty County Public Library's Facebook and ...
Welcome to the Dougherty County Public Library's Facebook and ...Welcome to the Dougherty County Public Library's Facebook and ...
Welcome to the Dougherty County Public Library's Facebook and ...butest
 
NEWS ANNOUNCEMENT
NEWS ANNOUNCEMENTNEWS ANNOUNCEMENT
NEWS ANNOUNCEMENTbutest
 
C-2100 Ultra Zoom.doc
C-2100 Ultra Zoom.docC-2100 Ultra Zoom.doc
C-2100 Ultra Zoom.docbutest
 
MAC Printing on ITS Printers.doc.doc
MAC Printing on ITS Printers.doc.docMAC Printing on ITS Printers.doc.doc
MAC Printing on ITS Printers.doc.docbutest
 
Mac OS X Guide.doc
Mac OS X Guide.docMac OS X Guide.doc
Mac OS X Guide.docbutest
 
WEB DESIGN!
WEB DESIGN!WEB DESIGN!
WEB DESIGN!butest
 

Más de butest (20)

EL MODELO DE NEGOCIO DE YOUTUBE
EL MODELO DE NEGOCIO DE YOUTUBEEL MODELO DE NEGOCIO DE YOUTUBE
EL MODELO DE NEGOCIO DE YOUTUBE
 
1. MPEG I.B.P frame之不同
1. MPEG I.B.P frame之不同1. MPEG I.B.P frame之不同
1. MPEG I.B.P frame之不同
 
LESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALLESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIAL
 
Timeline: The Life of Michael Jackson
Timeline: The Life of Michael JacksonTimeline: The Life of Michael Jackson
Timeline: The Life of Michael Jackson
 
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...
 
LESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALLESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIAL
 
Com 380, Summer II
Com 380, Summer IICom 380, Summer II
Com 380, Summer II
 
PPT
PPTPPT
PPT
 
The MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazz
The MYnstrel Free Press Volume 2: Economic Struggles, Meet JazzThe MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazz
The MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazz
 
MICHAEL JACKSON.doc
MICHAEL JACKSON.docMICHAEL JACKSON.doc
MICHAEL JACKSON.doc
 
Social Networks: Twitter Facebook SL - Slide 1
Social Networks: Twitter Facebook SL - Slide 1Social Networks: Twitter Facebook SL - Slide 1
Social Networks: Twitter Facebook SL - Slide 1
 
Facebook
Facebook Facebook
Facebook
 
Executive Summary Hare Chevrolet is a General Motors dealership ...
Executive Summary Hare Chevrolet is a General Motors dealership ...Executive Summary Hare Chevrolet is a General Motors dealership ...
Executive Summary Hare Chevrolet is a General Motors dealership ...
 
Welcome to the Dougherty County Public Library's Facebook and ...
Welcome to the Dougherty County Public Library's Facebook and ...Welcome to the Dougherty County Public Library's Facebook and ...
Welcome to the Dougherty County Public Library's Facebook and ...
 
NEWS ANNOUNCEMENT
NEWS ANNOUNCEMENTNEWS ANNOUNCEMENT
NEWS ANNOUNCEMENT
 
C-2100 Ultra Zoom.doc
C-2100 Ultra Zoom.docC-2100 Ultra Zoom.doc
C-2100 Ultra Zoom.doc
 
MAC Printing on ITS Printers.doc.doc
MAC Printing on ITS Printers.doc.docMAC Printing on ITS Printers.doc.doc
MAC Printing on ITS Printers.doc.doc
 
Mac OS X Guide.doc
Mac OS X Guide.docMac OS X Guide.doc
Mac OS X Guide.doc
 
hier
hierhier
hier
 
WEB DESIGN!
WEB DESIGN!WEB DESIGN!
WEB DESIGN!
 

Machine Learning (CS 567)

  • 1. Machine Learning (CS 567) Fall 2008 Time: T-Th, 5:00pm - 6:20pm Location: GFS118 Instructor: Sofus A. Macskassy (macskass@usc.edu) Office: SAL 216 Office hours: by appointment Teaching assistant: Cheol Han Office hours: TBA Class web page: http://www-scf.usc.edu/~csci567/index.html
  • 2. Course Goals • Introduce the background of machine learning • Show core ML techniques • Show proper experimental ML methodologies • Prepare you to be able to use ML tools, contribute to the field Not: Find out details of the state of the art… Fall 2008 2 CS 567 Lecture 1 - Sofus A. Macskassy
  • 3. Course Overview • Introduction and Basics – Basic problems and questions in Machine Learning. Example applications. • Linear Classifiers – Perceptron, Logistic Regression, Linear Discriminant Analysis (LDA) • Five Popular Algorithms – Decision Trees (C4.5) – Neural Networks (backpropagation) – Probabilistic networks (Naïve bayes) – Nearest Neighbor (k-NN) – Support Vector Machines • Theories of Learning – PAC, Bayesian, Bias-Variance Analysis • Optimizing Test Set Performance – Overfitting, Penalty Methods, Holdout Methods, Ensembles • Evaluation and Methodologies – Statistical tests, comparison of models and algorithms, confidence bounds Fall 2008 3 CS 567 Lecture 1 - Sofus A. Macskassy
  • 4. Administrative issues • Class materials – Introduction to Machine Learning, Ethem Alpaydin, 2004, MIT Press – Class notes are meant to complement readings • Mailing List – Please send me (macskass@usc.edu) mail with the following subject line: cs567 student Fall 2008 4 CS 567 Lecture 1 - Sofus A. Macskassy
  • 5. Administrative issues • Prerequisites – Basic linear algebra and calculus – Some knowledge of probabilities and statistics – Some AI background is recommended but not required – The Weka machine learning toolkit will be used in assignments. Prior knowledge not required Fall 2008 5 CS 567 Lecture 1 - Sofus A. Macskassy
  • 6. Administrative • Please stop me if I talk too fast • Please stop me if you have any questions • Any feedback is better than no feedback – Sooner is better – End of the semester is too late to make it easier on you Fall 2008 6 CS 567 Lecture 1 - Sofus A. Macskassy
  • 7. Evaluation • Weekly quizzes 10% • 5 homework assignments 20% • Midterm 20% • Final Exam (not cumulative) 20% • Class Project 30% – Groups of 1-4 people – Research paper to be written – Presentations at the last two classes – Details later in the course Fall 2008 7 CS 567 Lecture 1 - Sofus A. Macskassy
  • 8. Evaluation • Grades • Based on absolute scores •A 90.0% • A- 87.5% • B+ 85.0% •B 80.0% • B- 77.5% • C+ 75.0% •C 70.0% Fall 2007 8 CS 567 Lecture 2 - Sofus A. Macskassy
  • 9. Lecture 1 Outline • The what and why of machine learning • Why study machine learning • Types of machine learning • Supervised learning defined Fall 2008 9 CS 567 Lecture 1 - Sofus A. Macskassy
  • 10. What is learning? • H.Simon: Any process by which a system improves its performance • M.Minsky: Learning is making useful changes in our minds • Michalsky: Learning is constructing or modifying representations of what is being experiences • Valiant: Learning is the process of knowledge acquisition in the absence of explicit programming • Generally: A process by which a computer algorithm finds an approximate solution to a problem Fall 2008 10 CS 567 Lecture 1 - Sofus A. Macskassy
  • 11. What is The Learning Problem? Learning = Improving with experience at some task • Improve over task T, • with respect to performance measure P, • based on experience E. E.g., Learn to play checkers • T : Play checkers • P : % of games won in world tournament • E : opportunity to play against self Fall 2008 11 CS 567 Lecture 1 - Sofus A. Macskassy
  • 12. Compare to Human Learning • Psychologists use ―learning‖ somewhat differently. • Typically the task changes, whereas in machine learning, typically we assume the task is stationary. • Seems like a pretty fundamental difference! Fall 2008 12 CS 567 Lecture 1 - Sofus A. Macskassy
  • 13. Why study machine learning? • Easier to build a learning system than to hand- code a working program. E.g.: – Robot that learns a map of the environment by wandering around it – Programs that learn o play games by playing against themselves or others • Improving on existing programs, e.g. – Instruction scheduling and register allocation in compilers – Combinatorial optimization problems • Discover knowledge and patterns in highly dimensional, complex data Fall 2008 13 CS 567 Lecture 1 - Sofus A. Macskassy
  • 14. Why study machine learning? • Solving tasks that require a system to be adaptive, e.g. – Speech and handwriting recognition – ―Intelligent‖ user interfaces • Understanding animal and human learning – How do we learn language? – How do we recognize faces? • Creating ―real‖ AI! – An expert system—brilliantly designed, engineered and implemented—cannot learn not to repeat its mistakes. Fall 2008 14 CS 567 Lecture 1 - Sofus A. Macskassy
  • 15. Time is right to study machine learning • Recent progress in algorithms and theory • Growing flood of online data • Computational power is available • Growing industry demand Fall 2008 15 CS 567 Lecture 1 - Sofus A. Macskassy
  • 16. Three example ML tasks • Data mining: using historical data to improve decisions – medical records  medical knowledge • Software applications we can't program by hand – autonomous driving – speech recognition • Self customizing programs – Newsreader that learns user interests Fall 2008 16 CS 567 Lecture 1 - Sofus A. Macskassy
  • 17. Typical Learning Task Given: • 9714 patient records, each describing a pregnancy and birth • Each patient record contains 215 features Learn to predict: • Classes of future patients at high risk for Emergency Cesarean Section Fall 2008 17 CS 567 Lecture 1 - Sofus A. Macskassy
  • 18. Patient Data Patient103 time=1 Patient103 time=2 Patient103 time=n Age: 23 Age: 23 … Age: 23 FirstPregnancy: no FirstPregnancy: no FirstPregnancy: no Anemia: no Anemia: no Anemia: no Diabetes: no Diabetes: YES Diabetes: no PreviousPrematureBirth: no PreviousPrematureBirth: no PreviousPrematureBirth: no Ultrasound: ? Ultrasound: abnormal Ultrasound: ? Elective C-Section: ? Elective C-Section: no Elective C-Section: no Emergency C-Section: ? Emergency C-Section: ? Emergency C-Section: YES … … … Fall 2008 18 CS 567 Lecture 1 - Sofus A. Macskassy
  • 19. Datamining Result One of 18 learned rules: If No previous vaginal delivery, and Abnormal 2nd Trimester Ultrasound, and Malpresentation at admission Then Probability of Emergency C-Section is 0.6 • Accuracy in training data: 26/41 = .63, • Accuracy in test data: 12/20 = .60 Fall 2008 19 CS 567 Lecture 1 - Sofus A. Macskassy
  • 20. Too Difficult to Program Problems too difficult to program by hand ALVINN [Pomerleau] drives 70mph on highways Fall 2008 20 CS 567 Lecture 1 - Sofus A. Macskassy
  • 21. Software Customizes to User http://www.wisewire.com April 30, 1998 Lycos Acquires WiseWire By internetnews.com Staff Lycos, Inc. today announced the acquisition of targeted-content provider WiseWire Corp. for around $39.75 million in Lycos shares. Under the acquisition, the three-year old, Pittsburgh-based WiseWire's automated directory will be incorporated into Lycos’ search page. Lycos said it will be the first online service using the technology which is based on user input and an intelligent agent product. Fall 2008 21 CS 567 Lecture 1 - Sofus A. Macskassy
  • 22. Where Is This Headed? Today: tip of the iceberg • First-generation algorithms: neural nets, decision trees, regression ... • Applied to well-formatted database • Budding industry Fall 2008 22 CS 567 Lecture 1 - Sofus A. Macskassy
  • 23. Looking to Tomorrow Opportunity for tomorrow: enormous impact • Learn across full mixed-media data • Learn across multiple internal databases, plus the web and newsfeeds • Learn by active experimentation • Learn decisions rather than predictions • Cumulative, lifelong learning • Programming languages with learning embedded? Fall 2008 23 CS 567 Lecture 1 - Sofus A. Macskassy
  • 24. Relevant Disciplines • Artificial intelligence • Cognitive Science • Probability theory and statistics • Computational complexity theory • Control theory • Information theory • Philosophy • Psychology and neurobiology • … Fall 2008 24 CS 567 Lecture 1 - Sofus A. Macskassy
  • 25. Important Application Areas • Bioinformatics: sequence alignment, analyzing micro-array data, information integration, … • Computer vision: object recognition, tracking, segmentation, active vision, … • Robotics: state estimation, map building, decision making • Graphics: building realistic simulations • Speech: recognition, speaker identification • Financial analysis: option pricing, portfolio allocation, … • E-commerce: automated trading agents, data mining, spam, … • Medicine: diagnosis, treatment, drug design, … • Computer games: building adaptive opponents • Multimedia: retrieval across diverse databases • Web: information extraction • Industry: record linkage, social network analysis, anomaly detection, tracking entities, … Fall 2008 25 CS 567 Lecture 1 - Sofus A. Macskassy
  • 26. Kinds of Learning • Based on information available – Supervised – true labels provided – Reinforcement – Only indirect labels provided (reward/punishment) – Unsupervised – No feedback & no labels • Based on the role of the learner – Passive – given a set of data, produce a model – Online – given one data point at a time, update model – Active – ask for specific data points to improve model • Based on type of output – Concept Learning – Binary output based on +ve/-ve examples – Classification – Classifying into one among many classes – Regression – Numeric, ordered output Fall 2008 26 CS 567 Lecture 1 - Sofus A. Macskassy
  • 27. Type of Output: Classification • Example: Credit scoring • Differentiating between low-risk and high-risk customers from their income and savings Discriminant: IF income > θ1 AND savings > θ2 THEN low-risk ELSE high-risk Fall 2008 27 CS 567 Lecture 1 - Sofus A. Macskassy
  • 28. Classification: Applications • Aka Pattern recognition • Face recognition: Pose, lighting, occlusion (glasses, beard), make-up, hair style • Character recognition: Different handwriting styles. • Speech recognition: Temporal dependency. – Use of a dictionary or the syntax of the language. – Sensor fusion: Combine multiple modalities; eg, visual (lip image) and acoustic for speech • Medical diagnosis: From symptoms to illnesses • ... Fall 2008 28 CS 567 Lecture 1 - Sofus A. Macskassy
  • 29. Type of Output: Regression • Example: Price of a used car • x : car attributes y = wx+w0 y : price y = g (x | θ ) g ( ) model, θ parameters Fall 2008 29 CS 567 Lecture 1 - Sofus A. Macskassy
  • 30. Regression Applications • Navigating a car: Angle of the steering wheel (CMU NavLab) • Kinematics of a robot arm (x,y) α1= g1(x,y) α2 α2= g2(x,y) α1 • Response surface design Fall 2008 30 CS 567 Lecture 1 - Sofus A. Macskassy
  • 31. Supervised Learning: Uses • Prediction of future cases: Use the rule to predict the output for future inputs • Knowledge extraction: The rule is easy to understand • Compression: The rule is simpler than the data it explains • Outlier detection: Exceptions that are not covered by the rule, e.g., fraud Fall 2008 31 CS 567 Lecture 1 - Sofus A. Macskassy
  • 32. Reinforcement Learning • Learning a policy: A sequence of outputs • No supervised output but delayed reward • Credit assignment problem • Game playing • Robot in a maze • Multiple agents, partial observability, ... Fall 2008 32 CS 567 Lecture 1 - Sofus A. Macskassy
  • 33. Unsupervised Learning • Learning ―what normally happens‖ • No output • Clustering: Grouping similar instances • Example applications – Customer segmentation in CRM – Image compression: Color quantization – Bioinformatics: Learning motifs Fall 2008 33 CS 567 Lecture 1 - Sofus A. Macskassy
  • 34. Focus for this Course • Based on information available – Supervised – true labels provided – Reinforcement – Only indirect labels provided (reward/punishment) – Unsupervised – No feedback & no labels • Based on the role of the learner – Passive – given a set of data, produce a model – Online – given one data point at a time, update model – Active – ask for specific data points to improve model • Based on type of output – Concept Learning – Binary output based on +ve/-ve examples – Classification – Classifying into one among many classes – Regression – Numeric, ordered output Fall 2008 37 CS 567 Lecture 1 - Sofus A. Macskassy