2. Machine Learning: Definition
Machine learning, a branch of artificial intelligence, concerns
the construction and study of systems that can learn from
data.
Definition: A computer program is said to learn from
experience E with respect to some class of tasks T and
performance measure P, if its performance at tasks in T, as
measured by P, improves with experience E.
For example, a machine learning system could be trained on
email messages to learn to distinguish between spam and
non-spam messages. After learning, it can then be used to
classify new email messages into spam and non-spam folders.
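As a rough illustration of the definition above, the sketch below trains a toy word-count spam filter and measures P before and after experience E. The messages, the helper names (`train`, `classify`, `accuracy`), and the scoring rule are all made up for illustration; real spam filters use far richer models.

```python
from collections import Counter

def train(messages):
    """Count how often each word appears in spam and in non-spam ("ham")."""
    counts = {"spam": Counter(), "ham": Counter()}
    for text, label in messages:
        counts[label].update(text.lower().split())
    return counts

def classify(counts, text):
    """Label a message by which class has seen its words more often."""
    words = text.lower().split()
    spam_score = sum(counts["spam"][w] for w in words)
    ham_score = sum(counts["ham"][w] for w in words)
    return "spam" if spam_score > ham_score else "ham"

def accuracy(counts, messages):
    """Performance measure P: fraction of messages classified correctly."""
    correct = sum(classify(counts, text) == label for text, label in messages)
    return correct / len(messages)

# Experience E: labelled training emails (hypothetical data).
training = [
    ("win cash prize now", "spam"),
    ("cheap pills win big", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday with the team", "ham"),
]
# Task T: classify messages the system has not seen before.
unseen = [
    ("claim your cash prize", "spam"),
    ("agenda for the team meeting", "ham"),
]

untrained = train([])        # no experience yet
trained = train(training)    # after experience E
print(accuracy(untrained, unseen), accuracy(trained, unseen))
```

With no experience the filter labels everything as non-spam; after training, its accuracy on the unseen messages goes up, which is what "learning" means in the definition.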
3. Why is Machine Learning Important?
Some tasks cannot be defined well, except by examples
(e.g., recognizing people).
Relationships and correlations can be hidden within large
amounts of data. Machine Learning/Data Mining may be
able to find these relationships.
Human designers often produce machines that do not work
as well as desired in the environments in which they are used.
4. Why is Machine Learning Important?
(Cont’d)
The amount of knowledge available about certain tasks
might be too large for explicit encoding by humans (e.g.,
medical diagnosis).
Environments change over time.
New knowledge about tasks is constantly being discovered
by humans. It may be difficult to continuously re-design
systems “by hand”.
5. Areas of Influence for Machine
Learning
Statistics: How best to use samples drawn from unknown
probability distributions to help decide from which distribution
some new sample is drawn?
Brain Models: Non-linear elements with weighted inputs
(Artificial Neural Networks) have been suggested as simple
models of biological neurons.
Adaptive Control Theory: How to deal with controlling a
process having unknown parameters that must be estimated
during operation?
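The statistics question above (deciding which distribution a new sample came from) can be sketched with a likelihood comparison. This is a minimal example that assumes each source is Gaussian; the data and the function names are hypothetical.

```python
import math

def fit(samples):
    """Estimate the mean and standard deviation of a Gaussian from samples."""
    mean = sum(samples) / len(samples)
    var = sum((s - mean) ** 2 for s in samples) / len(samples)
    return mean, math.sqrt(var)

def gaussian_log_pdf(x, mean, std):
    """Log-density of x under a Gaussian with the given mean and std."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)

def most_likely_source(x, sources):
    """Return the name of the fitted distribution under which x is most likely."""
    return max(sources, key=lambda name: gaussian_log_pdf(x, *sources[name]))

# Samples drawn from two unknown distributions (made-up numbers).
sources = {
    "A": fit([1.0, 2.0, 3.0]),
    "B": fit([9.0, 10.0, 11.0]),
}
print(most_likely_source(2.5, sources), most_likely_source(9.5, sources))
```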
6. Areas of Influence for Machine
Learning (Cont’d)
Psychology: How to model human performance on various
learning tasks?
Artificial Intelligence: How to write algorithms to acquire the
knowledge humans are able to acquire, at least, as well as
humans?
Evolutionary Models: How to model certain aspects of
biological evolution to improve the performance of computer
programs?
7. Designing a Learning System:
An Example
o Problem Description
o Choosing the Training Experience
o Choosing the Target Function
o Choosing a Representation for the Target Function
o Choosing a Function Approximation Algorithm
o Final Design
8. Problem Description:
A Checker Learning Problem
Task T: Playing Checkers
Performance Measure P: Percent of games won against
opponents
Training Experience E: To be selected ==> Games Played
against itself
9. Issues in Machine Learning
What algorithms are available for learning a concept? How
well do they perform?
How much training data is sufficient to learn a concept with
high confidence?
When is it useful to use prior knowledge?
Are some training examples more useful than others?
What are the best tasks for a system to learn?
What is the best way for a system to represent its knowledge?
10. Machine Learning Algorithm Types
Machine learning algorithms can be organized into a taxonomy based on
the desired outcome of the algorithm or the type of input available during
training.
Supervised learning algorithms are trained on labelled examples, i.e.,
input where the desired output is known. The supervised learning
algorithm attempts to generalise a function or mapping from inputs to
outputs, which can then be used to predict an output
for previously unseen inputs.
Unsupervised learning algorithms operate on unlabelled examples, i.e.,
input where the desired output is unknown. Here the objective is to
discover structure in the data (e.g. through a cluster analysis), not to
generalise a mapping from inputs to outputs.
Semi-supervised learning combines both labelled and unlabelled
examples to generate an appropriate function or classifier.
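To make the contrast between supervised and unsupervised learning concrete, here is a minimal sketch on one-dimensional data: a nearest-neighbour rule that uses labels, and a two-cluster k-means that does not. Both functions and the data are illustrative assumptions, not methods named in the slides.

```python
def nearest_neighbour(train, x):
    """Supervised: predict the label of the closest labelled example."""
    _, label = min(train, key=lambda pair: abs(pair[0] - x))
    return label

def two_means(points, iterations=10):
    """Unsupervised: split 1-D points into two clusters (k-means, k = 2).

    Assumes the points contain at least two distinct values.
    """
    lo, hi = min(points), max(points)            # initial cluster centres
    for _ in range(iterations):
        left = [p for p in points if abs(p - lo) <= abs(p - hi)]
        right = [p for p in points if abs(p - lo) > abs(p - hi)]
        lo, hi = sum(left) / len(left), sum(right) / len(right)
    return lo, hi

# Labelled examples: the desired output is known.
labelled = [(1.0, "small"), (1.5, "small"), (8.0, "large"), (9.0, "large")]
print(nearest_neighbour(labelled, 2.0), nearest_neighbour(labelled, 7.0))

# Unlabelled examples: only structure (two groups) can be discovered.
unlabelled = [1.0, 1.5, 2.0, 8.0, 9.0, 10.0]
print(two_means(unlabelled))
```

The supervised rule returns a label for an unseen input; the unsupervised routine returns only cluster centres, since no labels were ever provided.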
11. Machine Learning Algorithm Types
(Cont’d)
Reinforcement learning is concerned with how intelligent
agents ought to act in an environment to maximise some notion of
reward. The agent executes actions which cause the observable
state of the environment to change. Through a sequence of
actions, the agent attempts to gather knowledge about how the
environment responds to its actions, and attempts to synthesise a
sequence of actions that maximises a cumulative reward.
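The loop of acting, observing the changed state, and collecting reward can be sketched with tabular Q-learning, one common reinforcement learning method (the slides do not name a specific algorithm). The corridor environment, the reward of 1 at the goal, and the parameter values are assumptions chosen for illustration.

```python
import random

N_STATES = 5            # corridor cells 0..4; cell 4 is the goal
ACTIONS = (-1, +1)      # move left or move right

def step(state, action):
    """Environment: return (next_state, reward); reward 1 on reaching the goal."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    return nxt, (1.0 if nxt == N_STATES - 1 else 0.0)

def q_learning(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Learn action values Q(state, action) from randomly explored episodes."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            action = rng.choice(ACTIONS)          # explore with random actions
            nxt, reward = step(state, action)
            best_next = max(q[(nxt, a)] for a in ACTIONS)
            # Move the estimate towards reward + discounted future value.
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = nxt
    return q

q = q_learning()
# Greedy policy: in each non-goal cell, pick the action with the highest value.
policy = [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)]
print(policy)
```

After training, the greedy policy moves right in every cell, which is the action sequence that maximises the discounted cumulative reward in this toy environment.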
Developmental learning, elaborated for robot learning, generates
its own sequences of learning situations (also called a curriculum) to
cumulatively acquire repertoires of novel skills through autonomous
self-exploration and social interaction with human teachers,
using guidance mechanisms such as active learning, maturation,
motor synergies, and imitation.
12. AdaBoost Algorithm
AdaBoost, short for Adaptive Boosting, is a machine
learning algorithm, formulated by Yoav Freund and Robert
Schapire.
It is a meta-algorithm, and can be used in conjunction with many
other learning algorithms to improve their performance.
AdaBoost is adaptive in the sense that subsequent classifiers built
are tweaked in favour of those instances misclassified by previous
classifiers.
AdaBoost is sensitive to noisy data and outliers.
13. AdaBoost - Adaptive Boosting
Instead of resampling, uses training set re-weighting
Each training sample uses a weight to determine the
probability of being selected for a training set.
AdaBoost is an algorithm for constructing a “strong” classifier
as a linear combination of simple “weak” classifiers
Final classification based on weighted vote of weak classifiers
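The re-weighting and the weighted vote are commonly written as follows. The notation is an assumption taken from the standard Freund and Schapire presentation for labels $y_i \in \{-1, +1\}$, since the slides do not spell it out: $D_t$ is the weight distribution at round $t$, $h_t$ the weak classifier chosen in that round, $\epsilon_t$ its weighted error, $\alpha_t$ its vote weight, and $Z_t$ a normalising constant.

```latex
\begin{align*}
\epsilon_t &= \sum_{i=1}^{N} D_t(i)\,[\,h_t(x_i) \neq y_i\,], \\
\alpha_t &= \tfrac{1}{2}\ln\frac{1-\epsilon_t}{\epsilon_t}, \\
D_{t+1}(i) &= \frac{D_t(i)\,\exp\bigl(-\alpha_t\, y_i\, h_t(x_i)\bigr)}{Z_t}, \\
H(x) &= \operatorname{sign}\Bigl(\sum_{t=1}^{T} \alpha_t\, h_t(x)\Bigr).
\end{align*}
```

Misclassified samples have $y_i h_t(x_i) = -1$, so their weights grow, while correctly classified samples lose weight. A weak classifier with $\epsilon_t < 1/2$ receives a positive vote $\alpha_t$.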
14. AdaBoost Terminology
… “weak” or basis classifier
(Classifier = Learner = Hypothesis)
… “strong” or final classifier
Weak Classifier: < 50% error over any distribution
Strong Classifier: Thresholded linear combination of weak
classifier outputs
15. AdaBoost : The Algorithm
The framework
The learner receives examples $(x_i, y_i)$, $i = 1, \dots, N$, chosen randomly
according to some fixed but unknown distribution $P$ on $X \times Y$.
The learner finds a hypothesis $h_f$ which is consistent with most of the
samples: $h_f(x_i) = y_i$ for most $1 \le i \le N$.
The algorithm
Input variables
P: The distribution from which the training examples are sampled
D: The distribution over all the training samples
WeakLearn: A weak learning algorithm to be boosted
T: The specified number of iterations
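A minimal sketch of how these inputs fit together is given below. It assumes labels in {-1, +1}, one-dimensional inputs, and a decision stump as WeakLearn; it also passes the weights D directly to the weak learner rather than sampling a training set from them. The function names and the data set are illustrative, not taken from the slides.

```python
import math

def stump_predict(threshold, polarity, x):
    """Decision stump: predict `polarity` if x >= threshold, else the opposite."""
    return polarity if x >= threshold else -polarity

def weak_learn(xs, ys, weights):
    """WeakLearn (assumed to be a stump): lowest weighted error over all splits."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (+1, -1):
            error = sum(w for w, x, y in zip(weights, xs, ys)
                        if stump_predict(threshold, polarity, x) != y)
            if best is None or error < best[0]:
                best = (error, threshold, polarity)
    return best

def adaboost(xs, ys, rounds):
    """Return a list of (alpha, threshold, polarity) after `rounds` iterations."""
    n = len(xs)
    weights = [1.0 / n] * n                       # D: start uniform
    ensemble = []
    for _ in range(rounds):                       # T iterations
        error, threshold, polarity = weak_learn(xs, ys, weights)
        error = min(max(error, 1e-10), 1 - 1e-10)     # guard against log(0)
        alpha = 0.5 * math.log((1 - error) / error)
        ensemble.append((alpha, threshold, polarity))
        # Re-weight: misclassified samples gain weight, then normalise.
        weights = [w * math.exp(-alpha * y * stump_predict(threshold, polarity, x))
                   for w, x, y in zip(weights, xs, ys)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, x):
    """Strong classifier: sign of the weighted vote of the weak classifiers."""
    vote = sum(alpha * stump_predict(t, p, x) for alpha, t, p in ensemble)
    return 1 if vote >= 0 else -1

# Toy data that no single stump separates (made-up example).
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
ensemble = adaboost(xs, ys, rounds=3)
print([predict(ensemble, x) for x in xs] == ys)
```

On this data the best single stump misclassifies three samples, while the weighted vote of three stumps classifies all ten training points correctly.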
17. Advantages of AdaBoost
Very simple to implement
Feature selection on very large sets of features
AdaBoost adapts to the errors of the weak
hypotheses returned by WeakLearn.