2. Course Description
This course provides a grounding in machine learning techniques,
methods, and research skills.
Introduction to Machine Learning.
Machine Learning Paradigms.
Machine Learning Algorithms.
Iterative Algorithms:
GDA (Batch vs Stochastic).
GAA
Non-iterative (Algebraic) Algorithms.
Parametric vs non-parametric Regression Algorithms.
Classification & Newton Algorithms.
Clustering and K-means.
K-nearest neighbour (k-NN)
Programming Tools: Matlab (Labs) / Python
3. References:
Lecture notes “machine learning” Stanford University. Prof. Andrew Ng
Machine Learning by Tom M. Mitchell, Publisher: McGraw-Hill
Science/Engineering/Math, 1997.
“An Introduction to Artificial Intelligence”
by Janet Finlay and Alan Dix.
Introduction to Machine Learning: An Early Draft of
a Proposed Textbook, Nils J. Nilsson, Robotics Laboratory,
Department of Computer Science, Stanford University, Stanford, CA 94305,
November 3, 1998. Copyright © 2005 Nils J. Nilsson.
"Artificial Intelligence: A Guide to Intelligent Systems", Second
Edition (2005) by Michael Negnevitsky.
Introduction to Machine Learning by Alex Smola and S.V.N.
Vishwanathan, Departments of Statistics and Computer Science, Purdue
University and College of Engineering and Computer Science, Australian
National University, 2008.
4. Introduction to Machine Learning Second Edition, by Ethem Alpaydın, The MIT
Press Cambridge, 2010.
A Course in Machine Learning by Hal Daumé III, 2012.
Introduction to k Nearest Neighbour Classification and Condensed Nearest
Neighbour Data Reduction by Oliver Sutton, February, 2012.
Understanding Machine Learning: From Theory to Algorithms by Shai
Shalev-Shwartz and Shai Ben-David, published 2014 by Cambridge University Press.
Machine Learning in Action by Peter Harrington, Manning, Shelter Island.
K-means Clustering & k-NN classification by Andreas C. Kapourani (Credit:
Hiroshi Shimodaira) 03 February 2016 Learning and Data Lab 4, Informatics 2B.
Reinforcement Learning: An Introduction by Richard S. Sutton and Andrew G.
Barto, A Bradford Book, The MIT Press, Cambridge, Massachusetts, London,
England, 2017.
Dataset for classification:
https://github.com/Starignus/AppliedML_Python_Coursera/blob/master/fruit_data_with_colors.txt
6. What is Machine Learning?
Why are you taking this course?
What topics would you like to see
covered?
7. Machine Learning is…
Machine learning, a branch of artificial intelligence, concerns
the construction and study of systems that can learn from
data.
8. Machine Learning is…
Machine learning is programming computers to optimize a performance
criterion using example data or past experience.
-- Ethem Alpaydin
The goal of machine learning is to develop methods that can
automatically detect patterns in data, and then to use the uncovered
patterns to predict future data or other outcomes of interest.
-- Kevin P. Murphy
The field of pattern recognition is concerned with the automatic
discovery of regularities in data through the use of computer algorithms
and with the use of these regularities to take actions.
-- Christopher M. Bishop
9. Machine Learning is…
Machine learning is about predicting the future based on the past.
-- Hal Daume III
It is the field of study that gives computers the ability to learn without
being explicitly programmed.
-- Arthur Samuel
10. Machine learning definition (cont.)
Tom Mitchell defined a well-posed learning problem as: a computer program is
said to learn from experience E with respect to some task T and some
performance measure P, if its performance on T, as measured by P,
improves with experience E.
For example, in the case of a checkers or chess game, the experience E
is the experience the program gains by playing many games of checkers
against itself.
Task T is the task of playing checkers, and the performance measure P is
the fraction of games the program wins against a human opponent.
By this definition, we can say that Samuel's checkers program learned
to play checkers.
11. Machine Learning is…
Machine learning is about predicting the future based on the past.
-- Hal Daume III
[Figure: training data (the past) is used to build a model/predictor; the model/predictor is then applied to testing data (the future).]
12. Machine Learning, as it is known in other fields:
data mining: machine learning applied to “databases”, i.e.
collections of data
inference and/or estimation in statistics
pattern recognition in engineering
signal processing in electrical engineering
optimization
13. What is Machine Learning?
It is very hard to write programs that solve problems like
recognizing a face.
We don’t know what program to write because we don’t know
how our brain does it.
Even if we had a good idea about how to do it, the program
might be complicated.
Instead of writing a program by hand, we collect lots of examples
that specify the correct output for a given input.
A machine learning algorithm then takes these examples and
produces a program that does the job.
The program produced by the learning algorithm may look very
different from a typical hand-written program.
If we do it right, the program works for new cases as well as the
ones we trained it on.
14. A classic example of a task that requires machine
learning: It is very hard to say what makes a 2
15. Some more examples of tasks that are best
solved by using a learning algorithm
Recognizing patterns:
Facial identities or facial expressions
Handwritten or spoken words
Medical images
Generating patterns:
Generating images or motion sequences (demo)
Recognizing anomalies:
Unusual sequences of credit card transactions
Unusual patterns of sensor readings in a nuclear
power plant or unusual sound in your car engine.
Prediction:
Future stock prices or currency exchange rates
16. Some web-based examples of machine learning
The web contains a lot of data. Tasks with very big
datasets often use machine learning
o especially if the data is noisy or non-stationary.
Spam filtering, fraud detection:
o The enemy adapts so we must adapt too.
Recommendation systems:
o Lots of noisy data.
Information retrieval:
o Find documents or images with similar content.
Data Visualization:
o Display a huge database in a revealing way (demo)
17. Why “Learn”?
There is no need to “learn” to calculate payroll
Learning is used when:
Human expertise does not exist (navigating on Mars),
Humans are unable to explain their expertise (speech
recognition)
Solution changes in time (routing on a computer
network)
Solution needs to be adapted to particular cases (user
biometrics)
18. How can we program systems to automatically learn and to improve
with experience?
Why machine learning?
We need machines that think and learn from mistakes as humans do.
To notice similarities between things and so generate new ideas.
To attempt to work out why things went wrong (explanation).
19. Difficulties
The most difficult problem in building expert machines is capturing
the knowledge from experts.
Things that are normally implicit in the expert's head must be
externalized and made explicit.
It is hard for experts to state the rules they use to assess a
situation; they can only say which factors they take into account.
In contrast, a machine learning program can take a description of the
situation in terms of these factors and then infer rules that match the
expert's behavior.
The expert then critiques these rules and checks whether they are wrong;
if so, the expert suggests examples that can guide further learning.
20. Example: How to program a machine that
learns how to filter spam e-mails.
The machine will simply memorize all previous e-mails
that had been labeled as spam e-mails by the human user.
When a new e-mail arrives, the machine will search for it
in the set of previous spam e-mails.
If it matches one of them, it will be trashed. Otherwise, it
will be moved to the user's inbox folder.
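The memorization approach described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names and the sample messages are hypothetical, and matching is assumed to mean exact equality of the message text.

```python
def make_memorizing_filter(spam_emails):
    """Return a classifier that only remembers the exact spam texts it was given."""
    seen_spam = set(spam_emails)  # exact-match lookup table of labeled spam

    def classify(email):
        # An e-mail is trashed only if it is identical to a stored spam message.
        return "trash" if email in seen_spam else "inbox"

    return classify


# Hypothetical messages, made up for illustration.
classify = make_memorizing_filter(["WIN a FREE prize now", "Cheap loans, click here"])
print(classify("WIN a FREE prize now"))    # exact repeat of a stored spam message
print(classify("WIN a FREE prize today"))  # one word changed
```

Changing a single word is enough for a message to miss the lookup table, so this filter cannot label e-mails it has not seen before.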
21. While the preceding "learning by memorization" approach is
sometimes useful, it lacks an important aspect of learning systems:
the ability to label unseen e-mail messages.
A successful learner should be able to progress from individual
examples to broader generalization.
This is also referred to as inductive reasoning or inductive
inference.
22. Generalization
To achieve generalization in the spam filtering task, the
learner can scan the previously seen e-mails, and extract
a set of words whose appearance in an e-mail message is
indicative of spam.
Then, when a new e-mail arrives, the machine can check
whether one of the suspicious words appears in it, and
predict its label accordingly.
Such a system would potentially be able to correctly
predict the label of unseen e-mails.
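A minimal sketch of this word-based approach is given below. The rule used here to decide that a word is "indicative of spam" (it appears in some spam e-mail and in no legitimate e-mail) is an assumption chosen for simplicity, and the function names and sample messages are hypothetical.

```python
import re


def tokenize(text):
    """Lower-case the text and return its set of words."""
    return set(re.findall(r"[a-z']+", text.lower()))


def learn_suspicious_words(spam_emails, ham_emails):
    """Assumed rule: words seen in some spam e-mail and in no legitimate e-mail."""
    spam_words = set().union(*(tokenize(e) for e in spam_emails))
    ham_words = set().union(*(tokenize(e) for e in ham_emails))
    return spam_words - ham_words


def predict(email, suspicious_words):
    """Label an e-mail as spam if it contains at least one suspicious word."""
    return "spam" if tokenize(email) & suspicious_words else "not spam"


# Hypothetical training messages, made up for illustration.
spam = ["WIN FREE prize now", "Free loans, click here"]
ham = ["Lunch at noon?", "Here are the meeting notes for now"]
suspicious = learn_suspicious_words(spam, ham)

print(predict("Claim your free gift", suspicious))  # unseen e-mail containing "free"
print(predict("Notes from lunch", suspicious))      # unseen e-mail, no suspicious word
```

Unlike the memorizing filter, this learner can label messages that never appeared in its training set, which is the generalization step the slide describes.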
23. Active versus Passive Learners
Learning paradigms can vary by the role played by the learner.
An active learner interacts with the environment at training
time, say, by posing queries or performing experiments.
A passive learner, by contrast, only observes the information
provided by the environment (or the teacher) without
influencing or directing it.
The learner of a spam filter is usually passive (waiting for users to
mark the e-mails coming to them).
In an active setting, one could imagine asking users to label
specific e-mails chosen by the learner, or even composed by the
learner, to enhance its understanding of what spam is.
24. Machine learning is a subfield of artificial intelligence that is concerned with the
design and development of algorithms and techniques that allow computers to
"learn".
In general, there are two types of learning: inductive and deductive.
Inductive machine learning methods extract rules and patterns out of massive
data sets.
The major focus of machine learning research is to extract information from data
automatically, by computational and statistical methods.
Hence, machine learning is closely related not only to data mining and statistics,
but also to theoretical computer science.
Machine learning refers to a system capable of the autonomous acquisition and
integration of knowledge.
This capacity to learn from experience, analytical observation, and other means,
results in a system that can continuously self-improve and thereby offer
increased efficiency and effectiveness.
25. Examples
Learning to predict which medical patients will respond to which
treatments, by analyzing experience captured in databases of online
medical records.
Mobile robots that learn how to successfully navigate based on
experience they gather from sensors as they roam their environment.
Computer aids for scientific discovery that combine initial scientific
hypotheses with new experimental data to automatically produce
refined scientific hypotheses that better fit observed data.
27. Growth of Machine Learning
Machine learning is the preferred approach to
Speech recognition, Natural language processing
Computer vision
Medical outcomes analysis, Classifying DNA sequences
Robot control, Detecting credit card fraud,
Stock market analysis, Handwriting recognition
This trend is accelerating
Improved machine learning algorithms
Improved data capture, networking, faster computers
Software too complex to write by hand
New sensors / IO devices
Demand for self-customization to user, environment
It turns out to be difficult to extract knowledge from human experts, hence the failure of expert
systems in the 1980s.
29. 1. Supervised learning
a. Regression
Example:
Learning to predict houses’ prices
Suppose you collect a dataset of houses’ prices in a certain
geographical area.
Suppose you collect statistics about how much houses cost
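according to the square footage (feet²) of the house.
[Figure: scatter plot of house price ($) against size (feet²); locating a new house's size on the horizontal axis gives its predicted cost from the trend in the data.]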
31. The reason for calling this a supervised problem is that
we provide the algorithm with a dataset of a bunch of house
sizes and their actual prices.
We simply supervise the algorithm: we give the algorithm
the right answer for each price, and we want the
algorithm to learn the association between the inputs and
outputs so that it can produce more of the right answers.
This was an example of what is called a REGRESSION
problem.
The term regression refers to the fact that the output you are
trying to predict is a continuous value, here the price.
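The house-price example can be sketched as a straight-line fit by ordinary least squares. This is only one possible regression method, chosen here as an assumption; the sizes and prices below are made-up illustrative numbers, not real data, and the function name fit_line is hypothetical.

```python
def fit_line(xs, ys):
    """Closed-form least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up training data: house size in square feet, price in thousands of dollars.
sizes = [1000, 1500, 2000, 2500, 3000]
prices = [200, 290, 410, 500, 600]

slope, intercept = fit_line(sizes, prices)

# Predict a continuous output (the price) for a house size not in the data.
new_size = 1750
predicted = slope * new_size + intercept
print(f"Predicted price for {new_size} square feet: {predicted:.1f} thousand dollars")
```

The labeled pairs of size and price play the role of the "right answers" given to the algorithm, and the prediction is a continuous number rather than a category, which is what makes this a regression problem.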