1. Machine Learning and Research
Overview
AICTE SPONSORED 2 Weeks FDP on
Artificial Intelligence and Advanced Machine
Learning using Data Science
Venue: S A Engg College Date: 26.11.2019
Dr. A. Kathirvel,
Professor and Head
Misrimal Navajee Munoth Jain Engineering College, Chennai
2. Outline & Content
What is machine learning?
Learning system model
Training and testing
Performance
Algorithms
Machine learning structure
What are we seeking?
Learning techniques
Applications
Conclusion
3. Why “Learn”?
Machine learning is programming computers to
optimize a performance criterion using example
data or past experience.
There is no need to “learn” to calculate payroll
Learning is used when:
Human expertise does not exist (navigating on Mars),
Humans are unable to explain their expertise (speech
recognition)
Solution changes in time (routing on a computer
network)
Solution needs to be adapted to particular cases (user
biometrics)
4. What & When We Talk About “Learning”
Learning general models from data of particular
examples
Data is cheap and abundant (data warehouses, data
marts); knowledge is expensive and scarce.
Example in retail: Customer transactions to
consumer behavior:
People who bought “The Da Vinci Code” also bought “The Five
People You Meet in Heaven” (www.amazon.com)
Build a model that is a good and useful approximation
to the data.
5. Data Mining/KDD
Retail: Market basket analysis, Customer relationship
management (CRM)
Finance: Credit scoring, fraud detection
Manufacturing: Optimization, troubleshooting
Medicine: Medical diagnosis
Telecommunications: Quality of service optimization
Bioinformatics: Motifs, alignment
Web mining: Search engines
Definition := “KDD is the non-trivial process of identifying valid,
novel, potentially useful, and ultimately understandable patterns in
data” (Fayyad)
Applications:
6. What is Machine Learning?
Machine Learning
Study of algorithms that
improve their performance
at some task
with experience
Optimize a performance criterion using example
data or past experience.
Role of Statistics: Inference from a sample
Role of Computer science: Efficient algorithms to
Solve the optimization problem
Representing and evaluating the model for inference
7. What is machine learning?
A branch of artificial intelligence, concerned with
the design and development of algorithms that allow
computers to evolve behaviors based on empirical
data.
As intelligence requires knowledge, it is necessary for
the computers to acquire knowledge.
8. What is Machine Learning?
It is very hard to write programs that solve problems like
recognizing a face.
We don’t know what program to write because we
don’t know how our brain does it.
Even if we had a good idea about how to do it, the
program might be horrendously complicated.
Instead of writing a program by hand, we collect lots of
examples that specify the correct output for a given input.
A machine learning algorithm then takes these examples
and produces a program that does the job.
The program produced by the learning algorithm may
look very different from a typical hand-written program.
It may contain millions of numbers.
If we do it right, the program works for new cases as
well as the ones we trained it on.
9. Machine Learning is…
Machine learning, a branch of artificial intelligence,
concerns the construction and study of systems
that can learn from data.
10. Machine Learning is…
Machine learning is programming computers to optimize a
performance criterion using example data or past experience.
-- Ethem Alpaydin
The goal of machine learning is to develop methods that can
automatically detect patterns in data, and then to use the uncovered
patterns to predict future data or other outcomes of interest.
-- Kevin P. Murphy
The field of pattern recognition is concerned with the automatic
discovery of regularities in data through the use of computer
algorithms and with the use of these regularities to take actions.
-- Christopher M. Bishop
12. Machine Learning is…
Machine learning is about predicting the future based on
the past.
-- Hal Daume III
[Diagram: past training data is used to fit a model/predictor;
the model/predictor is then applied to future testing data]
14. Training and testing
[Diagram: the training set (observed) is drawn from the universal set
(unobserved) via data acquisition; the testing set (unobserved)
represents practical usage]
15. Training and testing
Training is the process of making the system able to learn.
No free lunch rule:
Training set and testing set come from the same distribution
Need to make some assumptions or introduce bias
16. Performance
There are several factors affecting the performance:
Types of training provided
The form and extent of any initial background
knowledge
The type of feedback provided
The learning algorithms used
Two important factors:
Modeling
Optimization
17. Algorithms
The success of a machine learning system also
depends on the algorithms.
The algorithms control the search to find and
build the knowledge structures.
The learning algorithms should extract useful
information from training examples.
18. Algorithms
Supervised learning
Prediction
Classification (discrete labels), Regression (real values)
Unsupervised learning
Clustering
Probability distribution estimation
Finding association (in features)
Dimension reduction
Semi-supervised learning
Reinforcement learning
Decision making (robot, chess machine)
28. Types of learning tasks
Supervised learning
Learn to predict output when given an input vector
Who provides the correct answer?
Reinforcement learning
Learn action to maximize payoff
Not much information in a payoff signal
Payoff is often delayed
Reinforcement learning is an important area that will not be
covered in this course.
Unsupervised learning
Create an internal representation of the input e.g. form clusters;
extract features
How do we know if a representation is good?
This is the new frontier of machine learning because most big
datasets do not come with labels.
30. Classification Applications
Face recognition
Character recognition
Spam detection
Medical diagnosis: From symptoms to illnesses
Biometrics: Recognition/authentication using physical
and/or behavioral characteristics: Face, iris, signature, etc
...
38. Unsupervised learning applications
❑learn clusters/groups without any label
❑customer segmentation (i.e. grouping)
❑image compression
❑bioinformatics: learn motifs
❑…
39. Reinforcement learning
[Diagram: action sequences labeled with outcomes, e.g.
“left, right, straight, left, left, left, straight” → GOOD (reward 18.5)
“left, straight, straight, left, right, straight, straight” → BAD (reward -3)]
Given a sequence of examples/states and a reward after
completing that sequence, learn to predict the action to take
for an individual example/state
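The setting above can be sketched with Q-learning, one standard reinforcement-learning algorithm. The toy chain environment, rewards, and hyperparameters below are illustrative assumptions, not from the slides; they only demonstrate how a delayed reward gets propagated back to earlier states:

```python
import random

# Toy deterministic chain: states 0..4, actions 0 (left) / 1 (right).
# The reward is delayed: +1 only on reaching state 4, 0 elsewhere.
def step(state, action):
    nxt = max(0, state - 1) if action == 0 else min(4, state + 1)
    reward = 1.0 if nxt == 4 else 0.0
    return nxt, reward, nxt == 4

def q_learning(episodes=500, alpha=0.5, gamma=0.9, eps=0.2, seed=0):
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(5)]        # Q[state][action]
    for _ in range(episodes):
        s = rng.randrange(4)                  # exploring starts speed up learning
        for _ in range(50):                   # cap episode length
            if rng.random() < eps:
                a = rng.randrange(2)          # explore
            else:
                a = max((0, 1), key=lambda x: Q[s][x])  # exploit
            s2, r, done = step(s, a)
            # Temporal-difference update propagates the delayed reward backwards.
            Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
            s = s2
            if done:
                break
    return Q

Q = q_learning()
# The learned greedy policy should prefer moving right in every non-terminal state.
policy = [max((0, 1), key=lambda a: Q[s][a]) for s in range(4)]
print(policy)
```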
40. Reinforcement learning example
Backgammon
[Diagram: move sequences ending in … WIN! or … LOSE!]
Given sequences of moves and whether or not the
player won at the end, learn to make good moves
42. Other learning variations
What data is available:
Supervised, unsupervised, reinforcement learning
semi-supervised, active learning, …
How are we getting the data:
online vs. offline learning
Type of model:
generative vs. discriminative
parametric vs. non-parametric
45. What are we seeking?
Supervised: Low E-out, or maximize probabilistic terms
Unsupervised: Minimum quantization error, minimum distance,
MAP, MLE (maximum likelihood estimation)
E-in: error on the training set
E-out: error on the testing set
51. Learning techniques
Support vector machine (SVM):
Linear to nonlinear (the non-linear case): feature transform and kernel function
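The “linear to nonlinear” idea can be made concrete with a small sketch: a hand-picked feature transform makes XOR linearly separable, and an RBF kernel computes inner products in such a lifted space without materializing it. The data, transform, and weights below are illustrative assumptions:

```python
import numpy as np

# XOR is not linearly separable in 2-D, but becomes separable after a
# simple feature transform phi(x) = (x1, x2, x1*x2) -- the same idea a
# kernel function applies implicitly.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([-1, 1, 1, -1])      # XOR labels

def phi(X):
    return np.column_stack([X[:, 0], X[:, 1], X[:, 0] * X[:, 1]])

# In the lifted space, w = (1, 1, -2) with b = -0.5 separates XOR perfectly.
w, b = np.array([1., 1., -2.]), -0.5
pred = np.sign(phi(X) @ w + b)    # matches y on all four points

# An RBF kernel computes inner products in a lifted space implicitly:
# k(u, v) = exp(-gamma * ||u - v||^2).
def rbf_kernel(U, V, gamma=1.0):
    d2 = ((U[:, None, :] - V[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

K = rbf_kernel(X, X)              # 4x4 kernel (Gram) matrix
```

An SVM trained with such a kernel never needs the explicit transform; it only ever evaluates k(u, v).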
52. Learning techniques
Unsupervised learning categories and techniques
Clustering
K-means clustering
Spectral clustering
Density estimation
Gaussian mixture model (GMM)
Graphical models
Dimensionality reduction
Principal component analysis (PCA)
Factor analysis
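As an illustration of the clustering techniques listed above, here is a minimal K-means sketch in NumPy. The farthest-point initialization and toy two-blob data are assumptions chosen to keep the example deterministic:

```python
import numpy as np

# Minimal K-means: alternate between assigning points to their nearest
# center and recomputing each center as the mean of its points.
def kmeans(X, k, iters=100):
    # Farthest-point initialization keeps this sketch deterministic.
    centers = [X[0]]
    for _ in range(k - 1):
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(iters):
        # Assignment step: nearest center (minimum quantization error).
        labels = np.argmin(((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1), axis=1)
        # Update step: each center becomes the mean of its assigned points.
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
                        for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return centers, labels

# Two well-separated blobs should be recovered as two clusters.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.3, (50, 2)), rng.normal(5.0, 0.3, (50, 2))])
centers, labels = kmeans(X, 2)
```

The same loop underlies color quantization for image compression: pixels are the points and the k centers become the palette.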
53. Applications
Face detection
Object detection and recognition
Image segmentation
Multimedia event detection
Economic and commercial usage
54. A classic example of a task that requires machine learning:
It is very hard to say what makes a handwritten digit a “2”
55. Some more examples of tasks that are best solved by
using a learning algorithm
Recognizing patterns:
Facial identities or facial expressions
Handwritten or spoken words
Medical images
Generating patterns:
Generating images or motion sequences (demo)
Recognizing anomalies:
Unusual sequences of credit card transactions
Unusual patterns of sensor readings in a nuclear power plant or
unusual sound in your car engine.
Prediction:
Future stock prices or currency exchange rates
56. Some web-based examples of machine learning
The web contains a lot of data. Tasks with very big datasets often
use machine learning,
especially if the data is noisy or non-stationary.
Spam filtering, fraud detection:
The enemy adapts so we must adapt too.
Recommendation systems:
Lots of noisy data. Million dollar prize!
Information retrieval:
Find documents or images with similar content.
Data Visualization:
Display a huge database in a revealing way (demo)
57. Displaying the structure of a set of documents
using Latent Semantic Analysis (a form of PCA)
Each document is converted to a
vector of word counts. This
vector is then mapped to two
coordinates and displayed as a
colored dot. The colors
represent the hand-labeled
classes.
When the documents are laid
out in 2-D, the classes are not
used. So we can judge how good
the algorithm is by seeing if the
classes are separated.
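The pipeline described above (word counts → two coordinates) can be sketched with a truncated SVD, the PCA-like step behind Latent Semantic Analysis. The tiny four-document corpus below is an illustrative assumption:

```python
import numpy as np

# Each document becomes a vector of word counts; a truncated SVD then
# maps each vector to two coordinates suitable for plotting.
docs = ["machine learning data model",
        "learning model data training",
        "garden plants seeds soil",
        "soil seeds garden water"]
vocab = sorted({w for d in docs for w in d.split()})
counts = np.array([[d.split().count(w) for w in vocab] for d in docs], dtype=float)

# Center the counts, then project onto the top-2 right singular vectors.
Xc = counts - counts.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
coords = Xc @ Vt[:2].T     # one 2-D point per document
```

Documents on the same topic end up near each other in the 2-D layout even though no class labels were used, which is exactly how the slide judges the algorithm.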
59. Machine Learning & Symbolic AI
Knowledge Representation works with facts/assertions and
develops rules of logical inference. The rules can handle
quantifiers. Learning and uncertainty are usually ignored.
Expert Systems used logical rules or conditional probabilities
provided by “experts” for specific domains.
Graphical Models treat uncertainty properly and allow learning
(but they often ignore quantifiers and use a fixed set of variables)
Set of logical assertions → values of a subset of the variables
and local models of the probabilistic interactions between
variables.
Logical inference → probability distributions over subsets of
the unobserved variables (or individual ones)
Learning = refining the local models of the interactions.
60. Machine Learning & Statistics
A lot of machine learning is just a rediscovery of things that
statisticians already knew. This is often disguised by differences
in terminology:
Ridge regression = weight-decay
Fitting = learning
Held-out data = test data
But the emphasis is very different:
A good piece of statistics: Clever proof that a relatively
simple estimation procedure is asymptotically unbiased.
A good piece of machine learning: Demonstration that a
complicated algorithm produces impressive results on a
specific task.
Data-mining: Using very simple machine learning techniques
on very large databases because computers are too slow to
do anything more interesting with ten billion examples.
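The “ridge regression = weight-decay” correspondence can be made concrete with the closed form w = (XᵀX + λI)⁻¹Xᵀy: the λ‖w‖² penalty shrinks (“decays”) the weights. The synthetic data below is an illustrative assumption:

```python
import numpy as np

# Ridge regression: minimize ||Xw - y||^2 + lam * ||w||^2.
# Closed form: w = (X^T X + lam * I)^{-1} X^T y.
def ridge(X, y, lam=1.0):
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
w_true = np.array([2.0, -1.0, 0.5])
y = X @ w_true + rng.normal(scale=0.1, size=100)

w_ols = ridge(X, y, lam=0.0)     # lam = 0 recovers ordinary least squares
w_reg = ridge(X, y, lam=10.0)    # larger lam decays the weights toward zero
```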
61. A spectrum of machine learning tasks
Statistics end:
Low-dimensional data (e.g. fewer than 100 dimensions)
Lots of noise in the data
There is not much structure in the data, and what structure
there is can be represented by a fairly simple model.
The main problem is distinguishing true structure from noise.
Artificial Intelligence end:
High-dimensional data (e.g. more than 100 dimensions)
The noise is not sufficient to obscure the structure in the data
if we process it right.
There is a huge amount of structure in the data, but the
structure is too complicated to be represented by a simple model.
The main problem is figuring out a way to represent the
complicated structure that allows it to be learned.
Statistics ------------------------------------ Artificial Intelligence
62. So What Is Machine Learning?
Automating automation
Getting computers to program themselves
Writing software is the bottleneck
Let the data do the work instead!
64. Magic?
No, more like gardening
Seeds = Algorithms
Nutrients = Data
Gardener = you
Plants = Programs
65. Sample Applications
Web search
Computational biology
Finance
E-commerce
Space exploration
Robotics
Information extraction
Social networks
Debugging
[Your favorite area]
66. Growth of Machine Learning
Machine learning is the preferred approach to
Speech recognition, Natural language processing
Computer vision
Medical outcomes analysis
Robot control
Computational biology
This trend is accelerating
Improved machine learning algorithms
Improved data capture, networking, faster computers
Software too complex to write by hand
New sensors / IO devices
Demand for self-customization to user, environment
It turns out to be difficult to extract knowledge from human experts → the
failure of expert systems in the 1980s.
68. Learning Associations
Basket analysis:
P(Y | X): the probability that somebody who buys X
also buys Y, where X and Y are products/services.
Example: P(chips | beer) = 0.7
Market-Basket transactions
TID Items
1 Bread, Milk
2 Bread, Diaper, Beer, Eggs
3 Milk, Diaper, Beer, Coke
4 Bread, Milk, Diaper, Beer
5 Bread, Milk, Diaper, Coke
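The conditional probability P(Y | X) can be estimated directly from the five transactions in the table above, by counting co-occurrences:

```python
# P(Y | X) = (# baskets containing both X and Y) / (# baskets containing X),
# using the five market-basket transactions from the slide.
baskets = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]

def cond_prob(y, x, baskets):
    with_x = [b for b in baskets if x in b]
    return sum(y in b for b in with_x) / len(with_x)

print(cond_prob("Beer", "Diaper", baskets))  # 3 of 4 Diaper baskets contain Beer -> 0.75
```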
69. Classification
Example: Credit scoring
Differentiating between low-risk and high-risk customers
from their income and savings
Discriminant: IF income > θ1 AND savings > θ2
THEN low-risk ELSE high-risk
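The discriminant above translates directly into code. The threshold values for θ1 and θ2 below are assumed purely for illustration; in practice they would be fit from labeled training data:

```python
# The slide's credit-scoring discriminant as code.
# THETA1 and THETA2 are hypothetical thresholds, assumed for illustration.
THETA1, THETA2 = 30_000.0, 10_000.0

def credit_risk(income, savings):
    # IF income > theta1 AND savings > theta2 THEN low-risk ELSE high-risk
    if income > THETA1 and savings > THETA2:
        return "low-risk"
    return "high-risk"

print(credit_risk(50_000, 20_000))  # low-risk
print(credit_risk(50_000, 5_000))   # high-risk
```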
70. Classification: Applications
Aka pattern recognition
Face recognition: Pose, lighting, occlusion (glasses, beard),
make-up, hair style
Character recognition: Different handwriting styles.
Speech recognition: Temporal dependency.
Use of a dictionary or the syntax of the language.
Sensor fusion: Combine multiple modalities; e.g., visual
(lip image) and acoustic for speech
Medical diagnosis: From symptoms to illnesses
Web advertising: Predict if a user clicks on an ad on the
Internet.
74. Supervised Learning: Uses
Prediction of future cases: Use the rule to
predict the output for future inputs
Knowledge extraction: The rule is easy to
understand
Compression: The rule is simpler than the
data it explains
Outlier detection: Exceptions that are not
covered by the rule, e.g., fraud
Example: decision-tree tools that create rules
75. Unsupervised Learning
Learning “what normally happens”
No output
Clustering: Grouping similar instances
Other applications: Summarization, Association
Analysis
Example applications
Customer segmentation in CRM
Image compression: Color quantization
Bioinformatics: Learning motifs
76. Reinforcement Learning
Topics:
Policies: what actions should an agent take in a particular
situation
Utility estimation: how good is a state (→ used by the policy)
No supervised output, but a delayed reward
Credit assignment problem (what was responsible for the
outcome)
Applications:
Game playing
Robot in a maze
Multiple agents, partial observability, ...
78. Resources: Journals
Journal of Machine Learning Research
www.jmlr.org
Machine Learning
IEEE Transactions on Neural Networks
IEEE Transactions on Pattern Analysis and Machine
Intelligence
Annals of Statistics
Journal of the American Statistical Association
...
79. Resources: Conferences
International Conference on Machine Learning (ICML)
European Conference on Machine Learning (ECML)
Neural Information Processing Systems (NIPS)
Computational Learning Theory (COLT)
International Joint Conference on Artificial Intelligence
(IJCAI)
ACM SIGKDD Conference on Knowledge Discovery and
Data Mining (KDD)
IEEE Int. Conf. on Data Mining (ICDM)
80. ML in a Nutshell
Tens of thousands of machine learning
algorithms
Hundreds of new ones every year
Every machine learning algorithm has three
components:
Representation
Evaluation
Optimization
81. Representation
Decision trees
Sets of rules / Logic programs
Instances
Graphical models (Bayes/Markov nets)
Neural networks
Support vector machines
Model ensembles
Etc.
82. Evaluation
Accuracy
Precision and recall
Squared error
Likelihood
Posterior probability
Cost / Utility
Margin
Entropy
K-L divergence
Etc.
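Several of the evaluation criteria listed above (accuracy, precision, recall) can be computed directly from predicted vs. true labels. A minimal sketch with made-up binary labels (1 = positive class):

```python
# Compute accuracy, precision, and recall from true vs. predicted labels.
def evaluate(y_true, y_pred):
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))  # true positives
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))  # false positives
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))  # false negatives
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return {
        "accuracy": correct / len(y_true),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }

m = evaluate([1, 1, 0, 0, 1, 0], [1, 0, 0, 1, 1, 0])
print(m)   # accuracy 4/6, precision 2/3, recall 2/3
```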
84. Conclusion
We have given a brief overview of some
techniques and algorithms in machine learning.
Furthermore, more and more applications
apply machine learning as a solution.
In the future, machine learning will play an
important role in our daily lives.