2. Outline
• Introduction
• Examples
• What does it mean to learn?
• Supervised and Unsupervised Learning
• Types of Learning
• Classification Problem
• Text Mining Example
• Conclusions (and further reading)
4. What is Machine Learning?
• A branch of artificial
intelligence (AI)
• Arthur Samuel (1959)
Field of study that gives
computers the ability to
learn without being explicitly
programmed
From: Andrew Ng – Stanford Machine Learning Classes
http://www.youtube.com/watch?v=UzxYlbK2c7E
4 09-11-2011
5. What is Machine Learning?
• Tom Mitchell (1998) Well-posed Learning
Problem:
A computer program is said to learn from
experience E with respect to some class of
tasks T and performance measure P, if its
performance at tasks in T, as measured by P,
improves with experience E.
• Mark Dredze
Teaching a computer about the world
6. What is Machine Learning?
• Goal:
Design and development of algorithms that allow
computers to evolve behaviors based on
empirical data, such as from sensor data or
databases
• How to apply machine learning?
• Observe the world
• Develop models that match observations
• Teach computer to learn these models
• Computer applies learned model to the world
7. Example 1:
Prediction of House Price
From: Andrew Ng – Stanford Machine Learning Classes
http://www.youtube.com/watch?v=UzxYlbK2c7E
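As a hedged illustration of house-price prediction, the sketch below fits a straight line by ordinary least squares. The sizes and prices are invented for illustration and do not come from the cited lecture; `fit_line` is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x (closed-form solution)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx            # slope
    a = mean_y - b * mean_x  # intercept
    return a, b

# Hypothetical training data: size in square metres, price in thousands.
sizes = [50.0, 80.0, 100.0, 120.0]
prices = [150.0, 240.0, 300.0, 360.0]

a, b = fit_line(sizes, prices)
predicted = a + b * 90.0     # price estimate for an unseen 90 m2 house
print(f"intercept={a:.2f}, slope={b:.2f}, predicted price={predicted:.1f}")
```

The learned model is the pair (a, b); applying it to a new size is the "computer applies learned model to the world" step.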
8. Example 2:
Learning to automatically classify text documents
From: http://www.xcellerateit.com/
9. Example 3:
Face Detection and Tracking
http://www.micc.unifi.it/projects/optimal-face-detection-and-tracking/
10. Example 4:
Social Network Mining
[Figure: users U1 to U5 linked by Users' Profile, Friendship, Group, and Network information; Hidden Information?]
From: Exploit of Online Social Networks with Community-Based
Graph Semi-Supervised Learning, Mingzhen Mo and Irwin King,
ICONIP 2010, Sydney, Australia
13. What does it mean to learn?
• Learn patterns in data
z → [Decision System] → ẋ
z : observed signal
ẋ : estimated output
14. Unsupervised Learning
• Look for patterns in data
• No labeled training data (no examples of output)
• Pro:
• No labeling of examples for output
• Con:
• Cannot demonstrate specific types of output
• Applications:
• Data mining
• Finds interesting patterns in data
From: Mark Dredze
Machine Learning - Finding Patterns in the World
15. Supervised Learning
• Learn patterns to simulate given output
• Pro:
• Can learn complex patterns
• Good performance
• Con:
• Requires many labeled examples (a known output for each input example)
• Applications:
• Classification
• Sorts data into predefined groups
From: Mark Dredze
Machine Learning - Finding Patterns in the World
16. Types of Learning: Output
• Classification
• Binary, multi‐class, multi‐label, hierarchical, etc.
• Classify email as spam
• Loss: 0/1 loss (error rate, i.e. 1 − accuracy)
• Ranking
• Order examples by preference
• Rank results of web search
• Loss: Swapped pairs
• Regression
• Real‐valued output
• Predict tomorrow’s stock price
• Loss: Squared loss
• Structured prediction
• Sequences, trees, segmentation
• Find faces in an image
• Loss: Precision/Recall of faces
From: Mark Dredze
Machine Learning - Finding Patterns in the World
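The loss measures listed for each output type can be sketched in a few lines. This is a minimal illustration with hypothetical function names and made-up inputs, not code from the cited material.

```python
def zero_one_loss(y_true, y_pred):
    """Classification: fraction of wrong labels (1 - accuracy)."""
    return sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)

def squared_loss(y_true, y_pred):
    """Regression: mean squared difference between target and prediction."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def swapped_pairs(true_order, predicted_order):
    """Ranking: number of item pairs placed in the opposite order."""
    pos = {item: i for i, item in enumerate(predicted_order)}
    swaps = 0
    for i in range(len(true_order)):
        for j in range(i + 1, len(true_order)):
            if pos[true_order[i]] > pos[true_order[j]]:
                swaps += 1
    return swaps

def precision_recall(true_items, predicted_items):
    """Structured prediction: e.g. true faces vs. detected faces."""
    true_set, pred_set = set(true_items), set(predicted_items)
    hits = len(true_set & pred_set)
    precision = hits / len(pred_set) if pred_set else 0.0
    recall = hits / len(true_set) if true_set else 0.0
    return precision, recall

err = zero_one_loss(["spam", "ham", "spam", "ham"], ["spam", "spam", "spam", "ham"])
mse = squared_loss([10.0, 12.0], [11.0, 10.0])
swaps = swapped_pairs(["a", "b", "c"], ["b", "a", "c"])
prec, rec = precision_recall({"f1", "f2", "f3"}, {"f1", "f2", "f4", "f5"})
print(err, mse, swaps, prec, rec)
```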
17. Classification Problem
• Classical Architecture
z → [Feature Extraction] → y → [Classification] → ẋ
z : observed signal
y : feature vector (pattern), y ∈ S
ẋ : estimated output (class), ẋ ∈ {1, 2, …, c}
18. Classification Problem
• Example with 1 feature
• Problem: classify people as non-obese or obese by
observing their weight (only 1 feature)
• Is it possible to classify without making any
mistakes?
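The question can be explored with a small sketch: a single threshold on weight is tried at every observed value, and the number of mistakes is counted. The weights below are invented and deliberately overlap between the two classes; `count_errors` is a hypothetical helper.

```python
def count_errors(threshold, non_obese, obese):
    """Rule under test: predict 'obese' when weight >= threshold."""
    false_obese = sum(w >= threshold for w in non_obese)
    false_non_obese = sum(w < threshold for w in obese)
    return false_obese + false_non_obese

# Hypothetical weights in kg; the two classes overlap on purpose.
non_obese = [55, 62, 70, 78, 85, 95]
obese = [80, 90, 100, 110, 120]

candidates = sorted(non_obese + obese)
best = min(candidates, key=lambda t: count_errors(t, non_obese, obese))
min_errors = count_errors(best, non_obese, obese)
print(f"best threshold={best} kg, mistakes={min_errors}")
```

With overlapping classes, no threshold on this one feature reaches zero mistakes, which is the point the slide raises.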
19. Classification Problem
• Example with 2 features
z → [Feature Extraction] → y = {weight, height} → [Classification] → ẋ = non-obese or obese
z : observed signal
y : feature vector (pattern), y ∈ S
ẋ : estimated output (class), ẋ ∈ {1: non-obese, 2: obese}
20. Classification Problem
• Example with 2 features
• Problem: classify people as non-obese or obese by
observing their weight and height
• Now the decision appears simpler!
21. Classification Problem
• Example with 2 features
• Problem: classify people as non-obese or obese by
observing their weight and height
• Decision regions: R1 : non-obese; R2 : obese
22. Classification Problem
• Decision Regions
• Goal of the classifier: define a partition of the feature space into
c disjoint regions, called decision regions: R1, R2, …, Rc
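A decision rule over two features can be sketched as a function that maps a feature vector to a region index. The slides do not state the boundary between R1 and R2; the sketch below assumes the body mass index rule (BMI = weight / height², obese at 30 or above) purely as an illustration, and `decision_region` is a hypothetical name.

```python
def decision_region(weight_kg, height_m):
    """Return 1 (R1: non-obese) or 2 (R2: obese) for y = (weight, height).

    Assumed boundary: BMI = weight / height**2 >= 30 means obese.
    """
    bmi = weight_kg / height_m ** 2
    return 2 if bmi >= 30.0 else 1

# Same weight, different height: the second feature separates the cases.
tall = decision_region(90.0, 1.90)
short = decision_region(90.0, 1.60)
print(tall, short)
```

Every point of the (weight, height) plane falls in exactly one region, so R1 and R2 form a partition of the feature space.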
24. Text Mining Process
Adapted from: Introduction to Text Mining,
Yair Even-Zohar, University of Illinois
25. Text Mining Process
• Text preprocessing
• Syntactic/Semantic text analysis
• Features Generation
• Bag of words
• Features Selection
• Simple counting
• Statistics
• Text/Data Mining
• Classification: supervised learning
• Clustering: unsupervised learning
• Analyzing results
26. Syntactic / Semantic text analysis
• Part-of-speech (POS) tagging
• Find the corresponding POS for each word
e.g., John (noun) gave (verb) the (det) ball (noun)
• Word sense disambiguation
• Context based or proximity based
• Parsing
• Generates a parse tree (graph) for each sentence
• Each sentence is a standalone graph
27. Feature Generation: Bag of words
• Text document is represented by the words it
contains (and their occurrences)
• e.g., “Lord of the rings” → {“the”, “Lord”, “rings”, “of”}
• Highly efficient
• Makes learning far simpler and easier
• Order of words is not that important for certain applications
• Stemming: identifies a word by its root
• e.g., flying, flew → fly
• Reduce dimensionality
• Stop words: The most common words are unlikely
to help text mining
• e.g., “the”, “a”, “an”, “you” …
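The bag-of-words steps above can be sketched with the standard library only. The stop-word list is the one given on the slide; `naive_stem` is a deliberately crude, hypothetical suffix stripper and is not a real stemming algorithm such as Porter's.

```python
import re
from collections import Counter

STOP_WORDS = {"the", "a", "an", "you"}  # the slide's examples

def naive_stem(word):
    """Crude suffix stripping, for illustration only."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def bag_of_words(text):
    """Lowercase, tokenize, drop stop words, stem, and count occurrences."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return Counter(naive_stem(t) for t in tokens if t not in STOP_WORDS)

bag = bag_of_words("The Lord of the Rings: the rings of power")
print(bag)
```

Word order is discarded and only counts remain. Suffix stripping maps "flying" to "fly", but an irregular form such as "flew" is left unchanged; handling it needs a dictionary-based approach.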
28. Example
Original text:
Hi,
Here is your weekly update (that unfortunately hasn't gone out in about a month). Not much action here right now.
1) Due to the unwavering insistence of a member of the group, the ncsa.d2k.modules.core.datatype package is now completely independent of the d2k application.
2) Transformations are now handled differently in Tables. Previously, transformations were done using a TransformationModule. That module could then be added to a list that an ExampleTable kept. Now, there is an interface called Transformation and a sub-interface called ReversibleTransformation.

After lowercasing and stop-word removal:
hi, weekly update (that unfortunately gone out month). much action here right now. 1) due unwavering insistence member group, ncsa.d2k.modules.core.datatype package now completely independent d2k application. 2) transformations now handled differently tables. previously, transformations done using transformationmodule. module added list exampletable kept. now, interface called transformation sub-interface called reversibletransformation.

After stemming and punctuation removal:
hi week update unfortunate go out month much action here right now 1 due unwaver insistence member group ncsa d2k modules core datatype package now complete independence d2k application 2 transformation now handle different table previous transformation do use transformationmodule module add list exampletable keep now interface call transformation sub-interface call reversibletransformation
29. Feature Generation: Weighting
• Term Frequency (term ti, document dj)
• Inverse Document Frequency
• TF-IDF
[Figure: a sample text and its bag of words with term counts]
Sample text: Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Praesent et quam sit amet diam porttitor iaculis. Vestibulum ante ipsum primis in faucibus orci luctus et ultrices posuere cubilia Curae;
Bag of Words: Lorem 1, dolor 1, Praesent 1, iaculis 1, Vestibulum 1, ipsum 2, consectetuer 2
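The slide's formulas did not survive in this transcript, so the sketch below assumes one common variant: term frequency is the count of term ti in document dj divided by the length of dj, inverse document frequency is log(N / df) with N documents and df documents containing the term, and TF-IDF is their product. The toy documents and the `tf_idf` name are hypothetical.

```python
import math
from collections import Counter

def tf_idf(docs):
    """docs: list of token lists. Returns one {term: weight} dict per document."""
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each term once per document
    weights = []
    for tokens in docs:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights

docs = [
    ["lorem", "ipsum", "dolor"],
    ["ipsum", "ipsum", "amet"],
    ["ipsum", "vestibulum", "ante"],
]
weights = tf_idf(docs)
print(weights[0])
```

A term that occurs in every document ("ipsum" here) gets weight zero under this variant, while a term confined to one document is weighted up.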
31. Feature Selection
• Reduce dimensionality
• Learners have difficulty addressing tasks with high
dimensionality
• Irrelevant features
• Not all features help!
• e.g., the existence of a noun in a news
article is unlikely to help classify it as
“politics” or “sport”
• Stop Words Removal
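One simple way to reduce dimensionality by counting is to keep only terms that appear in a minimum number of documents and are not stop words. This is a sketch under that assumption; the threshold `min_df`, the toy documents, and the `select_features` name are illustrative, not taken from the slides.

```python
from collections import Counter

def select_features(docs, min_df=2, stop_words=frozenset()):
    """Keep terms found in at least min_df documents, excluding stop words."""
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))
    return sorted(
        term for term, df in doc_freq.items()
        if df >= min_df and term not in stop_words
    )

docs = [
    ["the", "team", "won", "the", "match"],
    ["the", "vote", "won", "support"],
    ["the", "team", "lost", "the", "vote"],
]
selected = select_features(docs, min_df=2, stop_words={"the"})
print(selected)
```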
32. Example
[Figure: the stemmed word list from the previous example, reduced in two steps]
Word list 1: hi, week, update, unfortunate, go, out, month, much, action, here, right, now, 1, due, unwaver, insistence, member, group, ncsa, d2k, modules, core, datatype, package, complete, independence, application, 2, transformation, handle, different, table, previous, do, use, transformationmodule, add, list, exampletable, keep, interface, call, sub-interface, reversibletransformation
Word list 2: hi, week, update, unfortunate, go, out, month, much, action, here, right, now, due, insistence, member, group, ncsa, d2k, modules, core, datatype, package, complete, independence, application, transformation, handle, different, table, previous, do, use, add, list, keep, interface, call, sub-interface
Word list 3: hi, week, update, unfortunate, month, action, right, due, insistence, member, group, ncsa, d2k, modules, core, datatype, package, complete, independence, application, transformation, handle, different, table, previous, add, list, interface, call, sub-interface
34. Text Mining: Classification definition
• Given: a collection of labeled records
(training set)
• Each record contains a set of features (attributes), and
the true class (label)
• Find: a model for the class as a function
of the values of the features
• Goal: previously unseen records should be
assigned a class as accurately as possible
• A test set is used to determine the accuracy of the
model. Usually, the given data set is divided into training
and test sets, with training set used to build the model
and test set used to validate it
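The train/test procedure can be sketched with a 1-nearest-neighbour classifier, chosen here only because it is short; the slides do not prescribe a particular model. The records (two word counts per document) and all function names are hypothetical.

```python
def nearest_neighbour(train, x):
    """Return the label of the training record closest to feature vector x."""
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(train, key=lambda rec: sq_dist(rec[0], x))[1]

def accuracy(train, test):
    """Fraction of test records whose predicted class matches the true label."""
    correct = sum(nearest_neighbour(train, x) == label for x, label in test)
    return correct / len(test)

# Hypothetical labeled records: (count of "goal", count of "election") -> class
records = [
    ((5, 0), "sport"), ((4, 1), "sport"), ((0, 6), "politics"),
    ((1, 4), "politics"), ((6, 1), "sport"), ((0, 3), "politics"),
]
train, test = records[:4], records[4:]  # build on train, validate on test
acc = accuracy(train, test)
print(f"test accuracy = {acc:.2f}")
```

The test records are never used to build the model, so the reported accuracy estimates performance on previously unseen records.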
35. Text Mining: Clustering definition
• Given: a set of documents and a similarity
measure among documents
• Find: clusters such that:
• Documents in one cluster are more similar to one another
• Documents in separate clusters are less similar to one another
• Goal:
• Find a correct grouping of the documents
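As a rough sketch of clustering from a similarity measure, the code below uses Jaccard similarity between word sets and a greedy single pass: each document joins the first cluster whose first member is similar enough, otherwise it starts a new cluster. Both the measure and the greedy pass are assumptions made for brevity, not the method of the slides, and the threshold is arbitrary.

```python
def jaccard(a, b):
    """Similarity of two documents as overlap of their word sets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 1.0

def greedy_clusters(docs, threshold=0.3):
    """Single pass: compare each document with the first member of each cluster."""
    clusters = []  # each cluster is a list of document indices
    for i, doc in enumerate(docs):
        for cluster in clusters:
            if jaccard(docs[cluster[0]], doc) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])  # no similar cluster found
    return clusters

docs = [
    ["team", "match", "goal"],
    ["vote", "election", "party"],
    ["team", "goal", "win"],
    ["election", "vote", "result"],
]
clusters = greedy_clusters(docs)
print(clusters)
```

No labels are given: the groups emerge only from which documents share words.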
36. Supervised vs. Unsupervised Learning
• Supervised learning (classification)
• Supervision: The training data (observations,
measurements, etc.) are accompanied by labels
indicating the class of the observations
• New data is classified based on the training set
• Unsupervised learning (clustering)
• The class labels of the training data are unknown
• Given a set of measurements, observations, etc., the
aim is to establish the existence of classes or clusters in
the data
38. Readings
• Survey Books in Machine Learning
• The Elements of Statistical Learning
• Hastie, Tibshirani, Friedman
• Pattern Recognition and Machine Learning
• Bishop
• Machine Learning
• Mitchell
• Questions?
39. ACKNOWLEDGEMENTS
• ISEL – DEETC
• Final year and MSc supervised students (Tony Tam, ...)
• Students of Digital Signal Processing
• Artur Ferreira
• Instituto Telecomunicações (IT)
David Coutinho, Hugo Silva, Ana Fred, Mário Figueiredo
• Fundação para a Ciência e Tecnologia (FCT)
40. www.it.pt
Thank you for your attention!
André Ribeiro Lourenço
Mail to: alourenco@deetc.isel.ipl.pt
arlourenco@gmail.com