This document discusses support vector machines (SVMs) and provides an example of using an SVM for classification. It begins with common applications of SVMs, such as face detection and image classification. It then gives an overview of the method, explaining how an SVM finds the optimal separating hyperplane between two classes by maximizing the margin between them. An example demonstrates this by classifying people as male or female based on height and weight data. It also discusses how kernels can be used to handle data that is not linearly separable. The document concludes with a scikit-learn implementation that separates two synthetic clusters of points, standing in for crocodiles and alligators at a zoo.
8. Agenda
Why Support Vector Machine?
What is Support Vector Machine?
Understanding Support Vector Machine
Advantages of Support Vector Machine
What is Machine learning?
Use Case in Python
9. What is Machine learning?
Machine Learning
• Supervised Learning
• Unsupervised Learning
• Reinforcement Learning
10. What is Machine learning?
Supervised Learning
A machine learning model learns from past input data and predicts the output for future data.
[Illustration: the model is taught with a labeled strawberry; once the model is trained, it responds “Got it!”]
11. What is Machine learning?
Machine Learning
• Supervised Learning
  • Classification
  • Regression
• Unsupervised Learning
• Reinforcement Learning
Support Vector Machine is a supervised learning method used for classification and regression.
12. Why Support Vector Machine?
Fruit Shop
Last week, my son and I visited a fruit shop.
13. Why Support Vector Machine?
Fruit Shop
There he found a fruit that looked similar to both an apple and a strawberry.
Dad, is that an apple or a strawberry?
14. Why Support Vector Machine?
Fruit Shop
After a couple of seconds, he figured out that it was a strawberry.
It is a strawberry!
15. Why Support Vector Machine?
[Diagram: past labeled data is used for model training; the trained model receives new data and produces a prediction as output, here “It’s a strawberry”]
Why not build a model that can predict the label of unknown data?
16. Why Support Vector Machine?
This is what a Support Vector Machine does.
17. Why Support Vector Machine?
SVM is a supervised learning method that looks at data and sorts it into one of two categories.
18. Why Support Vector Machine?
But how does prediction work?
19. Why Support Vector Machine?
Predicting the unknown with SVM
[Diagram: the labeled sample data is plotted and the support vector machine draws a decision boundary; the new fruit data (an unlabeled sample) is then plotted against that boundary to give the output, “It’s a strawberry”]
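The train-then-predict flow described above can be sketched with scikit-learn, the library used later in this deck. The two features (fruit weight in grams and a redness score) and every value below are made-up assumptions for illustration; they are not data from the slides.

```python
import numpy as np
from sklearn import svm

# Hypothetical labeled samples: [weight in grams, redness score between 0 and 1]
X_train = np.array([
    [150.0, 0.60], [170.0, 0.55], [160.0, 0.70],   # apples
    [12.0, 0.90], [15.0, 0.95], [10.0, 0.85],      # strawberries
])
y_train = np.array(["apple", "apple", "apple",
                    "strawberry", "strawberry", "strawberry"])

# Model training on past labeled data
model = svm.SVC(kernel="linear")
model.fit(X_train, y_train)

# Prediction for a new, unlabeled sample (also hypothetical)
new_fruit = np.array([[14.0, 0.80]])
prediction = model.predict(new_fruit)[0]
print(prediction)
```

The new sample sits well inside the light, red group, so the fitted model labels it as a strawberry.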
20. What is Support Vector Machine?
Example
We are given a set of people
with different
• Height
21. What is Support Vector Machine?
Example
We are given a set of people
with different
• Height and
• Weight
22. What is Support Vector Machine?
Example
We are given a set of people with different
• Height and
• Weight
Sample data set (Female)
Height  Weight
174     65
174     88
175     75
180     65
185     80
23. What is Support Vector Machine?
Example
We are given a set of people with different
• Height and
• Weight
Sample data set (Male)
Height  Weight
179     90
180     80
183     80
187     85
182     72
24. What is Support Vector Machine?
Height
Weight
Let’s add a new data point and figure out whether it’s male or female.
25. What is Support Vector Machine?
Height
Weight
Let’s add a new data point and figure out whether it’s male or female.
A new data point
26. What is Support Vector Machine?
Height
Weight
Sure. For this task, we first need to split our data.
27. What is Support Vector Machine?
Height
Weight
We can split our data by
choosing any of these lines
Line 1 Line 2
28. What is Support Vector Machine?
Height
Weight
But to predict the gender of a
new data point we should split
the data in the best possible
way
Line 1 Line 2
29. What is Support Vector Machine?
Height
Weight
Then I would say, this line
best splits the data
Why do you say it’s the best split?
Line 1
30. What is Support Vector Machine?
Height
Weight
This line has the maximum
space that separates the
two classes
Why do you say it’s the best split?
Maximum space
Line 1
31. What is Support Vector Machine?
Height
Weight
Why do you say it’s the best split?
The other line, by contrast, doesn’t have the maximum space separating the two classes.
Line 1 Line 2
32. What is Support Vector Machine?
Height
Weight
Well, yes. This is the best split!
That is why this line best splits the data.
Line 1
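The “maximum space” argument can be checked numerically: for each candidate line, measure the distance from the line to the nearest point of either class, and prefer the line where that smallest distance is larger. The points and the two lines below are invented for illustration; they are not the values plotted on the slides.

```python
import numpy as np

def min_distance_to_line(points, w, b):
    """Smallest perpendicular distance from any point to the line w.x + b = 0."""
    w = np.asarray(w, dtype=float)
    return np.min(np.abs(points @ w + b) / np.linalg.norm(w))

# Hypothetical 2-D points for two classes
class_a = np.array([[1.0, 1.0], [2.0, 1.0], [1.0, 2.0]])
class_b = np.array([[4.0, 4.0], [5.0, 4.0], [4.0, 5.0]])
points = np.vstack([class_a, class_b])

# Two hypothetical separating lines, each written as w.x + b = 0
line_1 = (np.array([1.0, 1.0]), -5.5)   # x + y = 5.5
line_2 = (np.array([1.0, 0.0]), -2.5)   # x = 2.5

gap_1 = min_distance_to_line(points, *line_1)
gap_2 = min_distance_to_line(points, *line_2)
print(gap_1, gap_2)
```

Both lines separate the two groups, but the first leaves more room on either side, which is the property the slides call the best split.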
33. What is Support Vector Machine?
Height
Weight
Now, let me add some technical terms to this.
34. What is Support Vector Machine?
Height
Weight
We can also say that the distance between the points and the line should be as large as possible.
Points
Line
Distance margin
35. What is Support Vector Machine?
Height
Weight
In technical terms, we can say that the distance between the support vectors and the hyperplane should be as large as possible.
Support
vectors
Hyperplane
36. What is Support Vector Machine?
Height
Weight
Support vectors are the extreme points of each class, that is, the data points that lie closest to the hyperplane.
Support
vectors
Hyperplane
37. What is Support Vector Machine?
Height
Weight
And the optimal hyperplane is the one with the maximum distance to the support vectors of either class.
Support
vectors
Hyperplane
38. What is Support Vector Machine?
Height
Weight
Here, D+ is the distance from the hyperplane to the closest positive point.
D+
Hyperplane
39. What is Support Vector Machine?
Height
Weight
And D- is the distance from the hyperplane to the closest negative point.
D+
Hyperplane
D-
40. What is Support Vector Machine?
Height
Weight
The sum of D+ and D- is called the distance margin.
D+
D-
Distance
Margin
41. What is Support Vector Machine?
Height
Weight
Maximizing the distance margin gives us the optimal hyperplane.
D+
D-
Distance
Margin
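For a linear SVM whose hyperplane is written w.x + b = 0 and scaled so that the closest points satisfy |w.x + b| = 1, D+ and D- each equal 1/||w||, so the distance margin is 2/||w||. The sketch below reads these quantities off a fitted scikit-learn model. The six points are made up for illustration, and the large C is an assumption used to approximate a hard margin.

```python
import numpy as np
from sklearn import svm

# Hypothetical, clearly separable points: -1 is the negative class, +1 the positive class
X = np.array([[1.0, 1.0], [2.0, 1.0], [1.0, 2.0],
              [4.0, 4.0], [5.0, 4.0], [4.0, 5.0]])
y = np.array([-1, -1, -1, 1, 1, 1])

clf = svm.SVC(kernel="linear", C=1000)
clf.fit(X, y)

w = clf.coef_[0]        # normal vector of the hyperplane
b = clf.intercept_[0]   # offset of the hyperplane

# Signed perpendicular distance of every training point to the hyperplane
dist = (X @ w + b) / np.linalg.norm(w)

d_plus = dist[y == 1].min()        # distance to the closest positive point
d_minus = (-dist[y == -1]).min()   # distance to the closest negative point
margin = d_plus + d_minus
print(d_plus, d_minus, margin, 2 / np.linalg.norm(w))
```

Up to the solver's numerical tolerance, the two distances come out equal and their sum matches 2/||w||.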
42. What is Support Vector Machine?
Height
Weight
Based on which side of the hyperplane it falls on, we can say the new data point belongs to the male class.
D+
D-
Distance
Margin
43. What is Support Vector Machine?
Height
Weight
Based on the distance margin, we can say the new data point belongs to the male class.
D+
D-
Distance
Margin
That was so clear!
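Classifying a new point comes down to checking which side of the hyperplane it falls on. The sketch below fits a linear SVM to the ten height and weight rows from the sample tables and scores one new point. The new point (181, 84) is a hypothetical value, not one taken from the slides. The rows as transcribed are not guaranteed to be separable by a straight line, so scikit-learn's default soft margin is used here instead of a hard margin.

```python
import numpy as np
from sklearn import svm

# Height and weight rows copied from the two sample tables
female = np.array([[174, 65], [174, 88], [175, 75], [180, 65], [185, 80]], dtype=float)
male = np.array([[179, 90], [180, 80], [183, 80], [187, 85], [182, 72]], dtype=float)

X = np.vstack([female, male])
y = np.array(["Female"] * len(female) + ["Male"] * len(male))

# Linear SVM with scikit-learn's default soft margin (C=1.0)
clf = svm.SVC(kernel="linear")
clf.fit(X, y)

# Hypothetical new data point: height 181, weight 84
new_point = np.array([[181.0, 84.0]])

# With two classes, a positive score means clf.classes_[1] and a negative score clf.classes_[0]
score = clf.decision_function(new_point)[0]
label = clf.predict(new_point)[0]
print(clf.classes_, score, label)
```

The sign of the score says which side of the hyperplane the point is on, and its magnitude grows as the point moves away from the boundary.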
44. What is Support Vector Machine?
Height
Weight
D+
D-
Distance
Margin
But what happens if a
hyperplane is not
optimal?
45. What is Support Vector Machine?
Height
Weight
If we select a hyperplane with a small margin, there is a high chance of misclassification.
D+
D-
Distance
Margin
But what happens if a
hyperplane is not
optimal?
46. What is Support Vector Machine?
Height
Weight
What we have discussed so far is also called a linear SVM (LSVM).
D+
D-
Distance
Margin
But what happens if a
hyperplane is not
optimal?
59. Advantages of Support Vector Machine
• High-dimensional input space
• Sparse document vectors
• Regularization parameter
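The regularization parameter is exposed in scikit-learn's SVC as C. A smaller C tolerates more points inside or across the margin, while a larger C penalizes such violations more heavily. The sketch below fits the same data with three values of C and records how many support vectors each model keeps. The overlapping synthetic clusters and the chosen C values are assumptions for illustration only.

```python
from sklearn import svm
from sklearn.datasets import make_blobs

# Overlapping synthetic clusters (cluster_std picked arbitrarily for illustration)
X, y = make_blobs(n_samples=200, centers=2, cluster_std=3.0, random_state=0)

support_counts = {}
for C in (0.01, 1.0, 100.0):
    clf = svm.SVC(kernel="linear", C=C)
    clf.fit(X, y)
    # n_support_ holds the number of support vectors per class
    support_counts[C] = int(clf.n_support_.sum())
    print(C, support_counts[C], clf.score(X, y))
```

Comparing the printed counts and training accuracies gives a feel for how strongly C trades margin width against training errors on a given data set.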
60. Use case – Problem Statement
Dad, is that a group of
Crocodiles or Alligators?
Well that’s hard to
differentiate
ZOO
Exit
61. Use case – Problem Statement
Dad, is that a group of crocodiles or alligators?
Well, that’s hard to differentiate.
Difference
              Crocodiles   Alligators
Size          Larger       Smaller
Snout width   Narrow       Wider
62. Use case – Problem Statement
Dad, is that a group of
Crocodiles or Alligators?
Let Support Vector
Machine segregate the two
groups
63. Use case - Implementation
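import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm
# make_blobs is imported from sklearn.datasets; the old
# sklearn.datasets.samples_generator module was removed in newer scikit-learn releases
from sklearn.datasets import make_blobs
# we create 40 separable points
X, y = make_blobs(n_samples=40, centers=2, random_state=20)
# fit the model; a large C means very little regularization (for illustration purposes)
clf = svm.SVC(kernel='linear', C=1000)
clf.fit(X, y)
# scatter plot of the points, coloured by class label
plt.scatter(X[:, 0], X[:, 1], c=y, s=30, cmap=plt.cm.Paired)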
64. Use case - Implementation
# output
[Scatter plot with axes labeled “Width of the snout” and “Size of the body”]
65. Use case - Implementation
# fit the model; a large C means very little regularization (for illustration purposes)
clf = svm.SVC(kernel='linear', C=1000)
clf.fit(X, y)
plt.scatter(X[:, 0], X[:, 1], c=y, s=30, cmap=plt.cm.Paired)
# do not call plt.show() yet, so the decision boundary is drawn on the same axes
# plot the decision function
ax = plt.gca()
xlim = ax.get_xlim()
ylim = ax.get_ylim()
# create grid to evaluate model
xx = np.linspace(xlim[0], xlim[1], 30)
yy = np.linspace(ylim[0], ylim[1], 30)
YY, XX = np.meshgrid(yy, xx)
xy = np.vstack([XX.ravel(), YY.ravel()]).T
# signed score for every grid point: 0 on the decision boundary, -1 and 1 on the margins
Z = clf.decision_function(xy).reshape(XX.shape)
66. Use case - Implementation
# plot decision boundary and margins
ax.contour(XX, YY, Z, colors='k', levels=[-1, 0, 1], alpha=0.5,
           linestyles=['--', '-', '--'])
# plot support vectors as hollow circles (edgecolors keeps the unfilled markers visible)
ax.scatter(clf.support_vectors_[:, 0], clf.support_vectors_[:, 1], s=100,
           linewidth=1, facecolors='none', edgecolors='k')
plt.show()
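Once the model is fitted, it can also label a new measurement. The sketch below rebuilds the same synthetic data and model so that it runs on its own. The label names (0 for crocodile, 1 for alligator) are an assumption made here for readability, since the code in the slides only works with the numeric labels, and the new animal is a hypothetical point built from two training samples.

```python
import numpy as np
from sklearn import svm
from sklearn.datasets import make_blobs

# Same synthetic data and model settings as in the slides
X, y = make_blobs(n_samples=40, centers=2, random_state=20)
clf = svm.SVC(kernel='linear', C=1000)
clf.fit(X, y)

# Assumed names for the numeric labels (not defined in the slide code)
names = np.array(["crocodile", "alligator"])

# Hypothetical new animal: the midpoint of the first two training samples
new_animal = X[:2].mean(axis=0, keepdims=True)

label = clf.predict(new_animal)[0]
print(names[label], "| support vectors:", len(clf.support_vectors_))
```

Only the support vectors determine the boundary, which is why the fitted model keeps them in `clf.support_vectors_`.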
67. Use case - Implementation
# output
[Plot with axes labeled “Width of the snout” and “Size of the body”]
68. Use case - Implementation
# output
[Plot with axes labeled “Width of the snout” and “Size of the body”]
Conclusion
Congratulations!
We have demonstrated a support vector machine by separating the two classes, where the blue data points represent crocodiles and the brown data points represent alligators.
This hands-on example will help you tackle Support Vector Machine projects in the future.