2. Course Prerequisites
• Basic understanding of probability,
linear algebra, and computational
algorithms
• Basic facility in programming in
Python
3. Course Description
This online course is aimed at developing
practical machine learning and data science skills.
The course will cover the theoretical basics of a
broad range of machine learning concepts and
methods with practical applications to sample
datasets via programming assignments.
4. Learning Outcomes
• Describe the principal models used in machine learning
and the types of problems to which they are typically
applied.
• Compare the assumptions made in each model and the
strengths and weaknesses of each model.
• Determine to which problems machine learning is
applicable and which model or models would be most
appropriate in each case.
• Apply the principal models in machine learning to
appropriate problems.
5. Course Structure
This module is conducted entirely online, which
means you do not have to be on campus to
complete any portion of it. You will participate in
the course using the university V-class platform.
Lectures will become available at midnight each
week and will be delivered using the V-class
system or Google Meet. In addition to lectures,
participation will play a key role in this course.
MBD1202 - Victoria University 5
6. Grading Breakdown
Course work:
֍ Continuous assessment tests: 40%
֍ Group and individual project (course work): 25%
֍ Final exam: 35%
Please note: I do not accept late assignments. Anything submitted past the due date receives zero unless you have a
documented medical excuse.
7. Course Overview
• Introduction to the course and to machine
learning
• Fundamentals of machine learning
• Supervised learning
• Unsupervised learning
8. Readings
The required texts for the course are:
• Introduction to Machine Learning, Third Edition, Ethem Alpaydin,
MIT Press, 2014.
• Python Machine Learning: Machine Learning and Deep Learning
with Python, scikit-learn, and TensorFlow 2, Third Edition, Sebastian
Raschka and Vahid Mirjalili, Packt Publishing, 2020.
• The Elements of Statistical Learning, Second Edition, T. Hastie,
R. Tibshirani, and J. H. Friedman, Springer, 2009.
10. What Is Machine Learning?
• … is the science (and art) of programming computers
so they can learn from data.
• … is the field of study that gives computers the
ability to learn without being explicitly programmed
(Arthur Samuel, 1959).
11. The field of machine learning is concerned with
the question of how to construct computer
programs that automatically improve with
experience.
12. Learning is….
… a computational process for improving
performance based on experience
13. Machine Learning
All of these concepts are interrelated, and have become buzzwords
over the past several years
15. Categories of Machine Learning
There are three broad categories of machine learning:
supervised learning, unsupervised learning, and
reinforcement learning. We will primarily focus
on supervised learning in this class, which is the most
developed branch of machine learning. While we will
also cover various unsupervised learning algorithms,
reinforcement learning is outside the scope of this
class.
16. Supervised Learning
Supervised learning is the subcategory of machine learning that focuses
on learning a classification or regression model, that is, learning from
labelled training data (i.e., inputs that also contain the desired outputs
or targets).
17. Unsupervised learning
In contrast to supervised learning, unsupervised learning is a branch of
machine learning that is concerned with unlabeled data. Common tasks
in unsupervised learning are clustering analysis (assigning group
memberships) and dimensionality reduction (compressing data onto a
lower-dimensional subspace or manifold).
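To make the clustering task concrete, here is a minimal k-means sketch in plain Python. The data points, the starting centroids, and the helper name `kmeans` are all made up for illustration; k-means is only one of many clustering methods and is not necessarily the one used later in the course.

```python
import math

def kmeans(points, centroids, n_iter=10):
    """Tiny k-means sketch: alternate an assignment step and an update step."""
    centroids = list(centroids)  # work on a copy of the starting centroids
    labels = []
    for _ in range(n_iter):
        # Assignment step: index of the nearest centroid for each point.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids

# Hypothetical unlabeled data: two visually separate groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels, centroids = kmeans(points, centroids=[(0.0, 0.0), (6.0, 6.0)])
print(labels)     # group membership assigned to each point
print(centroids)  # centre of each discovered group
```

No labels are supplied anywhere: the group memberships come only from the distances between points.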
18. Unsupervised learning
Illustration of clustering, where the dashed lines indicate potential group membership assignments of
unlabeled data points.
19. Reinforcement learning
Reinforcement learning is learning what to do, that is, how to map
situations to actions, with the goal of maximizing a numerical reward
(Sutton and Barto, 2018).
21. Introduction to Supervised Learning
Supervised learning refers to a machine learning sub-category used to
train computational models using labelled data to predict or classify
certain outcomes.
22. Supervised Examples: Advertising
In order to motivate our study of statistical learning, we begin with a simple
example. Suppose that you are a machine learning consultant hired by a
client to provide advice on how to improve sales of a particular product.
Advertising data: Sales and expenditure (000's) on TV, Radio, and Newspaper
advertising for 200 markets.
23. Supervised Examples: Advertising
Can we predict Sales (output variable) based on advertising
expenditure (input variables/predictors)?
24. Supervised Examples: Advertising
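import seaborn as sns  # df is assumed to hold the Advertising data; one call per panel
sns.regplot(x=df['TV'], y=df['sales'], ci=None)
sns.regplot(x=df['radio'], y=df['sales'], ci=None)
sns.regplot(x=df['newspaper'], y=df['sales'], ci=None)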
25. Variables: Measurement Scales
• Both predictor and output variables can be defined as either qualitative or
quantitative.
• Supervised learning problems where the output variables are quantitative
in nature are referred to as regression tasks.
• Problems where the output variables are qualitative in nature are
referred to as classification tasks.
• The particular methods one chooses to perform these tasks are often
informed by the measurement scale of the output, i.e., qualitative vs.
quantitative.
• The set of predictor variables is usually a combination of both quantitative
and qualitative measurements for both regression and classification
problems. Indeed, one often spends some time encoding/engineering the
feature set in order to improve model performance or interpretation.
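As an illustration of the encoding step mentioned in the last bullet, the sketch below one-hot encodes a qualitative predictor alongside a quantitative one. The `region` column and its values are hypothetical and are not part of the Advertising data; the `encode` helper is likewise only an illustrative name.

```python
# Hypothetical records mixing a quantitative and a qualitative predictor.
rows = [
    {"tv_budget": 230.1, "region": "north"},
    {"tv_budget": 44.5, "region": "south"},
    {"tv_budget": 17.2, "region": "east"},
]

# Fix an order for the categories so every row is encoded the same way.
categories = sorted({r["region"] for r in rows})

def encode(row):
    """Return [tv_budget, one indicator per region category]."""
    one_hot = [1.0 if row["region"] == c else 0.0 for c in categories]
    return [row["tv_budget"]] + one_hot

X = [encode(r) for r in rows]
for features in X:
    print(features)
```

Each qualitative value becomes a set of 0/1 indicator columns, so the whole feature set is numeric and can be passed to a regression or classification method.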
26. Variables: Notation
• The input variables are typically denoted using the symbol X, with a
subscript to distinguish them. So 𝑋1 might be the TV budget, 𝑋2 the
radio budget, and 𝑋3 the newspaper budget.
• The output variable—in this case, sales—is often called the response
or dependent variable, and is typically denoted using the symbol Y.
• {𝑦1, 𝑦2, …, 𝑦𝑁} thus represents a set of N observations of the response.
• For a given Y we also observe a vector of p features 𝑋 =
(𝑋1, 𝑋2, …, 𝑋𝑝).
27. Variables: Notation
• We may refer to an observed outcome of this feature vector using
𝑥 = (𝑥1, 𝑥2, …, 𝑥𝑝).
• So for each observation, we have a pair (y, x).
• For example, for our first observation of the advertising data, we have
y = 22.1 and x = (230.1, 37.8, 69.2).
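The notation maps directly onto how the data is usually held in code. In the sketch below, the first row is the observation quoted above; the second row is made up purely to show the shape of the data and does not come from the Advertising dataset.

```python
# First observation from the text: sales y and budgets x = (TV, radio, newspaper).
y1 = 22.1
x1 = (230.1, 37.8, 69.2)

# Second observation is hypothetical, added only to show N > 1.
X = [x1, (50.0, 20.0, 10.0)]  # one row per observation, one column per feature
y = [y1, 11.0]                # one response per observation

N = len(y)     # number of observations
p = len(X[0])  # number of features per observation

# Each observation is a pair (y, x).
for yi, xi in zip(y, X):
    print(yi, xi)
```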
28. “Torture the data, and it will confess to anything.” – Ronald Coase,
Nobel Prize laureate in Economics
29. Training a Model
Building a supervised machine learning model has three stages:
1. Training: The algorithm will be provided with historical input data
with the mapped output. The algorithm will learn the patterns
within the input data for each output and represent that as a
statistical equation, which is also commonly known as a model.
30. Training a Model
2. Validation: In this phase, the performance of the trained model is
evaluated, usually by applying it to a dataset (that was not used as
part of the training) to predict the class or event.
31. Training a Model
3. Prediction: Here we apply the trained model to a data set that was
not part of either the training or the validation data. The prediction
will be used to drive business decisions.
Note: Before training a model, the data set should be split into training,
validation, and testing sets.
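One possible way to make such a split is sketched below using only the standard library. The 60/20/20 proportions, the fixed seed, and the helper name `split_indices` are assumptions chosen for illustration; the course does not prescribe them.

```python
import random

def split_indices(n, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle row indices 0..n-1 and cut them into train/validation/test."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)  # shuffle so the split is not order-dependent
    n_train = round(n * train_frac)
    n_val = round(n * val_frac)
    train = idx[:n_train]
    val = idx[n_train:n_train + n_val]
    test = idx[n_train + n_val:]      # whatever remains
    return train, val, test

# 200 rows, matching the number of markets in the Advertising example.
train_idx, val_idx, test_idx = split_indices(200)
print(len(train_idx), len(val_idx), len(test_idx))
```

The three index lists are disjoint, so no observation used for training is reused when the model is validated or tested.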
33. Regression
Regression analysis includes several
variations, such as simple linear,
multiple linear, and nonlinear. The most
common models are simple linear and
multiple linear. Nonlinear regression
analysis is commonly used for more
complicated data sets in which the
dependent and independent variables
show a nonlinear relationship.
34. Regression Analysis – Simple Linear Regression
One way to describe the relationship between response and the predictor is via the
equation:
𝒀 = 𝜷𝟎 + 𝜷𝟏𝑿 + 𝝐
Where:
Y – Dependent variable
X – Independent variable
𝛽0– Intercept
𝛽1– Slope
ϵ – Residual (error)
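To see what the equation describes, the sketch below generates responses from it. The coefficient values, the normally distributed error term, and the helper name `simulate` are assumptions made for illustration only.

```python
import random

def simulate(x_values, beta0, beta1, sigma, seed=0):
    """Draw Y = beta0 + beta1 * X + eps, with eps assumed Normal(0, sigma)."""
    rng = random.Random(seed)
    return [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in x_values]

x = [0.0, 1.0, 2.0, 3.0]

# With sigma = 0 the points fall exactly on the line with intercept 1 and slope 2.
y_noise_free = simulate(x, beta0=1.0, beta1=2.0, sigma=0.0)

# With sigma > 0 the points scatter around that line.
y_noisy = simulate(x, beta0=1.0, beta1=2.0, sigma=0.5)

print(y_noise_free)
print(y_noisy)
```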
35. Fitting a Slope
Let’s fit a line through the
points such that the error or
residual, that is, the vertical
distance from each point to the
line, is as small as possible
overall.
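A common way to make "as small as possible" precise is ordinary least squares, which minimizes the sum of squared residuals. The sketch below uses the standard closed-form solution for a single predictor; the data values and the helper name `fit_line` are hypothetical, and the slides do not state which fitting method they have in mind.

```python
def fit_line(x, y):
    """Ordinary least squares for one predictor: returns (intercept, slope)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Slope = covariance of x and y divided by the variance of x.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    beta1 = sxy / sxx
    beta0 = y_bar - beta1 * x_bar  # the fitted line passes through (x_bar, y_bar)
    return beta0, beta1

# Hypothetical data for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

beta0, beta1 = fit_line(x, y)
residuals = [yi - (beta0 + beta1 * xi) for xi, yi in zip(x, y)]
print(beta0, beta1)
print(residuals)
```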
36. How Good Is Your Model?
There are three metrics widely used for evaluating linear model
performance.
• R-squared
• Root Mean Squared Error (RMSE)
• Mean Absolute Error (MAE)
37. Evaluation - R-Squared
The R-squared metric is the most popular way of evaluating how
well your model fits the data. The R-squared value is the proportion
of the total variance in the dependent variable that is explained by the
independent variable. It is a value between 0 and 1; a value closer to 1
indicates a better model fit.
39. R-Squared - Calculated
$$R^2 = \frac{SSR}{SST} = \frac{\sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$$
Here SSR is the sum of squares explained by the regression (predicted
values around the mean) and SST is the total sum of squares (actual
values around the mean).
$$R^2 = \frac{1510.01}{1547.55} \approx 0.976$$
In this case, about 97.6% of the variability in the dependent variable
(test score) is explained by the independent variable (hours studied).
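The calculation can be sketched in a few lines of plain Python. The actual and predicted values below are hypothetical (the predictions are a least-squares line fitted to the actual values), and `r_squared` is an illustrative helper name rather than a library function.

```python
def r_squared(y, y_hat):
    """R-squared as explained sum of squares over total sum of squares."""
    y_bar = sum(y) / len(y)
    ssr = sum((yh - y_bar) ** 2 for yh in y_hat)  # explained by the regression
    sst = sum((yi - y_bar) ** 2 for yi in y)      # total variation in y
    # Note: SSR / SST equals 1 - SSE / SST only for a least-squares fit
    # that includes an intercept.
    return ssr / sst

# Hypothetical actual values and least-squares predictions.
y = [2, 4, 5, 4, 5]
y_hat = [2.8, 3.4, 4.0, 4.6, 5.2]
print(r_squared(y, y_hat))

# The ratio quoted in the worked example above.
slide_ratio = 1510.01 / 1547.55
print(round(slide_ratio, 3))
```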
40. Evaluation - RMSE
This is the square root of the mean of the squared errors. RMSE
indicates how close the predicted values are to the actual values; hence
a lower RMSE value signifies that the model performance is good. One
of the key properties of RMSE is that the unit will be the same as the
target variable.
$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}$$
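A minimal sketch of the RMSE calculation follows, written in plain Python so each step of the formula is visible. The actual and predicted values are hypothetical, and `rmse` is an illustrative helper name.

```python
import math

def rmse(y, y_hat):
    """Square root of the mean of the squared errors."""
    squared_errors = [(yi - yh) ** 2 for yi, yh in zip(y, y_hat)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical actual and predicted values.
y = [2, 4, 5, 4, 5]
y_hat = [2.8, 3.4, 4.0, 4.6, 5.2]
print(rmse(y, y_hat))  # expressed in the same unit as y
```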
41. Evaluation - MAE
This is the mean of the absolute values of the errors, that is, of the
differences between the predicted and actual values.
$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|$$
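The MAE calculation can be sketched the same way. The values below are hypothetical and `mae` is an illustrative helper name.

```python
def mae(y, y_hat):
    """Mean of the absolute differences between actual and predicted values."""
    absolute_errors = [abs(yi - yh) for yi, yh in zip(y, y_hat)]
    return sum(absolute_errors) / len(absolute_errors)

# Hypothetical actual and predicted values.
y = [2, 4, 5, 4, 5]
y_hat = [2.8, 3.4, 4.0, 4.6, 5.2]
print(mae(y, y_hat))
```

Because the errors are not squared, a single large error influences MAE less than it influences RMSE.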