INTRO TO MACHINE LEARNING
150
MIN
5.0
DMYTRO FISHMAN
UNIVERSITY OF TARTU
INSTITUTE OF COMPUTER SCIENCE
New York City Taxi Fare Prediction
https://www.kaggle.com/c/new-york-city-taxi-fare-prediction
[Figure: scatter plot of y against x]
type in your browser:
tinyurl.com/yxb5k5jl
(save a copy to your drive)
The following slides are inspired by
“An Introduction to Linear Regression Analysis” video
https://youtu.be/zPG4NjIkCjc
Linear Regression
[Figure: dependent variable (y) plotted against independent variable (X)]
How does a change in the independent variable influence the dependent variable?
Positive relationship: y increases as X increases.
Negative relationship: y decreases as X increases.
Linear Regression
To build a linear regression we need observations.
We want to find a line such that it minimises the sum of squared errors, where each error is the difference between the actual and the estimated value.
Regression line (least squares method):
$\arg\min \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$
Linear Regression
[Figure: fare amount (y) plotted against distance (X)]
$\hat{y} = w_0 + w_1 x$
$\arg\min_{w_0, w_1} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$
minimises the sum of squared errors with respect to $w_0$ and $w_1$
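The objective above can be sketched in a few lines of Python. This is only an illustration: the helper names `predict` and `sum_squared_errors` are made up for this sketch, and the five points are the toy observations from the worked example in this deck. It compares a flat line with a sloped one to show that different choices of w0 and w1 give different sums of squared errors.

```python
def predict(x, w0, w1):
    """Estimated value for one observation: y_hat = w0 + w1 * x."""
    return w0 + w1 * x


def sum_squared_errors(xs, ys, w0, w1):
    """Sum over all observations of (actual - estimated) squared."""
    return sum((y - predict(x, w0, w1)) ** 2 for x, y in zip(xs, ys))


# Toy observations from the worked example (distance, fare amount).
distances = [1, 2, 3, 4, 5]
fares = [2, 4, 5, 4, 5]

# A flat line at the mean fare versus a sloped line.
sse_flat = sum_squared_errors(distances, fares, w0=4.0, w1=0.0)
sse_sloped = sum_squared_errors(distances, fares, w0=2.2, w1=0.6)

print(f"flat line:   {sse_flat:.2f}")
print(f"sloped line: {sse_sloped:.2f}")
```

The line with the smaller value is the better fit under the least squares criterion.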
Linear Regression (example)
[Figure: fare amount (y) plotted against distance (X), five observations]

x    y    x - x̄    y - ȳ    (x - x̄)^2    (x - x̄)(y - ȳ)
1    2    -2       -2       4            4
2    4    -1        0       1            0
3    5     0        1       0            0
4    4     1        0       1            0
5    5     2        1       4            2
x̄ = 3    ȳ = 4     sums:    10           6

$w_1 = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2} = \frac{6}{10} = 0.6$
$\hat{y} = w_0 + w_1 x$, and the line passes through $(\bar{x}, \bar{y})$: $4 = w_0 + 0.6 \cdot 3$
$w_0 = 2.2$
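The arithmetic in the example can be checked with a short Python sketch. The function name `fit_line` is an assumption made for this illustration; the formulas are the ones on the slide (slope from the two sums, intercept from the means).

```python
def fit_line(xs, ys):
    """Least squares fit of y_hat = w0 + w1 * x for a single feature."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sum of (x - x_mean) * (y - y_mean) and sum of (x - x_mean)^2.
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    w1 = sxy / sxx
    # The fitted line passes through the point (x_mean, y_mean).
    w0 = y_mean - w1 * x_mean
    return w0, w1


w0, w1 = fit_line([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(f"w0 = {w0:.1f}, w1 = {w1:.1f}")
```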
Let’s return to our Colabs
Decision Tree Algorithm
By asking a simple question about the value of the independent variable, it tries to predict the value of the dependent variable.
[Figure: fare amount (y) plotted against distance (X), five observations]
Is distance > X?
  False: fare amount = Y
  True: fare amount = Z
The question "Is distance > X?" is the root node. Its left child (False) and right child (True) are the leaves.
Decision Tree Algorithm
Here, X may correspond to any vertical line. For example, if X = 2.5:
Is distance > 2.5?
  False: fare amount = Y
  True: fare amount = Z
What are the most reasonable values for Y and Z (those that minimise the total MSE)?
What would the MSE be if Y = 4 and Z = 5?
Decision Tree Algorithm
Is distance > 2.5?
  False: fare amount = 4
  True: fare amount = 5
Y = 4, Z = 5
$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$, where $y_i$ is the real value and $\hat{y}_i$ is the predicted value
$= \frac{(y_1 - \hat{y}_1)^2 + (y_2 - \hat{y}_2)^2 + (y_3 - \hat{y}_3)^2 + (y_4 - \hat{y}_4)^2 + (y_5 - \hat{y}_5)^2}{5}$
$= \frac{(2 - 4)^2 + (4 - 4)^2 + (5 - 5)^2 + (4 - 5)^2 + (5 - 5)^2}{5}$
$= \frac{4 + 0 + 0 + 1 + 0}{5} = \frac{5}{5} = 1$
So, if X = 2.5, Y = 4 and Z = 5, the MSE is 1.
Can we find better Y and Z?
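The same calculation as a minimal Python sketch. The names `stump_predict` and `mse` are made up for this illustration, and the five points are the toy observations from the deck: distances 1 to 5 with fares 2, 4, 5, 4, 5.

```python
def stump_predict(distance, threshold, left_value, right_value):
    """One-question tree: answer the right value if distance > threshold."""
    return right_value if distance > threshold else left_value


def mse(actual, predicted):
    """Mean of the squared differences between real and predicted values."""
    return sum((y - p) ** 2 for y, p in zip(actual, predicted)) / len(actual)


distances = [1, 2, 3, 4, 5]
fares = [2, 4, 5, 4, 5]

# X = 2.5, Y = 4, Z = 5, as on the slide.
predictions = [stump_predict(d, 2.5, 4, 5) for d in distances]
stump_mse = mse(fares, predictions)

print(predictions)
print(stump_mse)
```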
Decision Tree Algorithm
Is distance > 2.5?
  False: fare amount = 3
  True: fare amount = 5
Y = 3, Z = 5
$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$
$= \frac{(2 - 3)^2 + (4 - 3)^2 + (5 - 5)^2 + (4 - 5)^2 + (5 - 5)^2}{5}$
$= \frac{1 + 1 + 0 + 1 + 0}{5} = \frac{3}{5} = 0.6$
So, if X = 2.5, Y = 3 and Z = 5, the MSE is 0.6.
Decision Tree Algorithm
Is distance > 2.5?
  False: fare amount = 3
  True: fare amount = 4.67
Y = 3 (the mean of 2 and 4), Z = 14/3 ≈ 4.67 (the mean of 5, 4 and 5)
$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$
$= \frac{(2 - 3)^2 + (4 - 3)^2 + (5 - 4.67)^2 + (4 - 4.67)^2 + (5 - 4.67)^2}{5}$
$\approx \frac{1 + 1 + 0.11 + 0.44 + 0.11}{5} \approx \frac{2.67}{5} \approx 0.53$
So, if Y = 3 and Z ≈ 4.67, the MSE for this split is smallest.
Are we happy?
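Why these particular values? For a fixed split, the leaf value that minimises the squared error on one side is the mean of the fares on that side. A small Python sketch under that assumption (the helper names are invented for this illustration) compares the mean with the earlier guess Z = 5 and with Z = 4.5:

```python
def mean(values):
    return sum(values) / len(values)


def squared_error(values, prediction):
    """Sum of squared differences when one value is predicted for all."""
    return sum((v - prediction) ** 2 for v in values)


# Fares on each side of the split "distance > 2.5".
left_fares = [2, 4]
right_fares = [5, 4, 5]
n = len(left_fares) + len(right_fares)

y_best = mean(left_fares)
z_best = mean(right_fares)
best_mse = (squared_error(left_fares, y_best)
            + squared_error(right_fares, z_best)) / n

# Alternatives for Z, keeping Y = 3.
mse_z5 = (squared_error(left_fares, 3) + squared_error(right_fares, 5)) / n
mse_z45 = (squared_error(left_fares, 3) + squared_error(right_fares, 4.5)) / n

print(f"Y = {y_best:.2f}, Z = {z_best:.2f}, MSE = {best_mse:.2f}")
print(f"Z = 5: {mse_z5:.2f}, Z = 4.5: {mse_z45:.2f}")
```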
Decision Tree Algorithm
Is distance > 2.5?
  False: fare amount = 3
  True: fare amount = 4.67
Hold on, how did we choose this split in the first place?
Maybe there are better options?
Decision Tree Algorithm
Is distance > X?
  False: fare amount = Y
  True: fare amount = Z
What are the possible split options in this case?
0.5, 1.5, 2.5, 3.5, 4.5, 5.5
Are 0.5 and 5.5 meaningful? They leave all five points on one side, so they do not split anything.
How do we compare the remaining ones (1.5, 2.5, 3.5, 4.5)?
For each one we can compute the MSE.
Decision Tree Algorithm
For the split at 2.5 we already know the MSE: 0.53.
Split at 1.5: Y = 2, Z = 4.5
MSE = (0 + 0.25 + 0.25 + 0.25 + 0.25)/5 = 0.2

Split    1.5    2.5    3.5    4.5
MSE      0.2    0.53   ?      ?
Decision Tree Algorithm
Split at 3.5: Y = 3.67, Z = 4.5
MSE = 1.03

Split    1.5    2.5    3.5    4.5
MSE      0.2    0.53   1.03   ?
Decision Tree Algorithm
Split at 4.5: Y = 3.75, Z = 5
MSE = 0.95

Split    1.5    2.5    3.5    4.5
MSE      0.2    0.53   1.03   0.95
Decision Tree Algorithm
We choose the split that minimises the total MSE.

Split    1.5    2.5    3.5    4.5
MSE      0.2    0.53   1.03   0.95

Thus, the resulting tree (MSE = 0.2):
Is distance > 1.5?
  False: fare amount = 2
  True: fare amount = 4.5
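The comparison of candidate splits can be sketched in Python as follows. This is an illustration rather than a library implementation: the function names are made up here, candidate thresholds are taken as midpoints between neighbouring distances, and each side predicts the mean of its fares.

```python
def mean(values):
    return sum(values) / len(values)


def split_mse(xs, ys, threshold):
    """Total MSE when each side of the split predicts its own mean."""
    left = [y for x, y in zip(xs, ys) if x <= threshold]
    right = [y for x, y in zip(xs, ys) if x > threshold]
    total = 0.0
    for side in (left, right):
        side_mean = mean(side)
        total += sum((y - side_mean) ** 2 for y in side)
    return total / len(ys)


def candidate_thresholds(xs):
    """Midpoints between neighbouring distinct values (both sides non-empty)."""
    distinct = sorted(set(xs))
    return [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]


def best_split(xs, ys):
    scores = {t: split_mse(xs, ys, t) for t in candidate_thresholds(xs)}
    return min(scores, key=scores.get), scores


distances = [1, 2, 3, 4, 5]
fares = [2, 4, 5, 4, 5]
threshold, scores = best_split(distances, fares)

for t, score in scores.items():
    print(f"split at {t}: MSE = {score:.2f}")
print(f"best split: {threshold}")
```

Thresholds 0.5 and 5.5 never appear among the candidates because they are not midpoints between two observed distances.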
Decision Tree Algorithm
Can we make our decision tree more accurate?
Yes, by going deeper!
distance > 1.5?
  False: fare amount = 2
  True: distance > X?
    False: fare amount = Y
    True: fare amount = Z
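Going deeper means repeating the same split search inside each child. The sketch below is one possible way to write that in plain Python; the dictionary layout and the names `build_tree`, `predict` and `tree_mse` are assumptions made for this illustration, not the code used in the Colab notebooks.

```python
def mean(values):
    return sum(values) / len(values)


def sse(values):
    """Sum of squared differences from the mean of the values."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def build_tree(xs, ys, max_depth):
    """Recursively pick the split with the smallest total squared error."""
    distinct = sorted(set(xs))
    if max_depth == 0 or len(distinct) < 2:
        return {"value": mean(ys)}  # leaf: predict the mean fare
    best_threshold, best_error = None, None
    for a, b in zip(distinct, distinct[1:]):
        threshold = (a + b) / 2
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        error = sse(left) + sse(right)
        if best_error is None or error < best_error:
            best_threshold, best_error = threshold, error
    left_pairs = [(x, y) for x, y in zip(xs, ys) if x <= best_threshold]
    right_pairs = [(x, y) for x, y in zip(xs, ys) if x > best_threshold]
    return {
        "threshold": best_threshold,
        "left": build_tree([x for x, _ in left_pairs],
                           [y for _, y in left_pairs], max_depth - 1),
        "right": build_tree([x for x, _ in right_pairs],
                            [y for _, y in right_pairs], max_depth - 1),
    }


def predict(tree, x):
    """Follow the questions from the root node down to a leaf."""
    while "threshold" in tree:
        tree = tree["right"] if x > tree["threshold"] else tree["left"]
    return tree["value"]


def tree_mse(tree, xs, ys):
    return sum((y - predict(tree, x)) ** 2 for x, y in zip(xs, ys)) / len(ys)


distances = [1, 2, 3, 4, 5]
fares = [2, 4, 5, 4, 5]
shallow = build_tree(distances, fares, max_depth=1)
deeper = build_tree(distances, fares, max_depth=2)
mse_shallow = tree_mse(shallow, distances, fares)
mse_deeper = tree_mse(deeper, distances, fares)
print(f"depth 1: {mse_shallow:.2f}, depth 2: {mse_deeper:.2f}")
```

On the five toy points the second level lowers the training MSE, which is exactly the behaviour that can later turn into overfitting.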
Let’s return to our Colabs
Overfitting
[Figure: two fits of fare amount (y) against distance (X)]
Simple, but imperfect VS Complicated, but ideal
Train/val split
Initial dataset
Train dataset: randomly select 60%
MSE = 1.0    MSE = 0.0
Simple, but imperfect    Complicated, but ideal
Validation (val) dataset: randomly select 40%
MSE = 2.5    MSE = 0.5
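A random 60/40 split can be sketched with the Python standard library alone. The function name `train_val_split`, the fixed seed and the ten placeholder rows are assumptions for this illustration; they are not the data or code from the Colab notebooks.

```python
import random


def train_val_split(rows, train_fraction=0.6, seed=0):
    """Shuffle a copy of the rows and cut it into train and validation parts."""
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Placeholder rows standing in for the initial dataset.
rows = list(range(10))
train_rows, val_rows = train_val_split(rows)

print(len(train_rows), len(val_rows))
```

A model is then fitted on the train rows only, and its MSE is measured separately on the train and validation rows.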
POINTS
1. MACHINE LEARNING MODEL IS NOT MAGIC
2. YOU CAN SAVE AND LOAD ML MODELS
3. EVALUATING MODEL PERFORMANCE IS IMPORTANT
4. YOU MAY NEED TO RETRAIN YOUR MODELS
THANK YOU

 
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...christianmathematics
 

Último (20)

Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdf
 
A Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformA Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy Reform
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impact
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
 
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdf
 
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptxINDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
Student login on Anyboli platform.helpin
Student login on Anyboli platform.helpinStudent login on Anyboli platform.helpin
Student login on Anyboli platform.helpin
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13
 
social pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajansocial pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajan
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104
 
Class 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdfClass 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdf
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
 
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
 
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
 

Introduction to Machine Learning for Taxify/Bolt

  • 1. INTRO TO MACHINE LEARNING 150 MIN 5.0 DMYTRO FISHMAN UNIVERSITY OF TARTU INSTITUTE OF COMPUTER SCIENCE
  • 2. New York City Taxi Fare Prediction https://www.kaggle.com/c/new-york-city-taxi-fare-prediction
  • 3. [Scatter plot of y against x] Type in your browser: tinyurl.com/yxb5k5jl (save a copy to your drive)
  • 4. The following slides are inspired by “An Introduction to Linear Regression Analysis” video https://youtu.be/zPG4NjIkCjc
  • 6. Linear Regression: how does a change in the independent variable (X axis) influence the dependent variable (y axis)?
  • 9. In order to build a linear regression we need observations.
  • 12. We want to find a line such that …
  • 13. … it minimises the sum of errors.
  • 14. For each observation, the error is the difference between the actual value and the value estimated by the line.
  • 15. Least squares method: the regression line is the one that minimises the sum of squared errors, $\arg\min \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$.
  • 19. Linear Regression of fare amount (y) on distance (X): the line is $\hat{y} = w_0 + w_1 x$, and linear regression minimises the sum of squared errors with respect to $w_0$ and $w_1$: $\arg\min_{w_0, w_1} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$.
  • 20. Linear Regression (example), fare amount (y) against distance (x):

        x    y    x - x̄    y - ȳ    (x - x̄)²    (x - x̄)(y - ȳ)
        1    2     -2       -2         4             4
        2    4     -1        0         1             0
        3    5      0        1         0             0
        4    4      1        0         1             0
        5    5      2        1         4             2
        x̄ = 3, ȳ = 4         sums:    10             6

    $w_1 = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2} = \frac{6}{10} = 0.6$
    The least squares line passes through $(\bar{x}, \bar{y})$, so $4 = w_0 + 0.6 \cdot 3$, which gives $w_0 = 2.2$ and $\hat{y} = 2.2 + 0.6x$.
  • 21. Let’s return to our Colabs
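The worked example above can be reproduced in a few lines of plain Python. This is a minimal sketch, not the code from the Colab notebook; the function name `fit_line` is an assumption made for illustration.

```python
# Five (distance, fare amount) observations from the worked example.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]


def fit_line(xs, ys):
    """Return (w0, w1) that minimise the sum of squared errors for y = w0 + w1 * x."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Numerator and denominator of the slope formula from the slide.
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    w1 = sxy / sxx
    # The fitted line passes through (x_mean, y_mean).
    w0 = y_mean - w1 * x_mean
    return w0, w1


w0, w1 = fit_line(xs, ys)
print(round(w0, 2), round(w1, 2))  # prints: 2.2 0.6
```

The two sums inside `fit_line` are the column totals 6 and 10 from the table.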
  • 22. Decision Tree Algorithm: by asking a simple question about the value of the independent variable, it tries to predict the value of the dependent variable.
  • 23. [Plot: fare amount (y) against distance (X) for the five example points]
  • 24. The question: Is distance > X?
  • 25. If False, predict fare amount = Y.
  • 26. If True, predict fare amount = Z.
  • 27. The question is the root node.
  • 28. The False branch leads to the left child and the True branch to the right child.
  • 29. Children that hold a prediction are leaves.
  • 31. Here, X may correspond to any vertical line. For example, if X = 2.5, the question becomes: Is distance > 2.5?
  • 32. What are the most reasonable values for Y and Z?
  • 33. That is, which values of Y and Z minimise the total MSE?
  • 34. What would the MSE be if Y = 4 and Z = 5?
  • 35. $\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$
  • 36. Here $y_i$ is the real value and $\hat{y}_i$ is the predicted value.
  • 37. For the five points: $\mathrm{MSE} = \frac{(y_1 - \hat{y}_1)^2 + (y_2 - \hat{y}_2)^2 + (y_3 - \hat{y}_3)^2 + (y_4 - \hat{y}_4)^2 + (y_5 - \hat{y}_5)^2}{5}$
  • 39. With Y = 4 and Z = 5: $\mathrm{MSE} = \frac{(2)^2 + (0)^2 + (0)^2 + (1)^2 + (0)^2}{5}$
  • 40. $= \frac{4 + 0 + 0 + 1 + 0}{5}$
  • 41. $= \frac{5}{5}$
  • 42. $= 1$
  • 43. So, if X = 2.5, Y = 4 and Z = 5, the MSE is 1.
  • 44. Can we find better Y and Z?
  • 45. Try Y = 3 and Z = 5 for the same question (Is distance > 2.5?).
  • 46. $\mathrm{MSE} = \frac{(2 - 3)^2 + (4 - 3)^2 + (5 - 5)^2 + (4 - 5)^2 + (5 - 5)^2}{5}$
  • 47. $= \frac{1 + 1 + 0 + 1 + 0}{5}$
  • 48. $= \frac{3}{5} = 0.6$
  • 49. So, if X = 2.5, Y = 3 and Z = 5, the MSE is 0.6.
  • 50. Now set each leaf to the mean of the points that fall into it: Y = (2 + 4) / 2 = 3 and Z = (5 + 4 + 5) / 3 ≈ 4.67. Then $\mathrm{MSE} = \frac{(2 - 3)^2 + (4 - 3)^2 + (5 - 4.67)^2 + (4 - 4.67)^2 + (5 - 4.67)^2}{5}$
  • 51. $= \frac{1 + 1 + 0.11 + 0.44 + 0.11}{5}$
  • 52. $= \frac{2.67}{5} = 0.53$. So, if Y = 3 and Z ≈ 4.67, the MSE is smallest for this split. Are we happy?
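The three candidate pairs of leaf values for the split at 2.5 can be checked numerically. The sketch below is illustrative only; `split_mse` and `mean` are hypothetical helper names, not functions from the slides or the Colab.

```python
# Five (distance, fare amount) observations from the slides.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]


def split_mse(xs, ys, threshold, false_value, true_value):
    """MSE of a one-question tree: predict true_value if x > threshold, else false_value."""
    total = 0.0
    for x, y in zip(xs, ys):
        prediction = true_value if x > threshold else false_value
        total += (y - prediction) ** 2
    return total / len(xs)


def mean(values):
    return sum(values) / len(values)


mse_4_5 = split_mse(xs, ys, 2.5, 4, 5)  # Y = 4, Z = 5
mse_3_5 = split_mse(xs, ys, 2.5, 3, 5)  # Y = 3, Z = 5

# Leaf values set to the mean of the points on each side of the split.
y_leaf = mean([y for x, y in zip(xs, ys) if x <= 2.5])
z_leaf = mean([y for x, y in zip(xs, ys) if x > 2.5])
mse_means = split_mse(xs, ys, 2.5, y_leaf, z_leaf)

print(mse_4_5, mse_3_5, round(mse_means, 2))
```

For a fixed split, the mean of each side is the constant that minimises the squared error on that side, which is why the last pair gives the lowest MSE of the three.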
  • 53. Hold on, how did we choose this split (Is distance > 2.5?) in the first place?
  • 54. Maybe there are better options?
  • 55. What are the possible split options in this case?
  • 56. Candidate values of X: 0.5, 1.5, 2.5, 3.5, 4.5, 5.5.
  • 57. Are 0.5 and 5.5 meaningful? (They leave all five points on one side of the split.)
  • 58. How do we compare the remaining ones: 1.5, 2.5, 3.5 and 4.5?
  • 59. For each one we can compute the MSE.
  • 60. MSE so far: 1.5 → ?, 2.5 → 0.53, 3.5 → ?, 4.5 → ?
  • 61. Split at 1.5: Y = 2, Z = 4.5.
  • 62. MSE = (0 + 0.25 + 0.25 + 0.25 + 0.25) / 5 = 0.2
  • 63. MSE so far: 1.5 → 0.2, 2.5 → 0.53, 3.5 → ?, 4.5 → ?
  • 64. Split at 3.5: Y ≈ 3.67, Z = 4.5.
  • 65. MSE = 1.03
  • 66. MSE so far: 1.5 → 0.2, 2.5 → 0.53, 3.5 → 1.03, 4.5 → ?
  • 67. Split at 4.5: Y = 3.75, Z = 5.
  • 68. MSE = 0.95
  • 69. MSE for every split: 1.5 → 0.2, 2.5 → 0.53, 3.5 → 1.03, 4.5 → 0.95
  • 70. We choose the split that minimises the total MSE: 1.5 → 0.2, 2.5 → 0.53, 3.5 → 1.03, 4.5 → 0.95.
  • 71. Thus, the resulting tree: Is distance > 1.5? If False, fare amount = 2; if True, fare amount = 4.5 (MSE = 0.2).
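The search over candidate splits can be sketched as a short loop. This is an illustrative version under the assumptions used on the slides (thresholds halfway between neighbouring distances, leaf values equal to the mean of each side); `best_split` is a hypothetical name.

```python
# Five (distance, fare amount) observations from the slides.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]


def mean(values):
    return sum(values) / len(values)


def best_split(xs, ys):
    """Try every midpoint between neighbouring x values; return (mse, threshold, Y, Z)."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between two equal x values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        false_side = [y for _, y in pairs[:i]]  # points with x <= threshold
        true_side = [y for _, y in pairs[i:]]   # points with x > threshold
        y_leaf, z_leaf = mean(false_side), mean(true_side)
        sse = sum((y - y_leaf) ** 2 for y in false_side)
        sse += sum((y - z_leaf) ** 2 for y in true_side)
        mse = sse / len(pairs)
        if best is None or mse < best[0]:
            best = (mse, threshold, y_leaf, z_leaf)
    return best


mse, threshold, y_leaf, z_leaf = best_split(xs, ys)
print(mse, threshold, y_leaf, z_leaf)  # prints: 0.2 1.5 2.0 4.5
```

Splits such as 0.5 and 5.5 never appear because only midpoints between observed distances are tried, so both sides always contain at least one point.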
  • 72. Can we make our decision tree more accurate?
  • 73. Yes, by going deeper! The False branch of "distance > 1.5" stays a leaf (fare amount = 2), while the True branch becomes a new question, "distance > X", with its own leaves: fare amount = Y (False) and fare amount = Z (True).
  • 74. Let’s return to our Colabs
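"Going deeper" means repeating the same split search inside each child. The sketch below is one possible way to write that recursion in plain Python; it is not the implementation used in the Colab, and `build_tree`, `predict` and the dictionary layout are assumptions made for illustration.

```python
# Five (distance, fare amount) observations from the slides.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]


def mean(values):
    return sum(values) / len(values)


def sse(pairs):
    """Sum of squared errors when predicting the mean y of these (x, y) pairs."""
    m = mean([y for _, y in pairs])
    return sum((y - m) ** 2 for _, y in pairs)


def build_tree(pairs, max_depth):
    """A leaf is a float (mean fare); an inner node is a dict holding a question."""
    if max_depth == 0 or len({x for x, _ in pairs}) < 2:
        return mean([y for _, y in pairs])
    pairs = sorted(pairs)
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue
        error = sse(pairs[:i]) + sse(pairs[i:])
        if best is None or error < best[0]:
            best = (error, (pairs[i - 1][0] + pairs[i][0]) / 2, i)
    _, threshold, i = best
    return {
        "threshold": threshold,                          # question: is x > threshold?
        "false": build_tree(pairs[:i], max_depth - 1),   # left child
        "true": build_tree(pairs[i:], max_depth - 1),    # right child
    }


def predict(tree, x):
    while isinstance(tree, dict):
        tree = tree["true"] if x > tree["threshold"] else tree["false"]
    return tree


shallow = build_tree(list(zip(xs, ys)), max_depth=1)
deep = build_tree(list(zip(xs, ys)), max_depth=4)
print([predict(shallow, x) for x in xs])
print([predict(deep, x) for x in xs])
```

With `max_depth=1` this reproduces the single question "distance > 1.5"; with enough depth every training point ends up in its own leaf, which is the situation the overfitting slide warns about.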
  • 75. Overfitting: two fits of fare amount (y) against distance (X), one "simple, but imperfect" versus one "complicated, but ideal".
  • 76. Train/val split: from the initial dataset, randomly select 60% as the train dataset and the remaining 40% as the validation (val) dataset, then compare the MSE of the "simple, but imperfect" and "complicated, but ideal" models on each. MSE values shown on the slide: 1.0, 0.0, 2.5 and 0.5.
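A random 60/40 split can be sketched with the standard library alone. Everything below is illustrative: the ten (distance, fare) rows are made-up numbers, `train_val_split` and `line_mse` are hypothetical helpers, and the line 2.2 + 0.6x is simply reused from the earlier worked example rather than fitted on this data.

```python
import random

# Hypothetical (distance, fare amount) rows, invented for this sketch.
rows = [(1, 2.9), (2, 3.1), (3, 4.2), (4, 4.4), (5, 5.4),
        (6, 5.6), (7, 6.5), (8, 7.1), (9, 7.4), (10, 8.3)]


def train_val_split(rows, train_fraction=0.6, seed=0):
    """Shuffle a copy of the rows and cut it into train and validation parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def line_mse(rows, w0, w1):
    """MSE of the line y = w0 + w1 * x on the given rows."""
    return sum((y - (w0 + w1 * x)) ** 2 for x, y in rows) / len(rows)


train, val = train_val_split(rows)
print(len(train), len(val))  # prints: 6 4
print(line_mse(train, 2.2, 0.6), line_mse(val, 2.2, 0.6))
```

The validation rows play no part in fitting, so the MSE measured on them indicates how a model behaves on data it has not seen.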
  • 78. Points: 1. A machine learning model is not magic. 2. You can save and load ML models. 3. Evaluating model performance is important. 4. You may need to retrain your models.
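Point 2 can be illustrated with Python's built-in `pickle` module. This is a minimal sketch under stated assumptions: the "model" is just a dictionary holding the two weights from the linear regression example, and the file name `fare_model.pkl` is hypothetical. The slides do not say which saving mechanism the Colab uses.

```python
import os
import pickle
import tempfile

# A stand-in model: the weights of the line from the worked example.
model = {"w0": 2.2, "w1": 0.6}

# Save the model to a file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "fare_model.pkl")
with open(path, "wb") as f:
    pickle.dump(model, f)

# Load it back, as a later session or another program would.
with open(path, "rb") as f:
    restored = pickle.load(f)

# Predict the fare for a distance of 3 with the restored model.
prediction = restored["w0"] + restored["w1"] * 3
print(restored == model, round(prediction, 2))  # prints: True 4.0
```

Only unpickle files from sources you trust, since loading a pickle can execute arbitrary code.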