SlideShare una empresa de Scribd logo
1 de 21
Descargar para leer sin conexión
Lesson 12
Machine Learning - Regression Kush Kulshrestha
Introduction
It has long been known that crickets (an insect species) chirp more frequently on hotter days than on cooler days.
For decades, professional and amateur scientists have catalogued data on chirps-per-minute and temperature.
Using this data, you want to explore this relationship.
Introduction
As expected, the plot shows the temperature rising with the number of chirps. Is this relationship between chirps
and temperature linear?
Yes, you could draw a single straight line like the following to approximate this relationship: True, the line doesn't
pass through every dot, but the line does clearly show the relationship between chirps and temperature.
Introduction
Using the equation for a line, you could write down this
relationship as follows:
where,
- y is the temperature in Celsius—the value we're trying
to predict.
- m is the slope of the line.
- x is the number of chirps per minute—the value of our
input feature.
- b is the y-intercept
Introduction
By convention in machine learning, you'll write the above equation for a model slightly differently:
where,
- y’ is the predicted label (a desired output).
- b is the bias (the y-intercept), sometimes referred to as wo
- w1 is the weight of feature 1. Weight is the same concept as the "slope“ m in the traditional equation of the line.
- X1 is the feature.
To infer (predict) the temperature y’ for a new chirps-per-minute value x1, just substitute the x1 value into this
model.
Although this model uses only one feature, a more sophisticated model might rely on multiple features, each
having a separate weight (w1, w2 , etc.). For example, a model that relies on three features might look as follows:
What exactly is Training ?
Training a model simply means learning (determining) good values for all the weights and the bias from labelled
examples. In supervised learning, a machine learning algorithm builds a model by examining many examples and
attempting to find a model that minimizes loss.
Loss is the penalty for a bad prediction. That is, loss is a number indicating how bad the model's prediction was on
a single example. If the model's prediction is perfect, the loss is zero; otherwise, the loss is greater. The goal of
training a model is to find a set of weights and biases that have low loss, on average, across all examples.
Example, Figure 3 shows a high loss model on the left and a low loss model on the right.
Linear Regression
Linear regression is very good to answer the following questions:
• Is there a relationship between 2 variables?
• How strong is the relationship?
• Which variable contributes the most?
• How accurately can we estimate the effect of each variable?
• How accurately can we predict the target?
• Is the relationship linear?
• Is there an interaction effect?
Let’s assume we only have one variable and one target. Then, linear regression is expressed as:
In the equation above, the betas are the coefficients. These coefficients are what we need in order to make
predictions with our model.
To find the parameters, we need to minimize the least squares or the sum of squared errors.
Why do we use squared errors?
Linear Regression
In the below above, the red dots are the true data and the blue line is linear model. The grey lines illustrate the
errors between the predicted and the true values. The blue line is thus the one that minimizes the sum of the
squared length of the grey lines.
Linear Regression
After some math heavy lifting, you can finally estimate the coefficients with the following equations:
where x bar and y bar represent the mean.
Correlation coefficient
β1 can also be written as:
β1
where,
Sy and Sx are standard deviations of x and y values respectively, and r is the correlation coefficient defined as,
By examining the second equation for the estimated slope β1, we see that since sample standard deviations Sx and
Sy are positive quantities, the correlation coefficient (r), which is always between−1 and 1, measures how much x is
related to y and whether the trend is positive or negative. Figure 3.2 illustrates different correlation strengths.
Correlation coefficient
Figure below illustrates different correlation strengths.
An illustration of correlation strength. Each plot shows data with a particular correlation coefficient r. Values farther
than 0 (outside) indicate a stronger relationship than values closer to 0 (inside). Negative values (left) indicate an
inverse relationship, while positive values (right)indicate a direct relationship.
Coefficient of Determination
The square of the correlation coefficient, r^2 will always be positive and is called the coefficient of determination.
This also is equal to the proportion of the total variability that’s explained by a linear model.
As an extremely crucial remark, correlation does not imply causation!
Correlation and Causation
Just because there’s a strong correlation between two variables, there isn’t necessarily a causal relationship
between them.
For example, drowning deaths and ice-cream sales are strongly correlated, but that’s because both are affected by
the season (summer vs. winter). In general, there are several possible cases, as illustrated below:
1) Causal Link: Even if there is a causal link between x and y, correlation alone cannot tell us whether y causes x or x
causes y.
Correlation and Causation
2) Hidden Cause: A hidden variable z causes both x and y, creating the correlation.
3) Confounding Factor: A hidden variable z and x both affect y, so the results also depend on the value of z.
4) Coincidence: The correlation just happened by chance (e.g. the strong correlation between sun cycles and
number of Republicans in Congress)
Multiple Linear Regression
This is the case when instead of being single x value, we have a vector of x values (x1, x2, ……, xn) for every data
point i.
So, we have n data points (just like before), each with p different predictor variables or features. We’ll then try to
predict y for each data point as a linear function of the different x variables:
Even though it’s still linear, this representation is very versatile; here are just a few of thethings we can represent
with it:
• Multiple dependent variables: for example, suppose we’re trying to predict medical outcome as a function of
several variables such as age, genetic susceptibility, and clinical diagnosis. Then we might say that for each
patient, x1= age, x2= genetics, x3=diagnosis, and y= outcome.
• Nonlinearities: Suppose we want to predict a quadratic function y=ax^2 + bx + c, then for each data point we
might say x1= 1 , x2=x , and x3=x^2. This can easily be extended to any nonlinear function we want.
One may ask: why not just use multiple linear regression and fit an extremely high-degree polynomial to our data?
While the model then would be much richer, one runs the risk of overfitting
Multiple Linear Regression
Using too many features or too complex of a model can often lead to overfitting.
Suppose we want to fit a model to the points in Figure 1. If we fit a linear model, it might look like Figure 2. But, the
fit isn’t perfect. What if we use our newly acquired multiple regression powers to fit a 6th order polynomial to
these points? The result is shown in Figure 3
While our errors are definitely smaller than they were with the linear model, the new model is far too complex, and
will likely go wrong for values too far outside the range.
Application – Linear Regression in Scikit Learn
Please check out the jupyter notebook
Application – Linear Regression in Statsmodels
Please check out the jupyter notebook
Machine Learning Algorithm - Linear Regression
Machine Learning Algorithm - Linear Regression
Machine Learning Algorithm - Linear Regression

Más contenido relacionado

La actualidad más candente

Simple (and Simplistic) Introduction to Econometrics and Linear Regression
Simple (and Simplistic) Introduction to Econometrics and Linear RegressionSimple (and Simplistic) Introduction to Econometrics and Linear Regression
Simple (and Simplistic) Introduction to Econometrics and Linear RegressionPhilip Tiongson
 
Scatterplots, Correlation, and Regression
Scatterplots, Correlation, and RegressionScatterplots, Correlation, and Regression
Scatterplots, Correlation, and RegressionLong Beach City College
 
Scatter plots
Scatter plotsScatter plots
Scatter plotsswartzje
 
Displaying Distributions with Graphs
Displaying Distributions with GraphsDisplaying Distributions with Graphs
Displaying Distributions with Graphsnszakir
 
Graphs that Enlighten and Graphs that Deceive
Graphs that Enlighten and Graphs that DeceiveGraphs that Enlighten and Graphs that Deceive
Graphs that Enlighten and Graphs that DeceiveLong Beach City College
 
Scatter diagrams and correlation
Scatter diagrams and correlationScatter diagrams and correlation
Scatter diagrams and correlationkeithpeter
 
Chapter 2 part3-Least-Squares Regression
Chapter 2 part3-Least-Squares RegressionChapter 2 part3-Least-Squares Regression
Chapter 2 part3-Least-Squares Regressionnszakir
 
Scatter plot- Complete
Scatter plot- CompleteScatter plot- Complete
Scatter plot- CompleteIrfan Yaqoob
 
Machine Learning Algorithm - KNN
Machine Learning Algorithm - KNNMachine Learning Algorithm - KNN
Machine Learning Algorithm - KNNKush Kulshrestha
 
Quantitative Methods for Lawyers - Class #19 - Regression Analysis - Part 2
Quantitative Methods for Lawyers - Class #19 - Regression Analysis - Part 2Quantitative Methods for Lawyers - Class #19 - Regression Analysis - Part 2
Quantitative Methods for Lawyers - Class #19 - Regression Analysis - Part 2Daniel Katz
 
Quantitative Methods for Lawyers - Class #17 - Scatter Plots, Covariance, Cor...
Quantitative Methods for Lawyers - Class #17 - Scatter Plots, Covariance, Cor...Quantitative Methods for Lawyers - Class #17 - Scatter Plots, Covariance, Cor...
Quantitative Methods for Lawyers - Class #17 - Scatter Plots, Covariance, Cor...Daniel Katz
 
Geographical Skills
Geographical SkillsGeographical Skills
Geographical Skillsclemaitre
 
Quantitative Methods for Lawyers - Class #22 - Regression Analysis - Part 5
Quantitative Methods for Lawyers - Class #22 - Regression Analysis - Part 5Quantitative Methods for Lawyers - Class #22 - Regression Analysis - Part 5
Quantitative Methods for Lawyers - Class #22 - Regression Analysis - Part 5Daniel Katz
 
Quantitative Methods for Lawyers - Class #21 - Regression Analysis - Part 4
Quantitative Methods for Lawyers - Class #21 - Regression Analysis - Part 4Quantitative Methods for Lawyers - Class #21 - Regression Analysis - Part 4
Quantitative Methods for Lawyers - Class #21 - Regression Analysis - Part 4Daniel Katz
 
Machine learning session4(linear regression)
Machine learning   session4(linear regression)Machine learning   session4(linear regression)
Machine learning session4(linear regression)Abhimanyu Dwivedi
 

La actualidad más candente (19)

Simple (and Simplistic) Introduction to Econometrics and Linear Regression
Simple (and Simplistic) Introduction to Econometrics and Linear RegressionSimple (and Simplistic) Introduction to Econometrics and Linear Regression
Simple (and Simplistic) Introduction to Econometrics and Linear Regression
 
Scatterplots, Correlation, and Regression
Scatterplots, Correlation, and RegressionScatterplots, Correlation, and Regression
Scatterplots, Correlation, and Regression
 
Scatter plots
Scatter plotsScatter plots
Scatter plots
 
Displaying Distributions with Graphs
Displaying Distributions with GraphsDisplaying Distributions with Graphs
Displaying Distributions with Graphs
 
Graphs that Enlighten and Graphs that Deceive
Graphs that Enlighten and Graphs that DeceiveGraphs that Enlighten and Graphs that Deceive
Graphs that Enlighten and Graphs that Deceive
 
Scatter diagrams and correlation
Scatter diagrams and correlationScatter diagrams and correlation
Scatter diagrams and correlation
 
Scattergrams
ScattergramsScattergrams
Scattergrams
 
Chapter 2 part3-Least-Squares Regression
Chapter 2 part3-Least-Squares RegressionChapter 2 part3-Least-Squares Regression
Chapter 2 part3-Least-Squares Regression
 
Scatter plot- Complete
Scatter plot- CompleteScatter plot- Complete
Scatter plot- Complete
 
Machine Learning Algorithm - KNN
Machine Learning Algorithm - KNNMachine Learning Algorithm - KNN
Machine Learning Algorithm - KNN
 
Histograms
HistogramsHistograms
Histograms
 
Quantitative Methods for Lawyers - Class #19 - Regression Analysis - Part 2
Quantitative Methods for Lawyers - Class #19 - Regression Analysis - Part 2Quantitative Methods for Lawyers - Class #19 - Regression Analysis - Part 2
Quantitative Methods for Lawyers - Class #19 - Regression Analysis - Part 2
 
Quantitative Methods for Lawyers - Class #17 - Scatter Plots, Covariance, Cor...
Quantitative Methods for Lawyers - Class #17 - Scatter Plots, Covariance, Cor...Quantitative Methods for Lawyers - Class #17 - Scatter Plots, Covariance, Cor...
Quantitative Methods for Lawyers - Class #17 - Scatter Plots, Covariance, Cor...
 
Scatter Plots
Scatter PlotsScatter Plots
Scatter Plots
 
Geographical Skills
Geographical SkillsGeographical Skills
Geographical Skills
 
assignment 2
assignment 2assignment 2
assignment 2
 
Quantitative Methods for Lawyers - Class #22 - Regression Analysis - Part 5
Quantitative Methods for Lawyers - Class #22 - Regression Analysis - Part 5Quantitative Methods for Lawyers - Class #22 - Regression Analysis - Part 5
Quantitative Methods for Lawyers - Class #22 - Regression Analysis - Part 5
 
Quantitative Methods for Lawyers - Class #21 - Regression Analysis - Part 4
Quantitative Methods for Lawyers - Class #21 - Regression Analysis - Part 4Quantitative Methods for Lawyers - Class #21 - Regression Analysis - Part 4
Quantitative Methods for Lawyers - Class #21 - Regression Analysis - Part 4
 
Machine learning session4(linear regression)
Machine learning   session4(linear regression)Machine learning   session4(linear regression)
Machine learning session4(linear regression)
 

Similar a Machine Learning Algorithm - Linear Regression

REGRESSION ANALYSIS THEORY EXPLAINED HERE
REGRESSION ANALYSIS THEORY EXPLAINED HEREREGRESSION ANALYSIS THEORY EXPLAINED HERE
REGRESSION ANALYSIS THEORY EXPLAINED HEREShriramKargaonkar
 
Correlation and regression
Correlation and regressionCorrelation and regression
Correlation and regressionAntony Raj
 
Chapter III.pptx
Chapter III.pptxChapter III.pptx
Chapter III.pptxBeamlak5
 
The future is uncertain. Some events do have a very small probabil.docx
The future is uncertain. Some events do have a very small probabil.docxThe future is uncertain. Some events do have a very small probabil.docx
The future is uncertain. Some events do have a very small probabil.docxoreo10
 
For this assignment, use the aschooltest.sav dataset.The d
For this assignment, use the aschooltest.sav dataset.The dFor this assignment, use the aschooltest.sav dataset.The d
For this assignment, use the aschooltest.sav dataset.The dMerrileeDelvalle969
 
Data Science - Part XII - Ridge Regression, LASSO, and Elastic Nets
Data Science - Part XII - Ridge Regression, LASSO, and Elastic NetsData Science - Part XII - Ridge Regression, LASSO, and Elastic Nets
Data Science - Part XII - Ridge Regression, LASSO, and Elastic NetsDerek Kane
 
How to draw a good graph
How to draw a good graphHow to draw a good graph
How to draw a good graphTarun Gehlot
 
Requirements.docxRequirementsFont Times New RomanI NEED .docx
Requirements.docxRequirementsFont Times New RomanI NEED .docxRequirements.docxRequirementsFont Times New RomanI NEED .docx
Requirements.docxRequirementsFont Times New RomanI NEED .docxheunice
 
Maths A - Chapter 11
Maths A - Chapter 11Maths A - Chapter 11
Maths A - Chapter 11westy67968
 
Correlation and regression
Correlation and regressionCorrelation and regression
Correlation and regressionAntony Raj
 
correlation and regression
correlation and regressioncorrelation and regression
correlation and regressionKeyur Tejani
 
Chapter 2 Simple Linear Regression Model.pptx
Chapter 2 Simple Linear Regression Model.pptxChapter 2 Simple Linear Regression Model.pptx
Chapter 2 Simple Linear Regression Model.pptxaschalew shiferaw
 
Chapter 9 Regression
Chapter 9 RegressionChapter 9 Regression
Chapter 9 Regressionghalan
 
RegressionMaking predictions using dataLimitations.docx
RegressionMaking predictions using dataLimitations.docxRegressionMaking predictions using dataLimitations.docx
RegressionMaking predictions using dataLimitations.docxdebishakespeare
 
TTests.ppt
TTests.pptTTests.ppt
TTests.pptMUzair21
 

Similar a Machine Learning Algorithm - Linear Regression (20)

REGRESSION ANALYSIS THEORY EXPLAINED HERE
REGRESSION ANALYSIS THEORY EXPLAINED HEREREGRESSION ANALYSIS THEORY EXPLAINED HERE
REGRESSION ANALYSIS THEORY EXPLAINED HERE
 
Chapter 5
Chapter 5Chapter 5
Chapter 5
 
Correlation and regression
Correlation and regressionCorrelation and regression
Correlation and regression
 
Chapter III.pptx
Chapter III.pptxChapter III.pptx
Chapter III.pptx
 
The future is uncertain. Some events do have a very small probabil.docx
The future is uncertain. Some events do have a very small probabil.docxThe future is uncertain. Some events do have a very small probabil.docx
The future is uncertain. Some events do have a very small probabil.docx
 
For this assignment, use the aschooltest.sav dataset.The d
For this assignment, use the aschooltest.sav dataset.The dFor this assignment, use the aschooltest.sav dataset.The d
For this assignment, use the aschooltest.sav dataset.The d
 
Data Science - Part XII - Ridge Regression, LASSO, and Elastic Nets
Data Science - Part XII - Ridge Regression, LASSO, and Elastic NetsData Science - Part XII - Ridge Regression, LASSO, and Elastic Nets
Data Science - Part XII - Ridge Regression, LASSO, and Elastic Nets
 
Correlation
CorrelationCorrelation
Correlation
 
How to draw a good graph
How to draw a good graphHow to draw a good graph
How to draw a good graph
 
Requirements.docxRequirementsFont Times New RomanI NEED .docx
Requirements.docxRequirementsFont Times New RomanI NEED .docxRequirements.docxRequirementsFont Times New RomanI NEED .docx
Requirements.docxRequirementsFont Times New RomanI NEED .docx
 
Maths A - Chapter 11
Maths A - Chapter 11Maths A - Chapter 11
Maths A - Chapter 11
 
Correlation and regression
Correlation and regressionCorrelation and regression
Correlation and regression
 
correlation and regression
correlation and regressioncorrelation and regression
correlation and regression
 
Chapter 2 Simple Linear Regression Model.pptx
Chapter 2 Simple Linear Regression Model.pptxChapter 2 Simple Linear Regression Model.pptx
Chapter 2 Simple Linear Regression Model.pptx
 
Regression
RegressionRegression
Regression
 
Chapter 9 Regression
Chapter 9 RegressionChapter 9 Regression
Chapter 9 Regression
 
2-20-04.ppt
2-20-04.ppt2-20-04.ppt
2-20-04.ppt
 
RegressionMaking predictions using dataLimitations.docx
RegressionMaking predictions using dataLimitations.docxRegressionMaking predictions using dataLimitations.docx
RegressionMaking predictions using dataLimitations.docx
 
Regression
RegressionRegression
Regression
 
TTests.ppt
TTests.pptTTests.ppt
TTests.ppt
 

Más de Kush Kulshrestha

Machine Learning Algorithm - Decision Trees
Machine Learning Algorithm - Decision Trees Machine Learning Algorithm - Decision Trees
Machine Learning Algorithm - Decision Trees Kush Kulshrestha
 
Machine Learning Algorithm - Naive Bayes for Classification
Machine Learning Algorithm - Naive Bayes for ClassificationMachine Learning Algorithm - Naive Bayes for Classification
Machine Learning Algorithm - Naive Bayes for ClassificationKush Kulshrestha
 
Machine Learning Algorithm - Logistic Regression
Machine Learning Algorithm - Logistic RegressionMachine Learning Algorithm - Logistic Regression
Machine Learning Algorithm - Logistic RegressionKush Kulshrestha
 
Interpreting Regression Results - Machine Learning
Interpreting Regression Results - Machine LearningInterpreting Regression Results - Machine Learning
Interpreting Regression Results - Machine LearningKush Kulshrestha
 
General Concepts of Machine Learning
General Concepts of Machine LearningGeneral Concepts of Machine Learning
General Concepts of Machine LearningKush Kulshrestha
 
Performance Metrics for Machine Learning Algorithms
Performance Metrics for Machine Learning AlgorithmsPerformance Metrics for Machine Learning Algorithms
Performance Metrics for Machine Learning AlgorithmsKush Kulshrestha
 
Wireless Charging of Electric Vehicles
Wireless Charging of Electric VehiclesWireless Charging of Electric Vehicles
Wireless Charging of Electric VehiclesKush Kulshrestha
 
Handshakes and their types
Handshakes and their typesHandshakes and their types
Handshakes and their typesKush Kulshrestha
 

Más de Kush Kulshrestha (12)

Machine Learning Algorithm - Decision Trees
Machine Learning Algorithm - Decision Trees Machine Learning Algorithm - Decision Trees
Machine Learning Algorithm - Decision Trees
 
Machine Learning Algorithm - Naive Bayes for Classification
Machine Learning Algorithm - Naive Bayes for ClassificationMachine Learning Algorithm - Naive Bayes for Classification
Machine Learning Algorithm - Naive Bayes for Classification
 
Machine Learning Algorithm - Logistic Regression
Machine Learning Algorithm - Logistic RegressionMachine Learning Algorithm - Logistic Regression
Machine Learning Algorithm - Logistic Regression
 
Interpreting Regression Results - Machine Learning
Interpreting Regression Results - Machine LearningInterpreting Regression Results - Machine Learning
Interpreting Regression Results - Machine Learning
 
General Concepts of Machine Learning
General Concepts of Machine LearningGeneral Concepts of Machine Learning
General Concepts of Machine Learning
 
Performance Metrics for Machine Learning Algorithms
Performance Metrics for Machine Learning AlgorithmsPerformance Metrics for Machine Learning Algorithms
Performance Metrics for Machine Learning Algorithms
 
Inferential Statistics
Inferential StatisticsInferential Statistics
Inferential Statistics
 
Descriptive Statistics
Descriptive StatisticsDescriptive Statistics
Descriptive Statistics
 
Scaling and Normalization
Scaling and NormalizationScaling and Normalization
Scaling and Normalization
 
Wireless Charging of Electric Vehicles
Wireless Charging of Electric VehiclesWireless Charging of Electric Vehicles
Wireless Charging of Electric Vehicles
 
Time management
Time managementTime management
Time management
 
Handshakes and their types
Handshakes and their typesHandshakes and their types
Handshakes and their types
 

Último

Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiLow Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiSuhani Kapoor
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998YohFuh
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxfirstjob4
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxolyaivanovalion
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationshipsccctableauusergroup
 
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改atducpo
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130Suhani Kapoor
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFxolyaivanovalion
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz1
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Delhi Call girls
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxolyaivanovalion
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfLars Albertsson
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxolyaivanovalion
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusTimothy Spann
 
Brighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingBrighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingNeil Barnes
 
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiVIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiSuhani Kapoor
 

Último (20)

Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiLow Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998
 
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptx
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships
 
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFx
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptx
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdf
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFx
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
 
Brighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingBrighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data Storytelling
 
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiVIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
 

Machine Learning Algorithm - Linear Regression

  • 1. Lesson 12 Machine Learning - Regression Kush Kulshrestha
  • 2. Introduction It has long been known that crickets (an insect species) chirp more frequently on hotter days than on cooler days. For decades, professional and amateur scientists have catalogued data on chirps-per-minute and temperature. Using this data, you want to explore this relationship.
  • 3. Introduction As expected, the plot shows the temperature rising with the number of chirps. Is this relationship between chirps and temperature linear? Yes, you could draw a single straight line like the following to approximate this relationship: True, the line doesn't pass through every dot, but the line does clearly show the relationship between chirps and temperature.
  • 4. Introduction Using the equation for a line, you could write down this relationship as follows: where, - y is the temperature in Celsius—the value we're trying to predict. - m is the slope of the line. - x is the number of chirps per minute—the value of our input feature. - b is the y-intercept
  • 5. Introduction By convention in machine learning, you'll write the above equation for a model slightly differently: where, - y’ is the predicted label (a desired output). - b is the bias (the y-intercept), sometimes referred to as wo - w1 is the weight of feature 1. Weight is the same concept as the "slope“ m in the traditional equation of the line. - X1 is the feature. To infer (predict) the temperature y’ for a new chirps-per-minute value x1, just substitute the x1 value into this model. Although this model uses only one feature, a more sophisticated model might rely on multiple features, each having a separate weight (w1, w2 , etc.). For example, a model that relies on three features might look as follows:
  • 6. What exactly is Training ? Training a model simply means learning (determining) good values for all the weights and the bias from labelled examples. In supervised learning, a machine learning algorithm builds a model by examining many examples and attempting to find a model that minimizes loss. Loss is the penalty for a bad prediction. That is, loss is a number indicating how bad the model's prediction was on a single example. If the model's prediction is perfect, the loss is zero; otherwise, the loss is greater. The goal of training a model is to find a set of weights and biases that have low loss, on average, across all examples. Example, Figure 3 shows a high loss model on the left and a low loss model on the right.
  • 7. Linear Regression Linear regression is very good to answer the following questions: • Is there a relationship between 2 variables? • How strong is the relationship? • Which variable contributes the most? • How accurately can we estimate the effect of each variable? • How accurately can we predict the target? • Is the relationship linear? • Is there an interaction effect? Let’s assume we only have one variable and one target. Then, linear regression is expressed as: In the equation above, the betas are the coefficients. These coefficients are what we need in order to make predictions with our model. To find the parameters, we need to minimize the least squares or the sum of squared errors. Why do we use squared errors?
  • 8. Linear Regression In the below above, the red dots are the true data and the blue line is linear model. The grey lines illustrate the errors between the predicted and the true values. The blue line is thus the one that minimizes the sum of the squared length of the grey lines.
  • 9. Linear Regression After some math heavy lifting, you can finally estimate the coefficients with the following equations: where x bar and y bar represent the mean.
  • 10. Correlation coefficient β1 can also be written as: β1 where, Sy and Sx are standard deviations of x and y values respectively, and r is the correlation coefficient defined as, By examining the second equation for the estimated slope β1, we see that since sample standard deviations Sx and Sy are positive quantities, the correlation coefficient (r), which is always between−1 and 1, measures how much x is related to y and whether the trend is positive or negative. Figure 3.2 illustrates different correlation strengths.
  • 11. Correlation coefficient Figure below illustrates different correlation strengths. An illustration of correlation strength. Each plot shows data with a particular correlation coefficient r. Values farther than 0 (outside) indicate a stronger relationship than values closer to 0 (inside). Negative values (left) indicate an inverse relationship, while positive values (right)indicate a direct relationship.
  • 12. Coefficient of Determination The square of the correlation coefficient, r^2 will always be positive and is called the coefficient of determination. This also is equal to the proportion of the total variability that’s explained by a linear model. As an extremely crucial remark, correlation does not imply causation!
  • 13. Correlation and Causation Just because there’s a strong correlation between two variables, there isn’t necessarily a causal relationship between them. For example, drowning deaths and ice-cream sales are strongly correlated, but that’s because both are affected by the season (summer vs. winter). In general, there are several possible cases, as illustrated below: 1) Causal Link: Even if there is a causal link between x and y, correlation alone cannot tell us whether y causes x or x causes y.
  • 14. Correlation and Causation 2) Hidden Cause: A hidden variable z causes both x and y, creating the correlation. 3) Confounding Factor: A hidden variable z and x both affect y, so the results also depend on the value of z. 4) Coincidence: The correlation just happened by chance (e.g. the strong correlation between sun cycles and number of Republicans in Congress)
  • 15. Multiple Linear Regression This is the case when instead of being single x value, we have a vector of x values (x1, x2, ……, xn) for every data point i. So, we have n data points (just like before), each with p different predictor variables or features. We’ll then try to predict y for each data point as a linear function of the different x variables: Even though it’s still linear, this representation is very versatile; here are just a few of thethings we can represent with it: • Multiple dependent variables: for example, suppose we’re trying to predict medical outcome as a function of several variables such as age, genetic susceptibility, and clinical diagnosis. Then we might say that for each patient, x1= age, x2= genetics, x3=diagnosis, and y= outcome. • Nonlinearities: Suppose we want to predict a quadratic function y=ax^2 + bx + c, then for each data point we might say x1= 1 , x2=x , and x3=x^2. This can easily be extended to any nonlinear function we want. One may ask: why not just use multiple linear regression and fit an extremely high-degree polynomial to our data? While the model then would be much richer, one runs the risk of overfitting
  • 16. Multiple Linear Regression Using too many features or too complex of a model can often lead to overfitting. Suppose we want to fit a model to the points in Figure 1. If we fit a linear model, it might look like Figure 2. But, the fit isn’t perfect. What if we use our newly acquired multiple regression powers to fit a 6th order polynomial to these points? The result is shown in Figure 3 While our errors are definitely smaller than they were with the linear model, the new model is far too complex, and will likely go wrong for values too far outside the range.
  • 17. Application – Linear Regression in Scikit Learn Please check out the jupyter notebook
  • 18. Application – Linear Regression in Statsmodels Please check out the jupyter notebook