Chapter 5 Summarizing Bivariate Data
Terms
Scatterplots When one of the variables is considered to be a response variable (y) and the other an explanatory variable (x), the explanatory variable is usually plotted on the x axis.
Example A sample of one-way Greyhound bus fares from Rochester, NY to cities less than 750 miles away was taken by going to Greyhound’s website. The following table gives the destination city, the distance and the one-way fare. Distance should be on the x axis and Fare should be on the y axis.
Example Scatterplot
Comments
Further Comments
Association
The Pearson Correlation Coefficient
Example Calculation
Some Correlation Pictures
Properties of r
An Interesting Example Consider the following bivariate data set:
An Interesting Example Computing the Pearson  correlation  coefficient, we find that r = 0.001
An Interesting Example With a sample Pearson correlation coefficient, r = 0.001, one would note that there seems to be little or no linearity to the relationship between x and y. Be careful that you do not infer that there is no relationship between x and y.
An Interesting Example Note (below) that there appears to be an almost perfect quadratic relationship between x and y when the scatterplot is drawn.
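The point of this example can be reproduced with a small sketch. The data below are hypothetical (the slide's own data set is not reproduced here): y is exactly x squared on values of x placed symmetrically about zero. The helper `pearson_r` is an illustrative name, and it uses one common form of the sample Pearson coefficient, the sum of the products of z-scores divided by n - 1.

```python
import math

def pearson_r(xs, ys):
    """Sample Pearson correlation: sum of z-score products divided by n - 1."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sample standard deviations (divisor n - 1).
    s_x = math.sqrt(sum((x - x_bar) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - y_bar) ** 2 for y in ys) / (n - 1))
    z_products = (((x - x_bar) / s_x) * ((y - y_bar) / s_y) for x, y in zip(xs, ys))
    return sum(z_products) / (n - 1)

# Hypothetical data, not the slide's data set: a perfect quadratic relationship.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r = pearson_r(xs, ys)
print(f"r = {r:.3f}")
```

Here y is completely determined by x, yet r is essentially zero, because r measures only the strength of the linear relationship.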
Linear Relations
Example y = 7 + 3x: the intercept is a = 7, and when x increases by 1, y increases by b = 3.
Example y = 17 - 4x: the intercept is a = 17, and when x increases by 1, y changes by b = -4 (i.e., decreases by 4).
Least Squares Line The most widely used criterion for measuring the goodness of fit of a line y = a + bx to bivariate data $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ is the sum of the squared deviations about the line: $\sum [y - (a + bx)]^2 = [y_1 - (a + bx_1)]^2 + \cdots + [y_n - (a + bx_n)]^2$. The line that gives the best fit to the data is the one that minimizes this sum; it is called the least squares line or sample regression line.
Coefficients a and b The slope of the least squares line is $b = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2}$ and the y intercept is $a = \bar{y} - b\bar{x}$. We write the equation of the least squares line as $\hat{y} = a + bx$, where the ^ above y emphasizes that $\hat{y}$ (read as y-hat) is a prediction of y resulting from the substitution of a particular x value into the equation.
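A minimal sketch of the slope and intercept computation, using the standard least squares formulas: the slope is the sum of the products of the deviations of x and y from their means divided by the sum of squared deviations of x, and the intercept is the mean of y minus the slope times the mean of x. The function name `least_squares` and the sample points are illustrative assumptions; the points are chosen to lie exactly on the line y = 7 + 3x from the earlier example.

```python
def least_squares(xs, ys):
    """Return (a, b) for the least squares line y-hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = s_xy / s_xx          # slope
    a = y_bar - b * x_bar    # y intercept
    return a, b

# Illustrative points lying exactly on y = 7 + 3x.
xs = [0, 1, 2, 3, 4]
ys = [7 + 3 * x for x in xs]

a, b = least_squares(xs, ys)
print(f"y-hat = {a:.1f} + {b:.1f}x")
```

Because the points fall exactly on a line, the fitted line recovers that line; with real data the fit only minimizes the sum of squared deviations.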
Calculating Formula for b
Greyhound Example Continued
Calculations From the previous slide, we have
Minitab Graph The following graph is a copy of the output from a Minitab command to graph the regression line.
Greyhound Example Revisited
Greyhound Example Revisited Using the calculation formula we have: Notice that we get the same result.
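The agreement between the two ways of computing the slope can be checked with a short sketch. The calculation formula is assumed here to be the usual textbook computational form, b = [Σxy - (Σx)(Σy)/n] / [Σx² - (Σx)²/n], since the slide's own formula image is not available. The distances and fares are hypothetical, not the Greyhound sample.

```python
def slope_definition(xs, ys):
    """Slope from deviations about the means."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    return s_xy / s_xx

def slope_computational(xs, ys):
    """Slope from raw sums (assumed textbook computational form)."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    return (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)

# Hypothetical distances (miles) and fares (dollars), for illustration only.
distances = [100, 200, 300, 400, 500]
fares = [25, 40, 55, 72, 85]

b_def = slope_definition(distances, fares)
b_comp = slope_computational(distances, fares)
print(f"definition: {b_def:.4f}, computational: {b_comp:.4f}")
```

The computational form avoids computing each deviation, which matters for hand calculation; in floating-point code the deviation form is usually the numerically safer choice.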
Three Important Questions
Terminology
Greyhound Example Continued
Residual Plot
Residual Plot - What to look for. This residual plot indicates no systematic bias in using the least squares line to predict the y value. Generally this is the kind of pattern that you would like to see.
The Greyhound example continued For the Greyhound example, it appears that the line systematically predicts fares that are too high for cities close to Rochester and fares that are too low for most cities between 200 and 500 miles away.
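As a sketch of how a residual is read, the snippet below takes the intercept and slope from the Minitab output reported in this deck (10.138 and 0.15018) and computes residual = observed fare - predicted fare. The (distance, fare) pairs and the function names are hypothetical, since the original data table is not reproduced here.

```python
# Coefficients reported in the deck's Minitab output for the 13-city sample.
A = 10.138    # intercept, dollars
B = 0.15018   # slope, dollars per mile

def predicted_fare(distance):
    """Fare predicted by the least squares line for a given distance."""
    return A + B * distance

def residual(distance, fare):
    """Observed minus predicted; a negative value means the line predicts too high."""
    return fare - predicted_fare(distance)

# Hypothetical (distance in miles, fare in dollars) pairs, for illustration only.
observations = [(100, 20.0), (300, 60.0), (600, 98.0)]
for distance, fare in observations:
    e = residual(distance, fare)
    direction = "too high" if e < 0 else "too low"
    print(f"{distance:4d} miles: residual {e:+.2f} (prediction {direction})")
```

Plotting these residuals against distance gives the residual plot; a run of same-sign residuals over a range of distances is the kind of systematic pattern described above.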
More Residual Plots
Definition formulae
Calculational formulae SSTo  and  SSResid  are generally found as part of the standard output from most statistical packages or can be obtained using the following computational formulas:
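A sketch comparing the two routes to SSTo and SSResid. The formulas are assumed to be the standard textbook ones, since the slide images are not available: by definition SSTo = Σ(y - ȳ)² and SSResid = Σ(y - ŷ)², and computationally SSTo = Σy² - (Σy)²/n and SSResid = Σy² - aΣy - bΣxy. The data are hypothetical.

```python
def least_squares(xs, ys):
    """Return (a, b) for the least squares line y-hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    return y_bar - b * x_bar, b

# Hypothetical distances (miles) and fares (dollars), for illustration only.
xs = [100, 200, 300, 400, 500]
ys = [25, 40, 55, 72, 85]
n = len(xs)
a, b = least_squares(xs, ys)

# Definition formulas.
y_bar = sum(ys) / n
ssto_def = sum((y - y_bar) ** 2 for y in ys)
ssresid_def = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Computational formulas (valid only when a and b are the least squares values).
sum_y = sum(ys)
sum_y2 = sum(y * y for y in ys)
sum_xy = sum(x * y for x, y in zip(xs, ys))
ssto_comp = sum_y2 - sum_y ** 2 / n
ssresid_comp = sum_y2 - a * sum_y - b * sum_xy

print(f"SSTo:    {ssto_def:.2f} vs {ssto_comp:.2f}")
print(f"SSResid: {ssresid_def:.2f} vs {ssresid_comp:.2f}")
```

The computational form of SSResid subtracts large, nearly equal quantities, so rounding a and b too early can distort the result; carrying extra digits avoids this.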
Coefficient of Determination
Greyhound Example Revisited
Greyhound Example Revisited We can say that 93.5% of the variation in the Fare (y) can be attributed to the least squares linear relationship between distance (x) and fare.
More on variability The standard deviation about the least squares line is denoted $s_e$ and given by $s_e = \sqrt{SSResid / (n - 2)}$. $s_e$ is interpreted as the “typical” amount by which an observation deviates from the least squares line.
Greyhound Example Revisited The “typical” deviation of actual fare from the prediction is $6.80.
Minitab output for Regression
Regression Analysis: Standard Fare versus Distance
The regression equation is
Standard Fare = 10.1 + 0.150 Distance

Predictor   Coef      SE Coef   T       P
Constant    10.138    4.327     2.34    0.039
Distance    0.15018   0.01196   12.56   0.000

S = 6.803   R-Sq = 93.5%   R-Sq(adj) = 92.9%

Analysis of Variance
Source           DF   SS       MS       F        P
Regression        1   7298.1   7298.1   157.68   0.000
Residual Error   11    509.1     46.3
Total            12   7807.2

In this output, the least squares regression line has a = 10.138 and b = 0.15018; s_e = 6.803, r^2 = 93.5%, SSResid = 509.1 and SSTo = 7807.2.
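The summary values in this output can be cross-checked from the Analysis of Variance table alone. The sketch below assumes the usual relationships r² = 1 - SSResid/SSTo and s_e = sqrt(SSResid/(n - 2)), with n = 13 inferred from the Total degrees of freedom (12); the variable names are illustrative.

```python
import math

# Values read from the Analysis of Variance table in the Minitab output.
ss_total = 7807.2   # SSTo (Total SS)
ss_resid = 509.1    # SSResid (Residual Error SS)
n = 13              # Total DF (12) plus 1

r_squared = 1 - ss_resid / ss_total
s_e = math.sqrt(ss_resid / (n - 2))

print(f"r^2 = {r_squared:.3f}")
print(f"s_e = {s_e:.3f}")
```

These should agree with the R-Sq = 93.5% and S = 6.803 lines of the output.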
The Greyhound problem  with additional data The sample of fares and mileages from Rochester was extended to cover a total of 20 cities throughout the country. The resulting data and a scatterplot are given on the next few slides.
Extended Greyhound Fare Example
Extended Greyhound Fare Example
Extended Greyhound Fare Example Minitab reports the correlation coefficient r = 0.921, R² = 0.849, s_e = $17.42 and the regression line Standard Fare = 46.058 + 0.043535 Distance. Notice that even though the correlation coefficient is reasonably high and 84.9% of the variation in the Fare is explained, the linear model is not very usable.
Nonlinear Regression Example
Nonlinear Regression Example From the previous slide we can see that the plot does not look linear; it appears to have a curved shape. We sometimes replace one or both of the variables with a transformation of that variable and then perform a linear regression on the transformed variables. This can sometimes lead to a useful prediction equation. For this particular data, the shape of the curve is almost logarithmic, so we might try to replace the distance with log10(distance), the base 10 logarithm of the distance.
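The transform-then-regress idea can be sketched as follows: take log10 of each distance, fit an ordinary least squares line to (log10(distance), fare), and predict by transforming a new distance the same way. The data and function names are hypothetical and chosen only so that the arithmetic is easy to follow; they are not the 20-city sample.

```python
import math

def least_squares(xs, ys):
    """Return (a, b) for the least squares line y-hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    return y_bar - b * x_bar, b

# Hypothetical data, for illustration only: fares grow with log10(distance).
distances = [10, 100, 1000, 10000]
fares = [20.0, 50.0, 80.0, 110.0]

# Step 1: transform the explanatory variable.
log_distances = [math.log10(d) for d in distances]

# Step 2: ordinary linear regression on the transformed variable.
a, b = least_squares(log_distances, fares)

def predict_fare(distance):
    """Step 3: apply the same transformation before using the fitted line."""
    return a + b * math.log10(distance)

print(f"fare-hat = {a:.1f} + {b:.1f} * log10(distance)")
print(f"predicted fare at 500 miles: {predict_fare(500):.2f}")
```

The fitted equation is linear in log10(distance) but curved when drawn against distance itself, which is what lets it follow a logarithmic-looking scatterplot.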
Nonlinear Regression Example Minitab provides the following output.
Regression Analysis: Standard Fare versus Log10(Distance)
The regression equation is
Standard Fare = - 163 + 91.0 Log10(Distance)

Predictor   Coef      SE Coef   T        P
Constant    -163.25   10.59     -15.41   0.000
Log10(Di    91.039    3.826     23.80    0.000

S = 7.869   R-Sq = 96.9%   R-Sq(adj) = 96.7%

The r^2 is high: 96.9% of the variation is attributed to the model. The typical error, $7.87, is reasonably good.
Nonlinear Regression Example The rest of the Minitab output follows.
Analysis of Variance
Source           DF   SS      MS      F        P
Regression        1   35068   35068   566.30   0.000
Residual Error   18    1115      62
Total            19   36183

Unusual Observations
Obs   Log10(Di   Standard   Fit      SE Fit   Residual   St Resid
11    3.17       109.00     125.32   2.43     -16.32     -2.18R

R denotes an observation with a large standardized residual.
The only outlier is Orlando and, as you’ll see from the next two slides, it is not too bad.
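The fit and residual listed for observation 11 can be approximately reproduced from the reported coefficients. This sketch uses the rounded value Log10(Distance) = 3.17 shown in the table, so the results differ from Minitab's figures in the second decimal place; the variable names are illustrative.

```python
# Coefficients from the Minitab output for Standard Fare versus Log10(Distance).
A = -163.25   # intercept
B = 91.039    # slope per unit of log10(distance)

# Observation 11 as listed under "Unusual Observations".
log10_distance = 3.17    # rounded in the output
observed_fare = 109.00

fit = A + B * log10_distance
residual = observed_fare - fit

print(f"fit = {fit:.2f}, residual = {residual:.2f}")
```

The negative residual says the model predicts a fare higher than the one actually charged for this city.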
Nonlinear Regression Example Looking at the plot of the residuals against distance, we see some problems. The model overestimates fares for middle distances (1000 to 2000 miles) and underestimates them for longer distances (more than 2000 miles).
Nonlinear Regression Example When we look at how the prediction curve looks on a graph that has the Standard Fare and log10(Distance) axes, we see the result looks reasonably linear.
Nonlinear Regression Example When we look at how the prediction curve looks on a graph that has the Standard Fare and Distance axes, we see the result appears to work fairly well. By and large, this prediction model for the fares appears to work reasonably well.

Más contenido relacionado

La actualidad más candente

PG STAT 531 Lecture 3 Graphical and Diagrammatic Representation of Data
PG STAT 531 Lecture 3 Graphical and Diagrammatic Representation of DataPG STAT 531 Lecture 3 Graphical and Diagrammatic Representation of Data
PG STAT 531 Lecture 3 Graphical and Diagrammatic Representation of DataAashish Patel
 
Diagnostic methods for Building the regression model
Diagnostic methods for Building the regression modelDiagnostic methods for Building the regression model
Diagnostic methods for Building the regression modelMehdi Shayegani
 
Linear regression analysis
Linear regression analysisLinear regression analysis
Linear regression analysisNimrita Koul
 
Regression Analysis presentation by Al Arizmendez and Cathryn Lottier
Regression Analysis presentation by Al Arizmendez and Cathryn LottierRegression Analysis presentation by Al Arizmendez and Cathryn Lottier
Regression Analysis presentation by Al Arizmendez and Cathryn LottierAl Arizmendez
 
Chapter 2 part3-Least-Squares Regression
Chapter 2 part3-Least-Squares RegressionChapter 2 part3-Least-Squares Regression
Chapter 2 part3-Least-Squares Regressionnszakir
 
Chapter 4 part1-Probability Model
Chapter 4 part1-Probability ModelChapter 4 part1-Probability Model
Chapter 4 part1-Probability Modelnszakir
 
Quantitative Methods for Lawyers - Class #17 - Scatter Plots, Covariance, Cor...
Quantitative Methods for Lawyers - Class #17 - Scatter Plots, Covariance, Cor...Quantitative Methods for Lawyers - Class #17 - Scatter Plots, Covariance, Cor...
Quantitative Methods for Lawyers - Class #17 - Scatter Plots, Covariance, Cor...Daniel Katz
 
Presentation On Regression
Presentation On RegressionPresentation On Regression
Presentation On Regressionalok tiwari
 
Testing a Claim About a Standard Deviation or Variance
Testing a Claim About a Standard Deviation or VarianceTesting a Claim About a Standard Deviation or Variance
Testing a Claim About a Standard Deviation or VarianceLong Beach City College
 

La actualidad más candente (20)

Simple linear regression
Simple linear regressionSimple linear regression
Simple linear regression
 
Chapter9
Chapter9Chapter9
Chapter9
 
Linear regression
Linear regressionLinear regression
Linear regression
 
PG STAT 531 Lecture 3 Graphical and Diagrammatic Representation of Data
PG STAT 531 Lecture 3 Graphical and Diagrammatic Representation of DataPG STAT 531 Lecture 3 Graphical and Diagrammatic Representation of Data
PG STAT 531 Lecture 3 Graphical and Diagrammatic Representation of Data
 
Diagnostic methods for Building the regression model
Diagnostic methods for Building the regression modelDiagnostic methods for Building the regression model
Diagnostic methods for Building the regression model
 
Chapter13
Chapter13Chapter13
Chapter13
 
Chapter14
Chapter14Chapter14
Chapter14
 
Chapter7
Chapter7Chapter7
Chapter7
 
Probability concept and Probability distribution_Contd
Probability concept and Probability distribution_ContdProbability concept and Probability distribution_Contd
Probability concept and Probability distribution_Contd
 
Chapter15
Chapter15Chapter15
Chapter15
 
Ridge regression
Ridge regressionRidge regression
Ridge regression
 
Linear regression analysis
Linear regression analysisLinear regression analysis
Linear regression analysis
 
Regression Analysis presentation by Al Arizmendez and Cathryn Lottier
Regression Analysis presentation by Al Arizmendez and Cathryn LottierRegression Analysis presentation by Al Arizmendez and Cathryn Lottier
Regression Analysis presentation by Al Arizmendez and Cathryn Lottier
 
Inferences about Two Proportions
 Inferences about Two Proportions Inferences about Two Proportions
Inferences about Two Proportions
 
Chapter 2 part3-Least-Squares Regression
Chapter 2 part3-Least-Squares RegressionChapter 2 part3-Least-Squares Regression
Chapter 2 part3-Least-Squares Regression
 
Regression analysis
Regression analysisRegression analysis
Regression analysis
 
Chapter 4 part1-Probability Model
Chapter 4 part1-Probability ModelChapter 4 part1-Probability Model
Chapter 4 part1-Probability Model
 
Quantitative Methods for Lawyers - Class #17 - Scatter Plots, Covariance, Cor...
Quantitative Methods for Lawyers - Class #17 - Scatter Plots, Covariance, Cor...Quantitative Methods for Lawyers - Class #17 - Scatter Plots, Covariance, Cor...
Quantitative Methods for Lawyers - Class #17 - Scatter Plots, Covariance, Cor...
 
Presentation On Regression
Presentation On RegressionPresentation On Regression
Presentation On Regression
 
Testing a Claim About a Standard Deviation or Variance
Testing a Claim About a Standard Deviation or VarianceTesting a Claim About a Standard Deviation or Variance
Testing a Claim About a Standard Deviation or Variance
 

Destacado

Fitting polynomial data
Fitting polynomial dataFitting polynomial data
Fitting polynomial dataBart Lauwers
 
Dissertation Paper
Dissertation PaperDissertation Paper
Dissertation PaperJames McCabe
 
Multivariate Techniques
Multivariate TechniquesMultivariate Techniques
Multivariate TechniquesTerry Chaney
 
SAPC 2009 - Patient satisfaction with Primary Care
SAPC 2009 - Patient satisfaction with Primary CareSAPC 2009 - Patient satisfaction with Primary Care
SAPC 2009 - Patient satisfaction with Primary CareEvangelos Kontopantelis
 
Chapt 11 & 12 linear & multiple regression minitab
Chapt 11 & 12 linear &  multiple regression minitabChapt 11 & 12 linear &  multiple regression minitab
Chapt 11 & 12 linear & multiple regression minitabBoyu Deng
 
A dissertation report on analysis of patient satisfaction max polyclinic by ...
A  dissertation report on analysis of patient satisfaction max polyclinic by ...A  dissertation report on analysis of patient satisfaction max polyclinic by ...
A dissertation report on analysis of patient satisfaction max polyclinic by ...Mohammed Yaser Hussain
 
Stepwise Logistic Regression - Lecture for Students /Faculty of Mathematics a...
Stepwise Logistic Regression - Lecture for Students /Faculty of Mathematics a...Stepwise Logistic Regression - Lecture for Students /Faculty of Mathematics a...
Stepwise Logistic Regression - Lecture for Students /Faculty of Mathematics a...Alexander Efremov
 
Multinomial Logistic Regression with Apache Spark
Multinomial Logistic Regression with Apache SparkMultinomial Logistic Regression with Apache Spark
Multinomial Logistic Regression with Apache SparkDB Tsai
 
Logistic regression with SPSS examples
Logistic regression with SPSS examplesLogistic regression with SPSS examples
Logistic regression with SPSS examplesGaurav Kamboj
 
Intro to Classification: Logistic Regression & SVM
Intro to Classification: Logistic Regression & SVMIntro to Classification: Logistic Regression & SVM
Intro to Classification: Logistic Regression & SVMNYC Predictive Analytics
 
Methods of multivariate analysis
Methods of multivariate analysisMethods of multivariate analysis
Methods of multivariate analysisharamaya university
 
Multivariate Analysis An Overview
Multivariate Analysis An OverviewMultivariate Analysis An Overview
Multivariate Analysis An Overviewguest3311ed
 
Multivariate Analysis Techniques
Multivariate Analysis TechniquesMultivariate Analysis Techniques
Multivariate Analysis TechniquesMehul Gondaliya
 
Multiple linear regression
Multiple linear regressionMultiple linear regression
Multiple linear regressionJames Neill
 
Logistic regression
Logistic regressionLogistic regression
Logistic regressionDrZahid Khan
 
Slideshare Powerpoint presentation
Slideshare Powerpoint presentationSlideshare Powerpoint presentation
Slideshare Powerpoint presentationelliehood
 

Destacado (19)

Fitting polynomial data
Fitting polynomial dataFitting polynomial data
Fitting polynomial data
 
Dissertation Paper
Dissertation PaperDissertation Paper
Dissertation Paper
 
Multivariate Techniques
Multivariate TechniquesMultivariate Techniques
Multivariate Techniques
 
SAPC 2009 - Patient satisfaction with Primary Care
SAPC 2009 - Patient satisfaction with Primary CareSAPC 2009 - Patient satisfaction with Primary Care
SAPC 2009 - Patient satisfaction with Primary Care
 
Chapt 11 & 12 linear & multiple regression minitab
Chapt 11 & 12 linear &  multiple regression minitabChapt 11 & 12 linear &  multiple regression minitab
Chapt 11 & 12 linear & multiple regression minitab
 
A dissertation report on analysis of patient satisfaction max polyclinic by ...
A  dissertation report on analysis of patient satisfaction max polyclinic by ...A  dissertation report on analysis of patient satisfaction max polyclinic by ...
A dissertation report on analysis of patient satisfaction max polyclinic by ...
 
Stepwise Logistic Regression - Lecture for Students /Faculty of Mathematics a...
Stepwise Logistic Regression - Lecture for Students /Faculty of Mathematics a...Stepwise Logistic Regression - Lecture for Students /Faculty of Mathematics a...
Stepwise Logistic Regression - Lecture for Students /Faculty of Mathematics a...
 
Multinomial Logistic Regression with Apache Spark
Multinomial Logistic Regression with Apache SparkMultinomial Logistic Regression with Apache Spark
Multinomial Logistic Regression with Apache Spark
 
Logistic regression
Logistic regressionLogistic regression
Logistic regression
 
Multivariate Analysis
Multivariate AnalysisMultivariate Analysis
Multivariate Analysis
 
Logistic regression with SPSS examples
Logistic regression with SPSS examplesLogistic regression with SPSS examples
Logistic regression with SPSS examples
 
Intro to Classification: Logistic Regression & SVM
Intro to Classification: Logistic Regression & SVMIntro to Classification: Logistic Regression & SVM
Intro to Classification: Logistic Regression & SVM
 
Methods of multivariate analysis
Methods of multivariate analysisMethods of multivariate analysis
Methods of multivariate analysis
 
Logistic regression
Logistic regressionLogistic regression
Logistic regression
 
Multivariate Analysis An Overview
Multivariate Analysis An OverviewMultivariate Analysis An Overview
Multivariate Analysis An Overview
 
Multivariate Analysis Techniques
Multivariate Analysis TechniquesMultivariate Analysis Techniques
Multivariate Analysis Techniques
 
Multiple linear regression
Multiple linear regressionMultiple linear regression
Multiple linear regression
 
Logistic regression
Logistic regressionLogistic regression
Logistic regression
 
Slideshare Powerpoint presentation
Slideshare Powerpoint presentationSlideshare Powerpoint presentation
Slideshare Powerpoint presentation
 

Similar a Chapter05

Simple lin regress_inference
Simple lin regress_inferenceSimple lin regress_inference
Simple lin regress_inferenceKemal İnciroğlu
 
Corr-and-Regress (1).ppt
Corr-and-Regress (1).pptCorr-and-Regress (1).ppt
Corr-and-Regress (1).pptMuhammadAftab89
 
Cr-and-Regress.ppt
Cr-and-Regress.pptCr-and-Regress.ppt
Cr-and-Regress.pptRidaIrfan10
 
Corr-and-Regress.ppt
Corr-and-Regress.pptCorr-and-Regress.ppt
Corr-and-Regress.pptkrunal soni
 
Corr-and-Regress.ppt
Corr-and-Regress.pptCorr-and-Regress.ppt
Corr-and-Regress.pptMoinPasha12
 
Correlation & Regression for Statistics Social Science
Correlation & Regression for Statistics Social ScienceCorrelation & Regression for Statistics Social Science
Correlation & Regression for Statistics Social Sciencessuser71ac73
 
Correlation and regression
Correlation and regressionCorrelation and regression
Correlation and regressionAntony Raj
 
Exploring bivariate data
Exploring bivariate dataExploring bivariate data
Exploring bivariate dataUlster BOCES
 
Correlation Analysis PRESENTED.pptx
Correlation Analysis PRESENTED.pptxCorrelation Analysis PRESENTED.pptx
Correlation Analysis PRESENTED.pptxHaimanotReta
 
Linear regression and correlation analysis ppt @ bec doms
Linear regression and correlation analysis ppt @ bec domsLinear regression and correlation analysis ppt @ bec doms
Linear regression and correlation analysis ppt @ bec domsBabasab Patil
 
ML-UNIT-IV complete notes download here
ML-UNIT-IV  complete notes download hereML-UNIT-IV  complete notes download here
ML-UNIT-IV complete notes download herekeerthanakshatriya20
 

Similar a Chapter05 (20)

Corr And Regress
Corr And RegressCorr And Regress
Corr And Regress
 
Simple lin regress_inference
Simple lin regress_inferenceSimple lin regress_inference
Simple lin regress_inference
 
Corr-and-Regress (1).ppt
Corr-and-Regress (1).pptCorr-and-Regress (1).ppt
Corr-and-Regress (1).ppt
 
Corr-and-Regress.ppt
Corr-and-Regress.pptCorr-and-Regress.ppt
Corr-and-Regress.ppt
 
Cr-and-Regress.ppt
Cr-and-Regress.pptCr-and-Regress.ppt
Cr-and-Regress.ppt
 
Corr-and-Regress.ppt
Corr-and-Regress.pptCorr-and-Regress.ppt
Corr-and-Regress.ppt
 
Corr-and-Regress.ppt
Corr-and-Regress.pptCorr-and-Regress.ppt
Corr-and-Regress.ppt
 
Corr-and-Regress.ppt
Corr-and-Regress.pptCorr-and-Regress.ppt
Corr-and-Regress.ppt
 
Correlation & Regression for Statistics Social Science
Correlation & Regression for Statistics Social ScienceCorrelation & Regression for Statistics Social Science
Correlation & Regression for Statistics Social Science
 
regression.pptx
regression.pptxregression.pptx
regression.pptx
 
Correlation and regression
Correlation and regressionCorrelation and regression
Correlation and regression
 
Exploring bivariate data
Exploring bivariate dataExploring bivariate data
Exploring bivariate data
 
Chap04 01
Chap04 01Chap04 01
Chap04 01
 
Math n Statistic
Math n StatisticMath n Statistic
Math n Statistic
 
Correlation Analysis PRESENTED.pptx
Correlation Analysis PRESENTED.pptxCorrelation Analysis PRESENTED.pptx
Correlation Analysis PRESENTED.pptx
 
Linear regression and correlation analysis ppt @ bec doms
Linear regression and correlation analysis ppt @ bec domsLinear regression and correlation analysis ppt @ bec doms
Linear regression and correlation analysis ppt @ bec doms
 
Correlation and Regression
Correlation and Regression Correlation and Regression
Correlation and Regression
 
Regression
RegressionRegression
Regression
 
ML-UNIT-IV complete notes download here
ML-UNIT-IV  complete notes download hereML-UNIT-IV  complete notes download here
ML-UNIT-IV complete notes download here
 
9. parametric regression
9. parametric regression9. parametric regression
9. parametric regression
 

Más de rwmiller

Más de rwmiller (11)

Chapter06
Chapter06Chapter06
Chapter06
 
Chapter12
Chapter12Chapter12
Chapter12
 
Chapter10
Chapter10Chapter10
Chapter10
 
Chapter09
Chapter09Chapter09
Chapter09
 
Chapter07
Chapter07Chapter07
Chapter07
 
Chapter04
Chapter04Chapter04
Chapter04
 
Chapter03
Chapter03Chapter03
Chapter03
 
Chapter02
Chapter02Chapter02
Chapter02
 
Chapter01
Chapter01Chapter01
Chapter01
 
Chapter02
Chapter02Chapter02
Chapter02
 
Chapter01
Chapter01Chapter01
Chapter01
 

Último

Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfchloefrazer622
 
Class 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdfClass 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdfAyushMahapatra5
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Sapana Sha
 
9548086042 for call girls in Indira Nagar with room service
9548086042  for call girls in Indira Nagar  with room service9548086042  for call girls in Indira Nagar  with room service
9548086042 for call girls in Indira Nagar with room servicediscovermytutordmt
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdfQucHHunhnh
 
social pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajansocial pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajanpragatimahajan3
 
General AI for Medical Educators April 2024
General AI for Medical Educators April 2024General AI for Medical Educators April 2024
General AI for Medical Educators April 2024Janet Corral
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...EduSkills OECD
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfJayanti Pande
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfAdmir Softic
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Krashi Coaching
 
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...Sapna Thakur
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactPECB
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13Steve Thomason
 
Measures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDMeasures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDThiyagu K
 
Disha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdfDisha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdfchloefrazer622
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104misteraugie
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfciinovamais
 

Último (20)

Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdf
 
Class 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdfClass 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdf
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
 
9548086042 for call girls in Indira Nagar with room service
9548086042  for call girls in Indira Nagar  with room service9548086042  for call girls in Indira Nagar  with room service
9548086042 for call girls in Indira Nagar with room service
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
social pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajansocial pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajan
 
General AI for Medical Educators April 2024
General AI for Medical Educators April 2024General AI for Medical Educators April 2024
General AI for Medical Educators April 2024
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdf
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
 
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global Impact
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13
 
Measures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDMeasures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SD
 
Disha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdfDisha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdf
 
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104
 
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptxINDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
 

Chapter05

  • 1. Chapter 5 Summarizing Bivariate Data
  • 2.
  • 3.
  • 4.
  • 5. Example A sample of one-way Greyhound bus fares from Rochester, NY to cities less than 750 miles was taken by going to Greyhound’s website. The following table gives the destination city, the distance and the one-way fare. Distance should be the x axis and the Fare should be the y axis.
  • 7.
  • 8.
  • 9.
  • 10.
  • 18.
  • 19. An Interesting Example Consider the following bivariate data set:
  • 20. An Interesting Example Computing the Pearson correlation coefficient, we find that r = 0.001
  • 21. An Interesting Example With a sample Pearson correlation coefficient, r = 0.001, one would note that there seems to be little or no linearity to the relationship between x and y. Be careful that you do not infer that there is no relationship between x and y.
  • 22. An Interesting Example Note (below) that there appears to be an almost perfect quadratic relationship between x and y when the scatterplot is drawn.
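The caution above can be illustrated with a small sketch. The data here (y = x² for x = -3, ..., 3) are hypothetical and chosen only for illustration; they are not the data set from the slides, and `pearson_r` is an illustrative helper name. The relationship is exactly quadratic, yet the Pearson correlation coefficient comes out at essentially zero because there is no linear trend.

```python
import math

def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient of paired data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

# Hypothetical data (not from the slides): an exact quadratic relationship.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r = pearson_r(xs, ys)
print(f"r = {r:.3f}")  # essentially zero despite the perfect curve
```

A scatterplot of the same points would show the parabola immediately, which is why plotting the data before interpreting r is worthwhile.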
  • 24. Example The line y = 7 + 3x has y intercept a = 7 and slope b = 3: when x increases by 1, y increases by 3. [Figure: graph of y = 7 + 3x]
  • 25. Example The line y = 17 - 4x has y intercept a = 17 and slope b = -4: when x increases by 1, y changes by -4. [Figure: graph of y = 17 - 4x]
  • 26. Least Squares Line The most widely used criterion for measuring the goodness of fit of a line y = a + bx to bivariate data $(x_1, y_1), (x_2, y_2), \dots, (x_n, y_n)$ is the sum of the squared deviations about the line: $\sum [y - (a + bx)]^2 = [y_1 - (a + bx_1)]^2 + \dots + [y_n - (a + bx_n)]^2$. The line that gives the best fit to the data is the one that minimizes this sum; it is called the least squares line or sample regression line.
  • 27. Coefficients a and b The slope of the least squares line is $b = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2}$ and the y intercept is $a = \bar{y} - b\bar{x}$. We write the equation of the least squares line as $\hat{y} = a + bx$, where the ^ above y emphasizes that $\hat{y}$ (read as y-hat) is a prediction of y resulting from the substitution of a particular x value into the equation.
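As a minimal sketch of the least squares calculation, the function below computes the slope as the sum of cross-deviations divided by the sum of squared x-deviations, and the intercept as the mean of y minus the slope times the mean of x. The function name `least_squares` is illustrative. The sample points are generated from the line y = 7 + 3x used in the earlier example slide, so the fit should recover a = 7 and b = 3.

```python
def least_squares(xs, ys):
    """Return (a, b) for the least squares line y-hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of cross-deviations and sum of squared x-deviations.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = s_xy / s_xx          # slope
    a = y_bar - b * x_bar    # y intercept
    return a, b

# Points lying exactly on y = 7 + 3x (illustrative x values).
xs = [0, 2, 4, 6, 8]
ys = [7 + 3 * x for x in xs]

a, b = least_squares(xs, ys)
print(f"y-hat = {a:.1f} + {b:.1f}x")
```

With real data the points will not lie exactly on a line, and the same function returns the line that minimizes the sum of squared vertical deviations.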
  • 30. Calculations From the previous slide, we have
  • 31. Minitab Graph The following graph is a copy of the output from a Minitab command to graph the regression line.
  • 33. Greyhound Example Revisited Using the calculation formula we have: Notice that we get the same result.
  • 39. The Greyhound example continued For the Greyhound example, it appears that the line systematically predicts fares that are too high for cities close to Rochester and too low for most cities between 200 and 500 miles away. [Figure: residual plot annotated "Predicted fares are too high" and "Predicted fares are too low"]
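  • 42. Calculational formulae SSTo and SSResid are generally found as part of the standard output from most statistical packages or can be obtained using the following computational formulas: $SSTo = \sum y^2 - \frac{(\sum y)^2}{n}$ and $SSResid = \sum y^2 - a\sum y - b\sum xy$, where a and b are the intercept and slope of the least squares line.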
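The sketch below checks, on a small hypothetical data set (not from the slides), that the standard textbook computational forms SSTo = Σy² - (Σy)²/n and SSResid = Σy² - aΣy - bΣxy agree with the definitional sums of squared deviations. The SSResid shortcut is valid only when a and b are the least squares coefficients, so the code fits the line first. Variable names are illustrative.

```python
# Hypothetical data, chosen so the arithmetic is easy to follow.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
n = len(xs)

# Least squares coefficients.
x_bar = sum(xs) / n
y_bar = sum(ys) / n
b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
    (x - x_bar) ** 2 for x in xs
)
a = y_bar - b * x_bar

# Definitional forms: squared deviations about the mean and about the line.
ss_to_def = sum((y - y_bar) ** 2 for y in ys)
ss_resid_def = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Computational forms built from raw sums.
sum_y = sum(ys)
sum_y2 = sum(y * y for y in ys)
sum_xy = sum(x * y for x, y in zip(xs, ys))
ss_to_comp = sum_y2 - sum_y ** 2 / n
ss_resid_comp = sum_y2 - a * sum_y - b * sum_xy

# Coefficient of determination: proportion of variation explained by the line.
r_squared = 1 - ss_resid_def / ss_to_def

print(ss_to_def, ss_to_comp)
print(ss_resid_def, ss_resid_comp)
print(r_squared)
```

The computational forms subtract large, nearly equal quantities, so with real data they can lose precision; the definitional forms are the safer choice in software.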
  • 45. Greyhound Example Revisited We can say that 93.5% of the variation in the Fare (y) can be attributed to the least squares linear relationship between distance (x) and fare.
  • 46. More on variability The standard deviation about the least squares line is denoted $s_e$ and given by $s_e = \sqrt{\frac{SSResid}{n - 2}}$. $s_e$ is interpreted as the “typical” amount by which an observation deviates from the least squares line.
  • 47. Greyhound Example Revisited The “typical” deviation of actual fare from the prediction is $6.80.
  • 48. Minitab output for Regression

      Regression Analysis: Standard Fare versus Distance

      The regression equation is
      Standard Fare = 10.1 + 0.150 Distance

      Predictor      Coef  SE Coef      T      P
      Constant     10.138    4.327   2.34  0.039
      Distance    0.15018  0.01196  12.56  0.000

      S = 6.803   R-Sq = 93.5%   R-Sq(adj) = 92.9%

      Analysis of Variance
      Source          DF      SS      MS       F      P
      Regression       1  7298.1  7298.1  157.68  0.000
      Residual Error  11   509.1    46.3
      Total           12  7807.2

      Slide annotations: a = 10.138 (Constant coefficient), b = 0.15018 (Distance coefficient), s_e = 6.803 (S), r² = 93.5% (R-Sq), SSResid = 509.1 (Residual Error SS), SSTo = 7807.2 (Total SS). The least squares regression line is Standard Fare = 10.1 + 0.150 Distance.
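As a quick check on how the summary statistics relate to the analysis of variance table, the sketch below recomputes r² and s_e from the sums of squares in the Minitab output above. It assumes the usual relationships r² = 1 - SSResid/SSTo and s_e = sqrt(SSResid/(n - 2)); the sample size n = 13 is inferred from the Total DF of 12. Variable names are illustrative.

```python
import math

# Values read from the Minitab analysis of variance table.
ss_to = 7807.2     # Total SS
ss_resid = 509.1   # Residual Error SS
n = 13             # Total DF = n - 1 = 12

r_squared = 1 - ss_resid / ss_to
s_e = math.sqrt(ss_resid / (n - 2))

print(f"r^2 = {r_squared:.1%}")   # compare with R-Sq = 93.5%
print(f"s_e = {s_e:.3f}")         # compare with S = 6.803
```

Both values agree with the reported R-Sq and S to the precision Minitab prints.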
  • 49. The Greyhound problem with additional data The sample of fares and mileages from Rochester was extended to cover a total of 20 cities throughout the country. The resulting data and a scatterplot are given on the next few slides.
  • 52. Extended Greyhound Fare Example Minitab reports the correlation coefficient r = 0.921, r² = 0.849, s_e = $17.42, and the regression line Standard Fare = 46.058 + 0.043535 Distance. Notice that even though the correlation coefficient is reasonably high and 84.9% of the variation in the fare is explained, the linear model is not very usable.
  • 54. Nonlinear Regression Example From the previous slide we can see that the plot does not look linear; it appears to have a curved shape. We sometimes replace one or both of the variables with a transformation of that variable and then perform a linear regression on the transformed variables. This can sometimes lead to a useful prediction equation. For this particular data set, the shape of the curve is almost logarithmic, so we might try replacing the distance with log10(distance), the base-10 logarithm of the distance.
  • 55. Nonlinear Regression Example Minitab provides the following output.

      Regression Analysis: Standard Fare versus Log10(Distance)

      The regression equation is
      Standard Fare = - 163 + 91.0 Log10(Distance)

      Predictor      Coef  SE Coef       T      P
      Constant    -163.25    10.59  -15.41  0.000
      Log10(Di     91.039    3.826   23.80  0.000

      S = 7.869   R-Sq = 96.9%   R-Sq(adj) = 96.7%

      Slide annotations: high r², with 96.9% of the variation attributed to the model; typical error = $7.87, which is reasonably good.
  • 56. Nonlinear Regression Example The rest of the Minitab output follows.

      Analysis of Variance
      Source          DF     SS     MS       F      P
      Regression       1  35068  35068  566.30  0.000
      Residual Error  18   1115     62
      Total           19  36183

      Unusual Observations
      Obs  Log10(Di  Standard     Fit  SE Fit  Residual  St Resid
       11      3.17    109.00  125.32    2.43    -16.32    -2.18R

      R denotes an observation with a large standardized residual.

      The only outlier is Orlando and, as you’ll see from the next two slides, it is not too bad.
  • 57. Nonlinear Regression Example Looking at the plot of the residuals against distance, we see some problems. The model overestimates fares for middle distances (1000 to 2000 miles) and underestimates fares for longer distances (more than 2000 miles).
  • 58. Nonlinear Regression Example Plotted on a graph with Standard Fare and log10(Distance) axes, the prediction curve looks reasonably linear.
  • 59. Nonlinear Regression Example Plotted on a graph with Standard Fare and Distance axes, the prediction curve appears to fit the data fairly well. By and large, this prediction model for the fares appears to work reasonably well.
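To make the transformed model concrete, the sketch below evaluates the two fitted equations reported by Minitab: the linear model on distance and the model on log10(distance). The function names are illustrative, and the distances in the loop are arbitrary values chosen for comparison, not cities from the data set. It also reproduces the fitted value for observation 11 from its reported Log10(Distance) of 3.17, and recomputes r² from the analysis of variance table.

```python
import math

def fare_linear_model(distance_miles):
    """Linear fit from the extended data: fare in dollars."""
    return 46.058 + 0.043535 * distance_miles

def fare_log_model(distance_miles):
    """Fit on log10(distance) from the Minitab output: fare in dollars."""
    return -163.25 + 91.039 * math.log10(distance_miles)

# Observation 11 has Log10(Distance) = 3.17 (rounded) and a reported fit of 125.32.
fit_obs_11 = -163.25 + 91.039 * 3.17

# r^2 from the analysis of variance table: 1 - SSResid / SSTo.
r_squared_log = 1 - 1115 / 36183

# Arbitrary distances (miles) to compare the two prediction equations.
for d in (250, 1000, 2500):
    print(d, round(fare_linear_model(d), 2), round(fare_log_model(d), 2))

print(round(fit_obs_11, 2), f"{r_squared_log:.1%}")
```

The small gap between the recomputed fit and Minitab's 125.32 comes from using the rounded value 3.17 rather than the unrounded logarithm.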