Describing Distributions with Numbers


INTRODUCTION TO STATISTICS & PROBABILITY
Chapter 1:

Looking at Data—Distributions (Part 2)
1.2 Describing Distributions with Numbers

Dr. Nahid Sultana

Objectives

• Measures of center: mean, median
• Measures of spread: quartiles, standard deviation
• Five-number summary and boxplot
• IQR and outliers
• Choosing among summary statistics
• Changing the unit of measurement
Measures of center: The Mean

• The most common measure of center is the arithmetic average, or mean, or sample mean.
• To calculate the average, or mean, add all values, then divide by the number of individuals.
• It is the “center of mass.”
• If the n observations are x1, x2, x3, …, xn, their mean is:

$$\bar{x} = \frac{\text{sum of observations}}{n} = \frac{x_1 + x_2 + \cdots + x_n}{n}$$

or, in more compact notation,

$$\bar{x} = \frac{1}{n} \sum x_i$$
Measures of center: The Mean (cont.)

Find the mean:
Here are the scores on the first exam in an introductory statistics course for 10 students:

80  73  92  85  75  98  93  55  80  90

Find the mean first-exam score for these students.
Solution:

$$\bar{x} = \frac{80 + 73 + 92 + 85 + 75 + 98 + 93 + 55 + 80 + 90}{10} = \frac{821}{10} = 82.1$$
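As a quick check of the arithmetic, the mean can be computed in a few lines of Python. This is only a minimal sketch; the function name `mean` and the variable `scores` are illustrative choices, not part of the slides.

```python
# First-exam scores for the 10 students in the example.
scores = [80, 73, 92, 85, 75, 98, 93, 55, 80, 90]

def mean(values):
    """Arithmetic mean: add all values, then divide by how many there are."""
    return sum(values) / len(values)

print(mean(scores))  # 82.1
```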
Measuring Center: The Median

• Another common measure of center is the median.
• The median M is the midpoint of a distribution, the number such that half of the observations are smaller and the other half are larger.

To find the median of a distribution:
1. Arrange all observations from smallest to largest.
2. If the number of observations n is odd, the median M is the center observation in the ordered list.
3. If the number of observations n is even, the median M is the average of the two center observations in the ordered list.
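Measuring Center: The Median (cont.)

Find the median:
Here are the scores on the first exam in an introductory statistics course for 10 students:

80  73  92  85  75  98  93  55  80  90

Find the median first-exam score for these students.
Solution: In order, the scores are 55, 73, 75, 80, 80, 85, 90, 92, 93, 98. Since n = 10 is even, the median is the average of the two center observations: M = (80 + 85)/2 = 82.5.

Note: The location of the median is (n + 1)/2 in the sorted list. Here (10 + 1)/2 = 5.5, that is, halfway between the 5th and 6th observations.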
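The three-step recipe for the median translates directly into code. The sketch below is one possible way to write it; the names `median` and `scores` are assumptions made for illustration.

```python
scores = [80, 73, 92, 85, 75, 98, 93, 55, 80, 90]

def median(values):
    """Median by the sort-and-pick-the-middle rule."""
    ordered = sorted(values)          # step 1: smallest to largest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                    # step 2: odd n, take the center value
        return ordered[mid]
    # step 3: even n, average the two center values
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median(scores))  # 82.5
```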
Comparing Mean and Median

• If the distribution is exactly symmetric, the mean and the median are the same.
• In a skewed distribution, the mean is usually farther out in the long tail than is the median.
• The median is a measure of center that is resistant to skew and outliers. The mean is not.
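To see resistance in action, the sketch below changes one exam score to an extreme value and compares the two measures. The altered value 980 is a hypothetical recording error invented for this illustration; it does not come from the slides. The functions used are from Python's standard `statistics` module.

```python
from statistics import mean, median

scores = [80, 73, 92, 85, 75, 98, 93, 55, 80, 90]

# Hypothetical data-entry error: the top score 98 is recorded as 980.
with_error = [980 if x == 98 else x for x in scores]

print(mean(scores), median(scores))          # 82.1 82.5
print(mean(with_error), median(with_error))  # 170.3 82.5
```

The single extreme value pulls the mean far upward, while the median does not move.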
Measuring Spread: The Quartiles

A measure of center alone can be misleading. A useful numerical description of a distribution requires both a measure of center and a measure of spread.

• We describe the spread or variability of a distribution by giving several percentiles.
• The median divides the data in two parts; half of the observations are above the median and half are below the median. We could call the median the 50th percentile.
• The lower quartile (first quartile, Q1) is the median of the lower half of the data; the upper quartile (third quartile, Q3) is the median of the upper half of the data.
• With the median, the quartiles divide the data into four equal parts; 25% of the data are in each part.
Measuring Spread: The Quartiles (cont.)

To calculate the quartiles:
1. Arrange the observations in increasing order and locate the median M.
2. The first quartile Q1 is the median of the lower half of the data, excluding M.
3. The third quartile Q3 is the median of the upper half of the data, excluding M.
Measuring Spread: The Quartiles (cont.)

Example: Here are the scores on the first exam in an introductory statistics course for 10 students:

80  73  92  85  75  98  93  55  80  90

Find the quartiles for these first-exam scores.
Solution: In order, the scores are:

55  73  75  80  80  85  90  92  93  98

The median is M = (80 + 85)/2 = 82.5.
Q1 = 75, the median of the first five numbers: 55, 73, 75, 80, 80.
Q3 = 92, the median of the last five numbers: 85, 90, 92, 93, 98.
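The median-of-halves rule used in this example can be sketched as follows. The function name `quartiles` is an illustrative assumption. Note that software packages often use other interpolation rules for quartiles, so library functions may return slightly different values than this hand method.

```python
from statistics import median

scores = [80, 73, 92, 85, 75, 98, 93, 55, 80, 90]

def quartiles(values):
    """Return (Q1, M, Q3) using the median-of-halves rule."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]              # observations below the median position
    upper = ordered[half + (n % 2):]    # skip the middle observation when n is odd
    return median(lower), median(ordered), median(upper)

print(quartiles(scores))  # (75, 82.5, 92)
```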
The Five-Number Summary

The five-number summary of a distribution consists of
• The smallest observation (Min)
• The first quartile (Q1)
• The median (M)
• The third quartile (Q3)
• The largest observation (Max)
written in order from smallest to largest:

Minimum   Q1   M   Q3   Maximum
Boxplots

A boxplot is a graph of the five-number summary.
• Draw a central box from Q1 to Q3.
• Draw a line inside the box to mark the median M.
• Extend lines from the box out to the minimum and maximum values that are not outliers.
Boxplots (cont.)

Example: Here are the scores on the first exam in an introductory statistics course for 10 students:

80  73  92  85  75  98  93  55  80  90

Make a boxplot for these first-exam scores.
Solution: In order, the scores are:
55, 73, 75, 80, 80, 85, 90, 92, 93, 98
Min = 55
Q1 = 75
M = 82.5
Q3 = 92
Max = 98
Comparing Boxplots to Histograms
Boxplots and skewed data

[Figure: Boxplots for a symmetric and a right-skewed distribution. Vertical axis: years until death (0 to 15). Groups: Disease X, Multiple Myeloma.]

Boxplots show symmetry or skew.
Suspected Outliers: 1.5 × IQR Rule

• Outliers are troublesome data points, and it is important to be able to identify them.
• The interquartile range IQR is the distance between the first and third quartiles:

IQR = Q3 − Q1

• IQR is used as part of a rule of thumb for identifying outliers.

The 1.5 × IQR Rule for Outliers
Call an observation an outlier if it falls more than 1.5 × IQR above the third quartile or below the first quartile.

• Suspected low outlier: any value < Q1 − 1.5 × IQR
• Suspected high outlier: any value > Q3 + 1.5 × IQR
Suspected Outliers: 1.5 × IQR Rule (cont.)

Individual #25 has a value of 7.9 years, which is 3.55 years above the third quartile. This is more than 1.5 × IQR = 3.225 years. Thus, individual #25 is a suspected outlier.
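The 1.5 × IQR rule can be sketched in code as below, applied to the first-exam scores with Q1 = 75 and Q3 = 92 from the quartile example. The function names are illustrative assumptions, and the extra score of 20 in the last line is a hypothetical value added only to show a flagged observation (the quartiles are held fixed for simplicity).

```python
def iqr_fences(q1, q3):
    """Return the (low, high) cutoffs of the 1.5 x IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def suspected_outliers(values, q1, q3):
    """List the values that fall outside the 1.5 x IQR cutoffs."""
    low, high = iqr_fences(q1, q3)
    return [x for x in values if x < low or x > high]

scores = [80, 73, 92, 85, 75, 98, 93, 55, 80, 90]

print(iqr_fences(75, 92))                         # (49.5, 117.5)
print(suspected_outliers(scores, 75, 92))         # []
print(suspected_outliers(scores + [20], 75, 92))  # [20]
```

For the exam scores, IQR = 92 − 75 = 17, so the cutoffs are 49.5 and 117.5 and no score is a suspected outlier; even the lowest score, 55, lies inside the lower cutoff.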
Suspected Outliers: 1.5 × IQR Rule (cont.)

• Modified boxplots plot suspected outliers individually.
• The 8 largest call lengths are
438, 465, 479, 700, 700, 951, 1148, 2631
• They are plotted as individual points, though 2 of them are identical and so do not appear separately.
Measuring Spread: The Standard Deviation

The most common measure of spread looks at how far each observation is from the mean. This measure is called the standard deviation.

• The standard deviation s measures the average distance of the observations from their mean.
• It is calculated by

$$s = \sqrt{\frac{1}{n-1} \sum (x_i - \bar{x})^2}$$

• The average squared distance, $s^2 = \frac{1}{n-1} \sum (x_i - \bar{x})^2$, is called the variance.
Calculating The Standard Deviation

1. Calculate the mean.
2. Calculate each deviation: deviation = observation − mean.
3. Square each deviation.
4. Calculate the sum of the squared deviations.
5. Divide by the degrees of freedom, df = n − 1. The result is the variance.
6. Calculate the square root of the variance. This is the standard deviation.

xi        xi − mean       (xi − mean)^2
1         1 − 5 = −4      (−4)^2 = 16
3         3 − 5 = −2      (−2)^2 = 4
4         4 − 5 = −1      (−1)^2 = 1
4         4 − 5 = −1      (−1)^2 = 1
4         4 − 5 = −1      (−1)^2 = 1
5         5 − 5 = 0       (0)^2 = 0
7         7 − 5 = 2       (2)^2 = 4
8         8 − 5 = 3       (3)^2 = 9
9         9 − 5 = 4       (4)^2 = 16
Mean = 5  Sum = 0         Sum = 52

Variance = 52/(9 − 1) = 6.5
Standard deviation = √6.5 = 2.55
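The six steps can be followed line by line in a short Python sketch using the nine observations from the table. The function names `sample_variance` and `sample_std` are illustrative assumptions.

```python
import math

data = [1, 3, 4, 4, 4, 5, 7, 8, 9]

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    m = sum(values) / n                          # step 1: mean
    squared = [(x - m) ** 2 for x in values]     # steps 2-3: squared deviations
    return sum(squared) / (n - 1)                # steps 4-5: sum, divide by df

def sample_std(values):
    """Standard deviation: square root of the variance (step 6)."""
    return math.sqrt(sample_variance(values))

print(sample_variance(data))        # 6.5
print(round(sample_std(data), 2))   # 2.55
```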
Properties of The Standard Deviation

• s measures spread about the mean and should be used only when the mean is the measure of center.
• s = 0 only when all observations have the same value and there is no spread. Otherwise, s > 0.
• s is not resistant to outliers.
• s has the same units of measurement as the original observations.
Choosing Measures of Center and Spread

We now have a choice between two descriptions of center and spread:
• Mean and Standard Deviation
• Median and Interquartile Range

• The median and IQR are usually better than the mean and standard deviation for describing a skewed distribution or a distribution with outliers.
• Use mean and standard deviation only for reasonably symmetric distributions that don’t have outliers.

NOTE: Numerical summaries do not fully describe the shape of a distribution. ALWAYS PLOT YOUR DATA FIRST!
Changing the Unit of Measurement

• Variables can be recorded in different units of measurement.
• Most often, one measurement unit is a linear transformation of another measurement unit: xnew = a + bx.

Example 1: If a distance x is measured in kilometers, the same distance in miles is xnew = 0.62x.
This transformation changes the units without changing the origin: a distance of 0 kilometers is the same as a distance of 0 miles.

Example 2: Temperatures can be expressed in degrees Fahrenheit or degrees Celsius. If x is a temperature in degrees Fahrenheit, the same temperature in degrees Celsius is xnew = (5/9)(x − 32) = −160/9 + (5/9)x.
This transformation changes both the unit size and the origin of the measurements: the origin in the Celsius scale (0°C, the temperature at which water freezes) is 32° in the Fahrenheit scale.
Changing the Unit of Measurement (cont.)

• Linear transformations do not change the basic shape of a distribution (skew, symmetry).
• But they do change the measures of center and spread:
  - Multiplying each observation by a positive number b multiplies both measures of center (mean, median) and spread (IQR, s) by b.
  - Adding the same number a (positive or negative) to each observation adds a to measures of center and to quartiles but it does not change measures of spread (IQR, s).
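These two effects can be checked numerically. The sketch below applies xnew = a + bx to the first-exam scores with a = 10 and b = 2, which are hypothetical values chosen only for illustration, and compares the summaries before and after using Python's standard `statistics` module. Only the mean, median, and s are shown; the quartiles and IQR behave in the same way.

```python
import math
from statistics import mean, median, stdev

scores = [80, 73, 92, 85, 75, 98, 93, 55, 80, 90]

def transform(values, a, b):
    """Apply the linear transformation xnew = a + b*x to every observation."""
    return [a + b * x for x in values]

a, b = 10, 2  # hypothetical shift and scale factors
new_scores = transform(scores, a, b)

# Center: multiplied by b, then shifted by a.
print(math.isclose(mean(new_scores), a + b * mean(scores)))      # True
print(math.isclose(median(new_scores), a + b * median(scores)))  # True

# Spread: multiplied by b only; adding a has no effect.
print(math.isclose(stdev(new_scores), b * stdev(scores)))        # True
```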
1 de 25

Recomendados

Review & Hypothesis Testing por
Review & Hypothesis TestingReview & Hypothesis Testing
Review & Hypothesis TestingSr Edith Bogue
1.2K vistas40 diapositivas
Measure of dispersion por
Measure of dispersionMeasure of dispersion
Measure of dispersionAnil Pokhrel
235 vistas23 diapositivas
Chap07 interval estimation por
Chap07 interval estimationChap07 interval estimation
Chap07 interval estimationUni Azza Aunillah
9.3K vistas62 diapositivas
Estimation and confidence interval por
Estimation and confidence intervalEstimation and confidence interval
Estimation and confidence intervalHomework Guru
6.8K vistas65 diapositivas
Confidence Interval Estimation por
Confidence Interval EstimationConfidence Interval Estimation
Confidence Interval EstimationYesica Adicondro
2.7K vistas62 diapositivas
Measures of Variation or Dispersion por
Measures of Variation or Dispersion Measures of Variation or Dispersion
Measures of Variation or Dispersion Dr Athar Khan
796 vistas32 diapositivas

Más contenido relacionado

La actualidad más candente

Confidence interval & probability statements por
Confidence interval & probability statements Confidence interval & probability statements
Confidence interval & probability statements DrZahid Khan
3K vistas16 diapositivas
One Way ANOVA and Two Way ANOVA using R por
One Way ANOVA and Two Way ANOVA using ROne Way ANOVA and Two Way ANOVA using R
One Way ANOVA and Two Way ANOVA using RSean Stovall
1.4K vistas37 diapositivas
Inter quartile range por
Inter quartile rangeInter quartile range
Inter quartile rangeKen Plummer
5.1K vistas27 diapositivas
Statistics-Measures of dispersions por
Statistics-Measures of dispersionsStatistics-Measures of dispersions
Statistics-Measures of dispersionsCapricorn
10.8K vistas46 diapositivas
Statistics lecture 8 (chapter 7) por
Statistics lecture 8 (chapter 7)Statistics lecture 8 (chapter 7)
Statistics lecture 8 (chapter 7)jillmitchell8778
8.1K vistas75 diapositivas
T distribution | Statistics por
T distribution | StatisticsT distribution | Statistics
T distribution | StatisticsTransweb Global Inc
4.7K vistas8 diapositivas

La actualidad más candente(20)

Confidence interval & probability statements por DrZahid Khan
Confidence interval & probability statements Confidence interval & probability statements
Confidence interval & probability statements
DrZahid Khan3K vistas
One Way ANOVA and Two Way ANOVA using R por Sean Stovall
One Way ANOVA and Two Way ANOVA using ROne Way ANOVA and Two Way ANOVA using R
One Way ANOVA and Two Way ANOVA using R
Sean Stovall1.4K vistas
Inter quartile range por Ken Plummer
Inter quartile rangeInter quartile range
Inter quartile range
Ken Plummer5.1K vistas
Statistics-Measures of dispersions por Capricorn
Statistics-Measures of dispersionsStatistics-Measures of dispersions
Statistics-Measures of dispersions
Capricorn 10.8K vistas
Statistics lecture 8 (chapter 7) por jillmitchell8778
Statistics lecture 8 (chapter 7)Statistics lecture 8 (chapter 7)
Statistics lecture 8 (chapter 7)
jillmitchell87788.1K vistas
Measure of Central Tendency (Mean, Median, Mode and Quantiles) por Salman Khan
Measure of Central Tendency (Mean, Median, Mode and Quantiles)Measure of Central Tendency (Mean, Median, Mode and Quantiles)
Measure of Central Tendency (Mean, Median, Mode and Quantiles)
Salman Khan6.2K vistas
Descriptive statistics por Mmedsc Hahm
Descriptive statisticsDescriptive statistics
Descriptive statistics
Mmedsc Hahm301 vistas
Chapter 3 Confidence Interval por ghalan
Chapter 3 Confidence IntervalChapter 3 Confidence Interval
Chapter 3 Confidence Interval
ghalan7.5K vistas
Statistical Estimation and Testing Lecture Notes.pdf por Dr. Tushar J Bhatt
Statistical Estimation and Testing Lecture Notes.pdfStatistical Estimation and Testing Lecture Notes.pdf
Statistical Estimation and Testing Lecture Notes.pdf
Dr. Tushar J Bhatt755 vistas
Ppt for 1.1 introduction to statistical inference por vasu Chemistry
Ppt for 1.1 introduction to statistical inferencePpt for 1.1 introduction to statistical inference
Ppt for 1.1 introduction to statistical inference
vasu Chemistry1.1K vistas
Mean Deviation por Carlo Luna
Mean DeviationMean Deviation
Mean Deviation
Carlo Luna10.7K vistas
Estimation and hypothesis por Junaid Ijaz
Estimation and hypothesisEstimation and hypothesis
Estimation and hypothesis
Junaid Ijaz8.6K vistas

Destacado

FEC 512.04 por
FEC 512.04FEC 512.04
FEC 512.04Orhan Erdem
6.2K vistas54 diapositivas
F test Analysis of Variance (ANOVA) por
F test Analysis of Variance (ANOVA)F test Analysis of Variance (ANOVA)
F test Analysis of Variance (ANOVA)Marianne Maluyo
10.6K vistas55 diapositivas
Chapter 6 part2-Introduction to Inference-Tests of Significance, Stating Hyp... por
Chapter 6 part2-Introduction to Inference-Tests of Significance,  Stating Hyp...Chapter 6 part2-Introduction to Inference-Tests of Significance,  Stating Hyp...
Chapter 6 part2-Introduction to Inference-Tests of Significance, Stating Hyp...nszakir
9.7K vistas19 diapositivas
Estimation por
EstimationEstimation
Estimationrishi.indian
5.4K vistas24 diapositivas
Chapter 6 part1- Introduction to Inference-Estimating with Confidence (Introd... por
Chapter 6 part1- Introduction to Inference-Estimating with Confidence (Introd...Chapter 6 part1- Introduction to Inference-Estimating with Confidence (Introd...
Chapter 6 part1- Introduction to Inference-Estimating with Confidence (Introd...nszakir
6.4K vistas19 diapositivas
Anova (f test) and mean differentiation por
Anova (f test) and mean differentiationAnova (f test) and mean differentiation
Anova (f test) and mean differentiationSubramani Parasuraman
8.5K vistas27 diapositivas

Destacado(12)

F test Analysis of Variance (ANOVA) por Marianne Maluyo
F test Analysis of Variance (ANOVA)F test Analysis of Variance (ANOVA)
F test Analysis of Variance (ANOVA)
Marianne Maluyo10.6K vistas
Chapter 6 part2-Introduction to Inference-Tests of Significance, Stating Hyp... por nszakir
Chapter 6 part2-Introduction to Inference-Tests of Significance,  Stating Hyp...Chapter 6 part2-Introduction to Inference-Tests of Significance,  Stating Hyp...
Chapter 6 part2-Introduction to Inference-Tests of Significance, Stating Hyp...
nszakir9.7K vistas
Chapter 6 part1- Introduction to Inference-Estimating with Confidence (Introd... por nszakir
Chapter 6 part1- Introduction to Inference-Estimating with Confidence (Introd...Chapter 6 part1- Introduction to Inference-Estimating with Confidence (Introd...
Chapter 6 part1- Introduction to Inference-Estimating with Confidence (Introd...
nszakir6.4K vistas
Chapter 7 : Inference for Distributions(The t Distributions, One-Sample t Con... por nszakir
Chapter 7 : Inference for Distributions(The t Distributions, One-Sample t Con...Chapter 7 : Inference for Distributions(The t Distributions, One-Sample t Con...
Chapter 7 : Inference for Distributions(The t Distributions, One-Sample t Con...
nszakir6.1K vistas
Theory of estimation por Tech_MX
Theory of estimationTheory of estimation
Theory of estimation
Tech_MX45.1K vistas
Hypothesis Testing por Harish Lunani
Hypothesis TestingHypothesis Testing
Hypothesis Testing
Harish Lunani216.7K vistas
Hypothesis testing; z test, t-test. f-test por Shakehand with Life
Hypothesis testing; z test, t-test. f-testHypothesis testing; z test, t-test. f-test
Hypothesis testing; z test, t-test. f-test
Shakehand with Life191.3K vistas
Chi square test por Patel Parth
Chi square testChi square test
Chi square test
Patel Parth376.8K vistas

Similar a Describing Distributions with Numbers

3. Descriptive statistics.pdf por
3. Descriptive statistics.pdf3. Descriptive statistics.pdf
3. Descriptive statistics.pdfYomifDeksisaHerpa
3 vistas55 diapositivas
local_media4419196206087945469 (1).pptx por
local_media4419196206087945469 (1).pptxlocal_media4419196206087945469 (1).pptx
local_media4419196206087945469 (1).pptxJayArRodriguez2
4 vistas67 diapositivas
Measures of Dispersion.pptx por
Measures of Dispersion.pptxMeasures of Dispersion.pptx
Measures of Dispersion.pptxVanmala Buchke
42 vistas51 diapositivas
Penggambaran Data Secara Numerik por
Penggambaran Data Secara NumerikPenggambaran Data Secara Numerik
Penggambaran Data Secara Numerikanom1392
638 vistas48 diapositivas
Empirics of standard deviation por
Empirics of standard deviationEmpirics of standard deviation
Empirics of standard deviationAdebanji Ayeni
449 vistas7 diapositivas
Describing quantitative data with numbers por
Describing quantitative data with numbersDescribing quantitative data with numbers
Describing quantitative data with numbersUlster BOCES
1.7K vistas34 diapositivas

Similar a Describing Distributions with Numbers(20)

local_media4419196206087945469 (1).pptx por JayArRodriguez2
local_media4419196206087945469 (1).pptxlocal_media4419196206087945469 (1).pptx
local_media4419196206087945469 (1).pptx
JayArRodriguez24 vistas
Penggambaran Data Secara Numerik por anom1392
Penggambaran Data Secara NumerikPenggambaran Data Secara Numerik
Penggambaran Data Secara Numerik
anom1392638 vistas
Empirics of standard deviation por Adebanji Ayeni
Empirics of standard deviationEmpirics of standard deviation
Empirics of standard deviation
Adebanji Ayeni449 vistas
Describing quantitative data with numbers por Ulster BOCES
Describing quantitative data with numbersDescribing quantitative data with numbers
Describing quantitative data with numbers
Ulster BOCES1.7K vistas
ap_stat_1.3.ppt por fghgjd
ap_stat_1.3.pptap_stat_1.3.ppt
ap_stat_1.3.ppt
fghgjd14 vistas
Measures of Central Tendency, Variability and Shapes por ScholarsPoint1
Measures of Central Tendency, Variability and ShapesMeasures of Central Tendency, Variability and Shapes
Measures of Central Tendency, Variability and Shapes
ScholarsPoint1368 vistas
Measures of dispersion por DrZahid Khan
Measures of dispersionMeasures of dispersion
Measures of dispersion
DrZahid Khan35.4K vistas
Measure of dispersion by Neeraj Bhandari ( Surkhet.Nepal ) por Neeraj Bhandari
Measure of dispersion by Neeraj Bhandari ( Surkhet.Nepal )Measure of dispersion by Neeraj Bhandari ( Surkhet.Nepal )
Measure of dispersion by Neeraj Bhandari ( Surkhet.Nepal )
Neeraj Bhandari4.1K vistas
Module 3 statistics por dionesioable
Module 3   statisticsModule 3   statistics
Module 3 statistics
dionesioable14.5K vistas
Kwoledge of calculation of mean,median and mode por Aarti Vijaykumar
Kwoledge of calculation of mean,median and modeKwoledge of calculation of mean,median and mode
Kwoledge of calculation of mean,median and mode
Aarti Vijaykumar6K vistas
measure of dispersion por som allul
measure of dispersion measure of dispersion
measure of dispersion
som allul3.8K vistas
Mba i qt unit-2.1_measures of variations por Rai University
Mba i qt unit-2.1_measures of variationsMba i qt unit-2.1_measures of variations
Mba i qt unit-2.1_measures of variations
Rai University6K vistas
Central tendency _dispersion por Kirti Gupta
Central tendency _dispersionCentral tendency _dispersion
Central tendency _dispersion
Kirti Gupta483 vistas
3Measurements of health and disease_MCTD.pdf por AmanuelDina
3Measurements of health and disease_MCTD.pdf3Measurements of health and disease_MCTD.pdf
3Measurements of health and disease_MCTD.pdf
AmanuelDina9 vistas

Más de nszakir

Chapter-4: More on Direct Proof and Proof by Contrapositive por
Chapter-4: More on Direct Proof and Proof by ContrapositiveChapter-4: More on Direct Proof and Proof by Contrapositive
Chapter-4: More on Direct Proof and Proof by Contrapositivenszakir
4.8K vistas26 diapositivas
Chapter-3: DIRECT PROOF AND PROOF BY CONTRAPOSITIVE por
Chapter-3: DIRECT PROOF AND PROOF BY CONTRAPOSITIVEChapter-3: DIRECT PROOF AND PROOF BY CONTRAPOSITIVE
Chapter-3: DIRECT PROOF AND PROOF BY CONTRAPOSITIVEnszakir
14.1K vistas28 diapositivas
Chapter 2: Relations por
Chapter 2: RelationsChapter 2: Relations
Chapter 2: Relationsnszakir
24K vistas25 diapositivas
Chapter 5 part2- Sampling Distributions for Counts and Proportions (Binomial ... por
Chapter 5 part2- Sampling Distributions for Counts and Proportions (Binomial ...Chapter 5 part2- Sampling Distributions for Counts and Proportions (Binomial ...
Chapter 5 part2- Sampling Distributions for Counts and Proportions (Binomial ...nszakir
4.6K vistas14 diapositivas
Chapter 5 part1- The Sampling Distribution of a Sample Mean por
Chapter 5 part1- The Sampling Distribution of a Sample MeanChapter 5 part1- The Sampling Distribution of a Sample Mean
Chapter 5 part1- The Sampling Distribution of a Sample Meannszakir
9.9K vistas12 diapositivas
Chapter 4 part4- General Probability Rules por
Chapter 4 part4- General Probability RulesChapter 4 part4- General Probability Rules
Chapter 4 part4- General Probability Rulesnszakir
6.9K vistas14 diapositivas

Más de nszakir(17)

Chapter-4: More on Direct Proof and Proof by Contrapositive por nszakir
Chapter-4: More on Direct Proof and Proof by ContrapositiveChapter-4: More on Direct Proof and Proof by Contrapositive
Chapter-4: More on Direct Proof and Proof by Contrapositive
nszakir4.8K vistas
Chapter-3: DIRECT PROOF AND PROOF BY CONTRAPOSITIVE por nszakir
Chapter-3: DIRECT PROOF AND PROOF BY CONTRAPOSITIVEChapter-3: DIRECT PROOF AND PROOF BY CONTRAPOSITIVE
Chapter-3: DIRECT PROOF AND PROOF BY CONTRAPOSITIVE
nszakir14.1K vistas
Chapter 2: Relations por nszakir
Chapter 2: RelationsChapter 2: Relations
Chapter 2: Relations
nszakir24K vistas
Chapter 5 part2- Sampling Distributions for Counts and Proportions (Binomial ... por nszakir
Chapter 5 part2- Sampling Distributions for Counts and Proportions (Binomial ...Chapter 5 part2- Sampling Distributions for Counts and Proportions (Binomial ...
Chapter 5 part2- Sampling Distributions for Counts and Proportions (Binomial ...
nszakir4.6K vistas
Chapter 5 part1- The Sampling Distribution of a Sample Mean por nszakir
Chapter 5 part1- The Sampling Distribution of a Sample MeanChapter 5 part1- The Sampling Distribution of a Sample Mean
Chapter 5 part1- The Sampling Distribution of a Sample Mean
nszakir9.9K vistas
Chapter 4 part4- General Probability Rules por nszakir
Chapter 4 part4- General Probability RulesChapter 4 part4- General Probability Rules
Chapter 4 part4- General Probability Rules
nszakir6.9K vistas
Chapter 4 part3- Means and Variances of Random Variables por nszakir
Chapter 4 part3- Means and Variances of Random VariablesChapter 4 part3- Means and Variances of Random Variables
Chapter 4 part3- Means and Variances of Random Variables
nszakir9.1K vistas
Chapter 4 part2- Random Variables por nszakir
Chapter 4 part2- Random VariablesChapter 4 part2- Random Variables
Chapter 4 part2- Random Variables
nszakir6.8K vistas
Chapter 4 part1-Probability Model por nszakir
Chapter 4 part1-Probability ModelChapter 4 part1-Probability Model
Chapter 4 part1-Probability Model
nszakir4K vistas
Chapter 3 part3-Toward Statistical Inference por nszakir
Chapter 3 part3-Toward Statistical InferenceChapter 3 part3-Toward Statistical Inference
Chapter 3 part3-Toward Statistical Inference
nszakir2.3K vistas
Chapter 3 part2- Sampling Design por nszakir
Chapter 3 part2- Sampling DesignChapter 3 part2- Sampling Design
Chapter 3 part2- Sampling Design
nszakir1.8K vistas
Chapter 3 part1-Design of Experiments por nszakir
Chapter 3 part1-Design of ExperimentsChapter 3 part1-Design of Experiments
Chapter 3 part1-Design of Experiments
nszakir3K vistas
Chapter 2 part2-Correlation por nszakir
Chapter 2 part2-CorrelationChapter 2 part2-Correlation
Chapter 2 part2-Correlation
nszakir1.1K vistas
Chapter 2 part1-Scatterplots por nszakir
Chapter 2 part1-ScatterplotsChapter 2 part1-Scatterplots
Chapter 2 part1-Scatterplots
nszakir1.1K vistas
Chapter 2 part3-Least-Squares Regression por nszakir
Chapter 2 part3-Least-Squares RegressionChapter 2 part3-Least-Squares Regression
Chapter 2 part3-Least-Squares Regression
nszakir5.4K vistas
Density Curves and Normal Distributions por nszakir
Density Curves and Normal DistributionsDensity Curves and Normal Distributions
Density Curves and Normal Distributions
nszakir3K vistas
Displaying Distributions with Graphs por nszakir
Displaying Distributions with GraphsDisplaying Distributions with Graphs
Displaying Distributions with Graphs
nszakir2.4K vistas

Último

Classification of crude drugs.pptx por
Classification of crude drugs.pptxClassification of crude drugs.pptx
Classification of crude drugs.pptxGayatriPatra14
77 vistas13 diapositivas
Computer Introduction-Lecture06 por
Computer Introduction-Lecture06Computer Introduction-Lecture06
Computer Introduction-Lecture06Dr. Mazin Mohamed alkathiri
71 vistas12 diapositivas
Material del tarjetero LEES Travesías.docx por
Material del tarjetero LEES Travesías.docxMaterial del tarjetero LEES Travesías.docx
Material del tarjetero LEES Travesías.docxNorberto Millán Muñoz
68 vistas9 diapositivas
The basics - information, data, technology and systems.pdf por
The basics - information, data, technology and systems.pdfThe basics - information, data, technology and systems.pdf
The basics - information, data, technology and systems.pdfJonathanCovena1
88 vistas1 diapositiva
ACTIVITY BOOK key water sports.pptx por
ACTIVITY BOOK key water sports.pptxACTIVITY BOOK key water sports.pptx
ACTIVITY BOOK key water sports.pptxMar Caston Palacio
430 vistas4 diapositivas
Psychology KS5 por
Psychology KS5Psychology KS5
Psychology KS5WestHatch
77 vistas5 diapositivas

Último(20)

Classification of crude drugs.pptx por GayatriPatra14
Classification of crude drugs.pptxClassification of crude drugs.pptx
Classification of crude drugs.pptx
GayatriPatra1477 vistas
The basics - information, data, technology and systems.pdf por JonathanCovena1
The basics - information, data, technology and systems.pdfThe basics - information, data, technology and systems.pdf
The basics - information, data, technology and systems.pdf
JonathanCovena188 vistas
Psychology KS5 por WestHatch
Psychology KS5Psychology KS5
Psychology KS5
WestHatch77 vistas
11.28.23 Social Capital and Social Exclusion.pptx por mary850239
11.28.23 Social Capital and Social Exclusion.pptx11.28.23 Social Capital and Social Exclusion.pptx
11.28.23 Social Capital and Social Exclusion.pptx
mary850239281 vistas
Class 10 English lesson plans por TARIQ KHAN
Class 10 English  lesson plansClass 10 English  lesson plans
Class 10 English lesson plans
TARIQ KHAN257 vistas
ISO/IEC 27001 and ISO/IEC 27005: Managing AI Risks Effectively por PECB
ISO/IEC 27001 and ISO/IEC 27005: Managing AI Risks EffectivelyISO/IEC 27001 and ISO/IEC 27005: Managing AI Risks Effectively
ISO/IEC 27001 and ISO/IEC 27005: Managing AI Risks Effectively
PECB 545 vistas
AI Tools for Business and Startups por Svetlin Nakov
AI Tools for Business and StartupsAI Tools for Business and Startups
AI Tools for Business and Startups
Svetlin Nakov101 vistas
Psychology KS4 por WestHatch
Psychology KS4Psychology KS4
Psychology KS4
WestHatch68 vistas
OEB 2023 Co-learning To Speed Up AI Implementation in Courses.pptx por Inge de Waard
OEB 2023 Co-learning To Speed Up AI Implementation in Courses.pptxOEB 2023 Co-learning To Speed Up AI Implementation in Courses.pptx
OEB 2023 Co-learning To Speed Up AI Implementation in Courses.pptx
Inge de Waard167 vistas
Ch. 7 Political Participation and Elections.pptx por Rommel Regala
Ch. 7 Political Participation and Elections.pptxCh. 7 Political Participation and Elections.pptx
Ch. 7 Political Participation and Elections.pptx
Rommel Regala72 vistas

Describing Distributions with Numbers

  • 1. 1 INTRODUCTION TO STATISTICS & PROBABILITY Chapter 1: Looking at Data—Distributions (Part 2) 1.2 Describing Distributions with Numbers Dr. Nahid Sultana
  • 2. 1.2 Describing Distributions with Numbers 2 Objectives  Measures of center: mean, median  Measures of spread: quartiles, standard deviation  Five-number summary and boxplot  IQR and outliers  Choosing among summary statistics  Changing the unit of measurement
  • 3. Measures of center: The Mean 3  The most common measure of center is the arithmetic average, or mean, or sample mean.  To calculate the average, or mean, add all values, then divide by the number of individuals.  It is the “center of mass.”  If the n observations are x1, x2, x3, …, xn, their mean is: sum of observations x1  x2  ...  xn x  n n 1 or in more compact notation, x  n  xi
  • 4. Measures of center: The Mean (cont…) 4 Find the mean: Here are the scores on the first exam in an introductory statistics course for 10 students: 80 73 92 85 75 98 93 55 Find the mean first-exam score for these students. Solution: 80 90
  • 5. Measuring Center: The Median 5  Another common measure of center is the median.  The median M is the midpoint of a distribution, the number such that half of the observations are smaller and the other half are larger. To find the median of a distribution: 1. Arrange all observations from smallest to largest. 2. If the number of observations n is odd, the median M is the center observation in the ordered list. 3. If the number of observations n is even, the median M is the average of the two center observations in the ordered list.
  • 6. Measuring Center: The Median (cont...) 6 Find the median: Here are the scores on the first exam in an introductory statistics course for 10 students: 80 73 92 85 75 98 93 55 80 Find the median first-exam score for these students. Solution: 90 Note: The location of the median is (n + 1)/2 in the sorted list.
  • 8. Comparing Mean and Median (Cont...) 8  The mean and the median are the same only if the distribution is symmetrical.  In a skewed distribution, the mean is usually farther out in the long tail than is the median.  The median is a measure of center that is resistant to skew and outliers. The mean is not.
  • 9. Measuring Spread: The Quartiles 9 A measure of center alone can be misleading. A useful numerical description of a distribution requires both a measure of center and a measure of spread.  We describe the spread or variability of a distribution by giving several percentiles.  The median divides the data in two parts; half of the observations are above the median and half are below the median. We could call the median the 50th percentile.  The lower quartile (first quartile, Q1)is the median of the lower half of the data; the upper quartile (third quartile, Q3) is the median of the upper half of the data.  With the median, the quartiles divide the data into four equal parts; 25% of the data are in each part
  • 10. Measuring Spread: The Quartiles (Cont.) Calculate the quartiles and inter-quartile: 10 1. Arrange the observations in increasing order and locate the median M. 2. The first quartile Q1 is the median of the lower half of the data, excluding M. 3. The third quartile Q3 is it is the median of the upper half of the data, excluding M.
  • 11. Measuring Spread: The Quartiles (Cont.) 11 Example: Here are the scores on the first-exam in an introductory statistics course for 10 students: 80 73 92 85 75 98 93 55 80 90 Find the quartiles for these first-exam scores. Solution: In order, the scores are: 55 73 75 80 80 85 90 92 93 98 The median is, Q1 = 75, the median of the first five numbers: 55, 73, 75, 80, 80. Q3 = 92, the median of the last five numbers: 85, 90, 92, 93, 98.
  • 12. The Five-Number Summary
      The five-number summary of a distribution consists of
      - The smallest observation (Min)
      - The first quartile (Q1)
      - The median (M)
      - The third quartile (Q3)
      - The largest observation (Max)
      written in order from smallest to largest: Minimum, Q1, M, Q3, Maximum.
  • 13. Boxplots
      A boxplot is a graph of the five-number summary.
      - Draw a central box from Q1 to Q3.
      - Draw a line inside the box to mark the median M.
      - Extend lines from the box out to the minimum and maximum values that are not outliers.
  • 14. Boxplots (Cont…)
      Example: Here are the scores on the first exam in an introductory statistics course for 10 students: 80 73 92 85 75 98 93 55 80 90. Make a boxplot for these first-exam scores.
      Solution: In order, the scores are: 55, 73, 75, 80, 80, 85, 90, 92, 93, 98.
      Min = 55, Q1 = 75, M = 82.5, Q3 = 92, Max = 98
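The five numbers a boxplot is drawn from can be computed directly. The sketch below is an illustration only; `five_number_summary` is a name chosen here, and it relies on the standard-library `statistics.median` together with the median-of-halves rule for the quartiles.

```python
from statistics import median


def five_number_summary(values):
    """Return (Min, Q1, M, Q3, Max); requires at least two observations."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    q1 = median(ordered[:half])        # median of the lower half
    q3 = median(ordered[n - half:])    # median of the upper half
    return ordered[0], q1, median(ordered), q3, ordered[-1]


scores = [80, 73, 92, 85, 75, 98, 93, 55, 80, 90]
print(five_number_summary(scores))  # (55, 75, 82.5, 92, 98)
```

The box spans 75 to 92, the line inside it sits at 82.5, and the whiskers reach 55 and 98 provided neither is flagged as an outlier.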
  • 15. Comparing Boxplots to Histograms
  • 16. Boxplots and skewed data
      Boxplots show symmetry or skew.
      [Figure: boxplots for a symmetric and a right-skewed distribution; panels labeled Disease X and Multiple Myeloma; vertical axis: years until death, 0 to 15.]
  • 17. Suspected Outliers: 1.5 × IQR Rule
      - Outliers are troublesome data points, and it is important to be able to identify them.
      - The interquartile range IQR is the distance between the first and third quartiles: IQR = Q3 − Q1.
      - IQR is used as part of a rule of thumb for identifying outliers.
      The 1.5 × IQR Rule for Outliers: Call an observation an outlier if it falls more than 1.5 × IQR above the third quartile or below the first quartile.
      - Suspected low outlier: any value < Q1 − 1.5 × IQR
      - Suspected high outlier: any value > Q3 + 1.5 × IQR
  • 18. Suspected Outliers: 1.5 × IQR Rule (Cont..)
      Individual #25 has a value of 7.9 years, which is 3.55 years above the third quartile. This is more than 1.5 × IQR = 3.225 years. Thus, individual #25 is a suspected outlier.
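The 1.5 × IQR rule reduces to two cutoffs ("fences"). In the sketch below, the function names are illustrative, and the quartile values are not stated on the slide: they are back-calculated here from its arithmetic (Q3 = 7.9 − 3.55 = 4.35, IQR = 3.225 / 1.5 = 2.15, so Q1 = 2.2) and should be read as derived assumptions.

```python
def outlier_fences(q1, q3):
    """Return (low, high) cutoffs of the 1.5 x IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def is_suspected_outlier(x, q1, q3):
    """True if x lies below the low fence or above the high fence."""
    low, high = outlier_fences(q1, q3)
    return x < low or x > high


# Quartiles back-calculated from the slide's numbers (assumption, see lead-in).
q1, q3 = 2.2, 4.35

print(is_suspected_outlier(7.9, q1, q3))  # True: 7.9 is above Q3 + 1.5 * IQR (about 7.575)
print(is_suspected_outlier(4.0, q1, q3))  # False: 4.0 lies inside the fences
```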
  • 19. Suspected Outliers: 1.5 × IQR Rule (Cont..)
      - Modified boxplots plot suspected outliers individually.
      - The 8 largest call lengths are 438, 465, 479, 700, 700, 951, 1148, 2631.
      - They are plotted as individual points, though 2 of them are identical and so do not appear separately.
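  • 20. Measuring Spread: The Standard Deviation
      The most common measure of spread looks at how far each observation is from the mean. This measure is called the standard deviation.
      - The standard deviation s measures the average distance of the observations from their mean.
      - It is calculated by averaging the squared distances from the mean (dividing by n − 1) and then taking the square root: $s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2}$
      - This average squared distance is called the variance: $s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$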
  • 21. Calculating The Standard Deviation
      1. Calculate the mean.
      2. Calculate each deviation: deviation = observation − mean.
      3. Square each deviation.
      4. Calculate the sum of the squared deviations.
      5. Divide by the degrees of freedom, df = n − 1; the result is the variance.
      6. Take the square root of the variance; this is the standard deviation.

      xi        xi − mean        (xi − mean)²
      1         1 − 5 = −4       (−4)² = 16
      3         3 − 5 = −2       (−2)² = 4
      4         4 − 5 = −1       (−1)² = 1
      4         4 − 5 = −1       (−1)² = 1
      4         4 − 5 = −1       (−1)² = 1
      5         5 − 5 = 0        (0)² = 0
      7         7 − 5 = 2        (2)² = 4
      8         8 − 5 = 3        (3)² = 9
      9         9 − 5 = 4        (4)² = 16
      Mean = 5  Sum = 0          Sum = 52

      Variance = 52/(9 − 1) = 6.5
      Standard deviation = √6.5 ≈ 2.55
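The six steps can be followed literally in Python on the nine values from this slide. This is a minimal sketch; the variable names are chosen here for readability.

```python
import math

data = [1, 3, 4, 4, 4, 5, 7, 8, 9]
n = len(data)

mean = sum(data) / n                        # step 1: mean = 5.0
deviations = [x - mean for x in data]       # step 2: observation - mean
squared = [d ** 2 for d in deviations]      # step 3: squared deviations
sum_sq = sum(squared)                       # step 4: 52.0
variance = sum_sq / (n - 1)                 # step 5: 52 / 8 = 6.5
std_dev = math.sqrt(variance)               # step 6: about 2.55

print(mean, sum(deviations), sum_sq)        # 5.0 0.0 52.0
print(variance, round(std_dev, 2))          # 6.5 2.55
```

The deviations always sum to zero, which is why they are squared before averaging. The standard library's `statistics.stdev` also divides by n − 1 and gives the same result.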
  • 22. Properties of The Standard Deviation
      - s measures spread about the mean and should be used only when the mean is the measure of center.
      - s = 0 only when all observations have the same value and there is no spread. Otherwise, s > 0.
      - s is not resistant to outliers.
      - s has the same units of measurement as the original observations.
  • 23. Choosing Measures of Center and Spread
      We now have a choice between two descriptions for center and spread:
      - Mean and Standard Deviation
      - Median and Interquartile Range
      The median and IQR are usually better than the mean and standard deviation for describing a skewed distribution or a distribution with outliers.
      Use mean and standard deviation only for reasonably symmetric distributions that don’t have outliers.
      NOTE: Numerical summaries do not fully describe the shape of a distribution. ALWAYS PLOT YOUR DATA FIRST!
  • 24. Changing the Unit of Measurement
      - Variables can be recorded in different units of measurement.
      - Most often, one measurement unit is a linear transformation of another measurement unit: xnew = a + bx.
      Example 1: If a distance x is measured in kilometers, the same distance in miles is xnew = 0.62x. This transformation changes the units without changing the origin: a distance of 0 kilometers is the same as a distance of 0 miles.
      Example 2: Temperatures can be expressed in degrees Fahrenheit or degrees Celsius. This transformation changes both the unit size and the origin of the measurements: the origin of the Celsius scale (0 °C, the temperature at which water freezes) is 32 °F on the Fahrenheit scale.
  • 25. Changing the Unit of Measurement (Cont…)
      - Linear transformations do not change the basic shape of a distribution (skew, symmetry).
      - But they do change the measures of center and spread:
          - Multiplying each observation by a positive number b multiplies both measures of center (mean, median) and spread (IQR, s) by b.
          - Adding the same number a (positive or negative) to each observation adds a to measures of center and to quartiles but it does not change measures of spread (IQR, s).
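These two rules can be checked numerically. The sketch below applies xnew = a + bx with a = 32 and b = 1.8 (the Celsius-to-Fahrenheit conversion) to the nine values used in the standard deviation example. Treating those values as temperatures is an assumption made here for illustration, and the helper `iqr` (median-of-halves quartiles) is a name introduced for this sketch.

```python
import math
from statistics import mean, median, stdev


def iqr(values):
    """Interquartile range using the median-of-halves rule; needs n >= 2."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    return median(ordered[n - half:]) - median(ordered[:half])


x = [1, 3, 4, 4, 4, 5, 7, 8, 9]
a, b = 32, 1.8                      # xnew = a + b*x, with b > 0
x_new = [a + b * xi for xi in x]

# Center: shifted by a and scaled by b.
center_ok = (math.isclose(mean(x_new), a + b * mean(x))
             and math.isclose(median(x_new), a + b * median(x)))

# Spread: scaled by b only; adding a has no effect.
spread_ok = (math.isclose(stdev(x_new), b * stdev(x))
             and math.isclose(iqr(x_new), b * iqr(x)))

print(center_ok, spread_ok)  # True True
```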