Descriptive Statistics
Krupnik Estate Agents
Anthony J. Evans
Professor of Economics, ESCP Europe
www.anthonyjevans.com
(cc) Anthony J. Evans 2019 | http://creativecommons.org/licenses/by-nc-sa/3.0/
Weekly prices of studio apartments in West Hampstead
425 440 450 465 480 510 575
430 440 450 470 485 515 575
430 440 450 470 490 525 580
435 445 450 472 490 525 590
435 445 450 475 490 525 600
435 445 460 475 500 535 600
435 445 460 475 500 549 600
435 445 460 480 500 550 600
440 450 465 480 500 570 615
440 450 465 480 510 570 615
Weekly prices of studio apartments in West Hampstead (2)
• The first task is to order the data set
425 430 430 435 435 435 435 435 440 440
440 440 440 445 445 445 445 445 450 450
450 450 450 450 450 460 460 460 465 465
465 470 470 472 475 475 475 480 480 480
480 485 490 490 490 500 500 500 500 510
510 515 525 525 525 535 549 550 570 570
575 575 580 590 600 600 600 600 615 615
Foundations of descriptive statistics
Measures of data: Description
• Location
– A “typical” or average value
– Used to summarise the distribution
• Dispersion
– The spread or variability of the data
– Appreciate the differences in the data
Measures of Location: Mean
• The mean of a data set is the summation of all individual
values, divided by the number of observations
Notice the use of Greek/upper case for populations and Latin/lower case for samples
Sample: $\bar{x} = \frac{\sum x_i}{n} = \frac{34{,}356}{70} = 490.80$
Population: $\mu = \frac{\sum x_i}{N}$
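As a quick check of the worked figure, the sample mean can be recomputed in a few lines of Python. This is only a minimal sketch; the names `rents`, `total` and `mean` are illustrative and not part of the original slides.

```python
# Weekly studio rents from the slides, in ascending order (n = 70).
rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

n = len(rents)       # number of observations
total = sum(rents)   # summation of all individual values
mean = total / n     # sample mean (x-bar)

print(n, total, round(mean, 2))
```

The sum and count should agree with the 34,356 and 70 used on the slide.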
Measures of Location: Median
• The median of a data set is the value that divides the lower half of
the distribution from the higher half
• The median is the middle observation
– i.e. the ((n+1)/2)th observation
– In this case, 71/2 = 35.5, so the median lies between the 35th and 36th observations
• If there is an even number of observations, take the mean of
both middle values
• Median = 475
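The median rule above can be sketched as a small Python function. The function name `median` and the list name `rents` are our own labels; the logic simply takes the middle value, or the mean of the two middle values when n is even.

```python
def median(values):
    """Middle observation, or the mean of the two middle ones if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # odd n: single middle value
    return (ordered[mid - 1] + ordered[mid]) / 2  # even n: average the pair

# Weekly studio rents from the slides (n = 70).
rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

# With n = 70 the 35th and 36th values (both 475) are averaged.
print(median(rents))
```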
Measures of Location: Mode
• The mode of a data set is the value that occurs with the
greatest frequency.
• If the data have exactly two modes, the data are bimodal
• If the data have more than two modes, the data are
multimodal
• Mode = 450
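One way to find the mode, and to see whether the data are bimodal or multimodal, is to count how often each value occurs. The sketch below uses Python's `collections.Counter`; the variable names and the label "unimodal" for the single-mode case are our own additions.

```python
from collections import Counter

# Weekly studio rents from the slides (n = 70).
rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

counts = Counter(rents)                 # value -> frequency
highest = max(counts.values())          # the greatest frequency
modes = sorted(v for v, c in counts.items() if c == highest)

if len(modes) == 1:
    shape = "unimodal"
elif len(modes) == 2:
    shape = "bimodal"
else:
    shape = "multimodal"

print(modes, highest, shape)
```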
Measures of Location: Using Excel
Measures of Location: Percentiles
• The median is also known as the 50th percentile
• The pth percentile of a data set is a value such
that p percent of the items take on this value or
less, and (100 - p) percent of the items take on this
value or more
– Arrange the data of ‘n’ items in ascending order
– Compute index i, the position of the pth
percentile
– If i is not an integer, round up. The pth
percentile is the value in the ith position.
– If i is an integer, the pth percentile is the
average of the values in positions i and i+1
• i is the position of the pth percentile:
$i = \left(\frac{p}{100}\right) n$
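The percentile procedure above can be sketched in Python as follows. The function name `percentile` is our own, and the sketch assumes p is a whole number strictly between 0 and 100. Library routines often use other interpolation conventions, so their results may differ slightly from this rule.

```python
def percentile(values, p):
    """pth percentile using the round-up / average-of-two rule.

    Assumes p is a whole number with 0 < p < 100.
    """
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    ordered = sorted(values)               # arrange the data in ascending order
    n = len(ordered)
    whole, remainder = divmod(p * n, 100)  # i = (p/100) * n, kept exact
    if remainder != 0:
        # i is not an integer: round up to position whole + 1
        # (0-based index `whole`) and take that value.
        return ordered[whole]
    # i is an integer: average the values in positions i and i + 1.
    return (ordered[whole - 1] + ordered[whole]) / 2

# Weekly studio rents from the slides (n = 70).
rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

for p in (25, 50, 90):
    print(p, percentile(rents, p))
```

Integer arithmetic (`divmod`) is used so that the test "is i an integer?" is not affected by floating-point rounding.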
Measures of Location: 90th Percentile
• i = (p/100)n = (90/100)*70 = 63
• i is an integer, so the percentile is the average of the 63rd and 64th data values
• (580 + 590)/2 = 585
• 90th Percentile = 585
Measures of Location: 25th Percentile
• i = (p/100)n = (25/100)*70 = 17.5
• i is not an integer, so round up: it will be the 18th value
• 25th Percentile = 445
GDP of UN member countries
[Bar chart of GDP: Mean = $267,015; Median = $18,171]
On average, how many legs do English people have?
Measures of Dispersion
• In choosing supplier A or supplier B we should consider not
only the average delivery time for each, but also the
variability in delivery time for each
[Figure: frequency distributions for A (mean = 0) and B (mean = 0)]
Measures of Dispersion: Range
• The range of a data set is the difference between the
largest and smallest data values
• Range = largest value - smallest value
• Range = 615 - 425 = 190
Measures of Dispersion: Interquartile range
• The interquartile range of a data set is the difference
between the first and third quartiles
• Q1 = 25th percentile = 445 (from before)
• Q3 = 75th percentile = 525 (i = 52.5, so the 53rd value)
• Interquartile range = 525 - 445 = 80
$i = \left(\frac{p}{100}\right) n$
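The range and interquartile range can be checked together in a short Python sketch. The helper `percentile` is our own illustrative implementation of the round-up / average rule from the percentile slides (assuming a whole-number p between 0 and 100); it is not taken from a library.

```python
def percentile(values, p):
    """pth percentile by the slides' rule; assumes whole p with 0 < p < 100."""
    ordered = sorted(values)
    whole, remainder = divmod(p * len(ordered), 100)  # i = (p/100) * n
    if remainder != 0:
        return ordered[whole]                         # round i up
    return (ordered[whole - 1] + ordered[whole]) / 2  # average i and i + 1

# Weekly studio rents from the slides (n = 70).
rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

data_range = max(rents) - min(rents)  # largest value - smallest value
q1 = percentile(rents, 25)            # first quartile
q3 = percentile(rents, 75)            # third quartile
iqr = q3 - q1                         # interquartile range

print(data_range, q1, q3, iqr)
```

The range depends only on the two extreme values, whereas the interquartile range describes the spread of the middle half of the data.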
Measures of Dispersion: Variance
• The variance is a measure of variability that utilizes all the
data.
• It is based on the difference between the value of each
observation (xi) and the mean (x̄ for a sample, and µ
for a population)
• The variance is the average of the squared differences
between each data value and the mean.
Sample: $s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$
Population: $\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$
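The only difference between the two variance formulas is the divisor. A minimal Python sketch, with function names of our own choosing, makes that explicit. Treating the 70 rents as a whole population in the second call is purely for illustration.

```python
def sample_variance(values):
    """s^2: squared deviations from the sample mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)

def population_variance(values):
    """sigma^2: squared deviations from the population mean, divided by N."""
    size = len(values)
    mean = sum(values) / size
    return sum((x - mean) ** 2 for x in values) / size

# Weekly studio rents from the slides (n = 70).
rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

print(round(sample_variance(rents), 2))
print(round(population_variance(rents), 2))
```

Python's standard `statistics` module offers `variance` and `pvariance` for the same two quantities.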
Measures of Dispersion: Standard Deviation
• The standard deviation of a data set is the square root of
the variance.
• It is more easily comparable to the mean than the variance
– standard deviation measures the spread about the mean
using the original (not squared) scale
• It ties into the Normal Distribution
Sample: $s = \sqrt{s^2} = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}$
Population: $\sigma = \sqrt{\sigma^2}$
What is the standard deviation?
 i     xi    xi - x̄    (xi - x̄)²
 1    425    -65.8      4329.64
 2    440    -50.8      2580.64
 3    450    -40.8      1664.64
 4    465    -25.8       665.64
 5    480    -10.8       116.64
 6    510     19.2       368.64
 7    575     84.2      7089.64
 8    430    -60.8      3696.64
 9    440    -50.8      2580.64
10    450    -40.8      1664.64
 …
65    450    -40.8      1664.64
66    465    -25.8       665.64
67    480    -10.8       116.64
68    510     19.2       368.64
69    570     79.2      6272.64
70    615    124.2     15425.64
sum                   206,735.20
/(n-1)                  2,996.16
sqrt                       54.74
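The table can be reproduced step by step in Python: compute each deviation from the mean, square it, sum the squares, divide by n - 1 and take the square root. This is a sketch with our own variable names; the data are listed in the original (unsorted) order so that the row numbers line up with the table.

```python
import math

# Weekly studio rents in the order they appear on the first data slide.
rents = [
    425, 440, 450, 465, 480, 510, 575,
    430, 440, 450, 470, 485, 515, 575,
    430, 440, 450, 470, 490, 525, 580,
    435, 445, 450, 472, 490, 525, 590,
    435, 445, 450, 475, 490, 525, 600,
    435, 445, 460, 475, 500, 535, 600,
    435, 445, 460, 475, 500, 549, 600,
    435, 445, 460, 480, 500, 550, 600,
    440, 450, 465, 480, 500, 570, 615,
    440, 450, 465, 480, 510, 570, 615,
]

n = len(rents)
mean = sum(rents) / n

# One row per observation: (i, xi, deviation, squared deviation).
rows = [(i, x, x - mean, (x - mean) ** 2) for i, x in enumerate(rents, start=1)]

sum_sq = sum(row[3] for row in rows)  # sum of squared deviations
variance = sum_sq / (n - 1)           # sample variance
std_dev = math.sqrt(variance)         # sample standard deviation

for i, x, dev, sq in rows[:3]:
    print(f"{i:>2} {x} {dev:7.1f} {sq:10.2f}")
print(round(sum_sq, 2), round(variance, 2), round(std_dev, 2))
```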
Measures of Dispersion: Examples
• Variance: $s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = 2{,}996$
• Standard Deviation: $s = \sqrt{s^2} = \sqrt{2{,}996} = 54.74$
Measures of Dispersion: z Score
• The z-score is the standardised value
• It denotes the number of standard deviations a
data value xi is from the mean
• A data value less than the sample mean will
have a z-score less than zero
• A data value greater than the sample mean
will have a z-score greater than zero
$z_i = \frac{x_i - \bar{x}}{s}$
Measures of Dispersion: z Score for smallest value
• z-Score of Smallest Value (425)
• Standardized Values for Apartment Rents:
$z = \frac{x_i - \bar{x}}{s} = \frac{425 - 490.80}{54.74} = -1.20$
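To standardise every rent rather than just the smallest, the z-score formula can be applied to the whole list. The sketch below uses `statistics.mean` and `statistics.stdev` (the sample standard deviation) from Python's standard library; the variable names are our own.

```python
import statistics

# Weekly studio rents from the slides, in ascending order (n = 70).
rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

mean = statistics.mean(rents)  # sample mean
s = statistics.stdev(rents)    # sample standard deviation (n - 1 divisor)

# z-score: number of standard deviations each value lies from the mean.
z_scores = [(x - mean) / s for x in rents]

print(round(z_scores[0], 2))   # smallest value, 425
print(round(z_scores[-1], 2))  # largest value, 615
```

Values below the mean get negative z-scores and values above it get positive ones, so the z-scores of a data set always sum to zero (up to rounding).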
Measures of Dispersion: Outliers
• An outlier is an unusually small or unusually large value in
a data set
– It might be an incorrectly recorded data value
– It might be a data value that was incorrectly included
in the data set
– It might be a correctly recorded data value that
belongs in the data set!
• A data value with a z-score less than -3 or greater than +3
might be considered an outlier
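The ±3 rule of thumb can be sketched as a small screening function. The name `z_outliers` and the threshold parameter are our own, and the value 4750 in the second call is a hypothetical data-entry error (475 typed with an extra zero) that does not appear in the original data.

```python
import statistics

def z_outliers(values, threshold=3.0):
    """Return the values whose z-score is below -threshold or above +threshold."""
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation
    return [x for x in values if abs((x - mean) / s) > threshold]

# Weekly studio rents from the slides (n = 70).
rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

# The original data: no z-score lies outside -3 to +3.
print(z_outliers(rents))

# Hypothetical mis-recorded value added for illustration only.
print(z_outliers(rents + [4750]))
```

A flagged value is only a candidate: as the slide notes, it may still be a correctly recorded observation that belongs in the data set.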
Summary
• There are two main ways to get a feel for a set of numbers (a
distribution) – location and dispersion
• The mean and the standard deviation are the most
commonly used measures of location and dispersion, but it’s
important to understand the alternatives
• This presentation forms part of a free, online course
on analytics
• http://econ.anthonyjevans.com/courses/analytics/
33

Más contenido relacionado

Similar a Descriptive Statistics

슬로우캠퍼스: scikit-learn & 머신러닝 (강박사)
슬로우캠퍼스:  scikit-learn & 머신러닝 (강박사)슬로우캠퍼스:  scikit-learn & 머신러닝 (강박사)
슬로우캠퍼스: scikit-learn & 머신러닝 (강박사)마이캠퍼스
 
Data_Analysis_SPSS_Technique.ppt
Data_Analysis_SPSS_Technique.pptData_Analysis_SPSS_Technique.ppt
Data_Analysis_SPSS_Technique.pptRiyadhJack
 
Measure of Variability Report.pptx
Measure of Variability Report.pptxMeasure of Variability Report.pptx
Measure of Variability Report.pptxCalvinAdorDionisio
 
Measures of variability
Measures of variabilityMeasures of variability
Measures of variabilityJed Abolencia
 
Measure of central tendency
Measure of central tendency Measure of central tendency
Measure of central tendency Kannan Iyanar
 
Revisionf2
Revisionf2Revisionf2
Revisionf2wind12
 
Automatic Visualization
Automatic VisualizationAutomatic Visualization
Automatic VisualizationSri Ambati
 
Ee184405 statistika dan stokastik statistik deskriptif 2 numerik
Ee184405 statistika dan stokastik   statistik deskriptif 2 numerikEe184405 statistika dan stokastik   statistik deskriptif 2 numerik
Ee184405 statistika dan stokastik statistik deskriptif 2 numerikyusufbf
 
Measures of dispersion
Measures of dispersionMeasures of dispersion
Measures of dispersionMayuri Joshi
 
3.3 Measures of relative standing and boxplots
3.3 Measures of relative standing and boxplots3.3 Measures of relative standing and boxplots
3.3 Measures of relative standing and boxplotsLong Beach City College
 
Penggambaran Data Secara Numerik
Penggambaran Data Secara NumerikPenggambaran Data Secara Numerik
Penggambaran Data Secara Numerikanom1392
 

Similar a Descriptive Statistics (20)

Multivariate analysis
Multivariate analysisMultivariate analysis
Multivariate analysis
 
슬로우캠퍼스: scikit-learn & 머신러닝 (강박사)
슬로우캠퍼스:  scikit-learn & 머신러닝 (강박사)슬로우캠퍼스:  scikit-learn & 머신러닝 (강박사)
슬로우캠퍼스: scikit-learn & 머신러닝 (강박사)
 
Multivariate Analysis
Multivariate AnalysisMultivariate Analysis
Multivariate Analysis
 
Data_Analysis_SPSS_Technique.ppt
Data_Analysis_SPSS_Technique.pptData_Analysis_SPSS_Technique.ppt
Data_Analysis_SPSS_Technique.ppt
 
Measure of Variability Report.pptx
Measure of Variability Report.pptxMeasure of Variability Report.pptx
Measure of Variability Report.pptx
 
Measures of variability
Measures of variabilityMeasures of variability
Measures of variability
 
Session 3&4.pptx
Session 3&4.pptxSession 3&4.pptx
Session 3&4.pptx
 
Measure of central tendency
Measure of central tendency Measure of central tendency
Measure of central tendency
 
SP and R.pptx
SP and R.pptxSP and R.pptx
SP and R.pptx
 
Revisionf2
Revisionf2Revisionf2
Revisionf2
 
Automatic Visualization
Automatic VisualizationAutomatic Visualization
Automatic Visualization
 
Ee184405 statistika dan stokastik statistik deskriptif 2 numerik
Ee184405 statistika dan stokastik   statistik deskriptif 2 numerikEe184405 statistika dan stokastik   statistik deskriptif 2 numerik
Ee184405 statistika dan stokastik statistik deskriptif 2 numerik
 
Measures of Dispersion.pptx
Measures of Dispersion.pptxMeasures of Dispersion.pptx
Measures of Dispersion.pptx
 
Measures of dispersion
Measures of dispersionMeasures of dispersion
Measures of dispersion
 
Normal distri
Normal distriNormal distri
Normal distri
 
statics in research
statics in researchstatics in research
statics in research
 
3.3 Measures of relative standing and boxplots
3.3 Measures of relative standing and boxplots3.3 Measures of relative standing and boxplots
3.3 Measures of relative standing and boxplots
 
Descriptive Analysis.pptx
Descriptive Analysis.pptxDescriptive Analysis.pptx
Descriptive Analysis.pptx
 
Penggambaran Data Secara Numerik
Penggambaran Data Secara NumerikPenggambaran Data Secara Numerik
Penggambaran Data Secara Numerik
 
lecture_4.pptx
lecture_4.pptxlecture_4.pptx
lecture_4.pptx
 

Más de Anthony J. Evans (16)

Time Series
Time SeriesTime Series
Time Series
 
The Suitcase Case
The Suitcase CaseThe Suitcase Case
The Suitcase Case
 
Correlation
Correlation Correlation
Correlation
 
Nonparametric Statistics
Nonparametric StatisticsNonparametric Statistics
Nonparametric Statistics
 
Student's T Test
Student's T TestStudent's T Test
Student's T Test
 
Significance Tests
Significance TestsSignificance Tests
Significance Tests
 
Taxi for Professor Evans
Taxi for Professor EvansTaxi for Professor Evans
Taxi for Professor Evans
 
Inferential Statistics
Inferential StatisticsInferential Statistics
Inferential Statistics
 
Probability Distributions
Probability Distributions Probability Distributions
Probability Distributions
 
Probability Theory
Probability Theory Probability Theory
Probability Theory
 
Statistical Literacy
Statistical Literacy Statistical Literacy
Statistical Literacy
 
Quantitative Methods
Quantitative Methods Quantitative Methods
Quantitative Methods
 
Collecting and Presenting Data
Collecting and Presenting DataCollecting and Presenting Data
Collecting and Presenting Data
 
Numeracy Skills 1
Numeracy Skills 1Numeracy Skills 1
Numeracy Skills 1
 
The Dynamic AD AS Model
The Dynamic AD AS ModelThe Dynamic AD AS Model
The Dynamic AD AS Model
 
Numeracy Skills 2
Numeracy Skills 2Numeracy Skills 2
Numeracy Skills 2
 

Último

FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfMarinCaroMartnezBerg
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130Suhani Kapoor
 
Ukraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSUkraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSAishani27
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998YohFuh
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxolyaivanovalion
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysismanisha194592
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Callshivangimorya083
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...Suhani Kapoor
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAroojKhan71
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfLars Albertsson
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxolyaivanovalion
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxolyaivanovalion
 
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiLow Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiSuhani Kapoor
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxolyaivanovalion
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxolyaivanovalion
 

Último (20)

FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
 
Ukraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSUkraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICS
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFx
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysis
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
 
Sampling (random) method and Non random.ppt
Sampling (random) method and Non random.pptSampling (random) method and Non random.ppt
Sampling (random) method and Non random.ppt
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdf
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptx
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptx
 
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
 
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiLow Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptx
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptx
 
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in  KishangarhDelhi 99530 vip 56974 Genuine Escort Service Call Girls in  Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
 

Descriptive Statistics

  • 1. Descriptive Statistics Krupnik Estate Agents Anthony J. Evans Professor of Economics, ESCP Europe www.anthonyjevans.com (cc) Anthony J. Evans 2019 | http://creativecommons.org/licenses/by-nc-sa/3.0/
  • 2. Weekly prices of studio apartments in West Hampstead 2 425 440 450 465 480 510 575 430 440 450 470 485 515 575 430 440 450 470 490 525 580 435 445 450 472 490 525 590 435 445 450 475 490 525 600 435 445 460 475 500 535 600 435 445 460 475 500 549 600 435 445 460480500 550 600 440 450 465 480 500 570 615 440 450465 480 510 570 615
  • 3. Weekly prices of studio apartments in West Hampstead (2) • The first task is to order the data set 3 425 430 430 435 435 435 435 435 440 440 440 440 440 445 445 445 445 445 450 450 450 450 450 450 450 460 460 460 465 465 465 470 470 472 475 475 475 480 480 480 480 485 490 490 490 500 500 500 500 510 510 515 525 525 525 535 549 550 570 570 575 575 580 590 600 600 600 600 615 615
  • 4. Foundations of descriptive statistics Measures of data Location Dispersion Description • A “typical” or average value • Used to summarise the distribution • The spread or variability of the data • Appreciate the differences in the data 4
  • 5. Measures of Location: Mean • The mean of a data set is the summation of all individual values, divided by the number of observations SamplePopulation 5Notice the use of Greek/upper case for populations and Latin/lower case for samples x = xi∑ n = 34,356 70 = 490.80 € µ = x∑ i N € x = x∑ i n
  • 6. Measures of Location: Median • The median of a data set is the value that divides the lower half of the distribution from the higher half • The median is the middle observation – i.e. the (n+1)/2th observation – In this case, 71/2 = 35.5th observation • If there are an even number of observations, take the mean of both middle values • Median = 475 6 425 430 430 435 435 435 435 435 440 440 440 440 440 445 445 445 445 445 450 450 450 450 450 450 450 460 460 460 465 465 465 470 470 472 475 475 475 480 480 480 480 485 490 490 490 500 500 500 500 510 510 515 525 525 525 535 549 550 570 570 575 575 580 590 600 600 600 600 615 615
  • 7. Measures of Location: Mode • The mode of a data set is the value that occurs with the greatest frequency. • If the data have exactly two modes, the data are bimodal • If the data have more than two modes, the data are multimodal • Mode = 450 7 425 430 430 435 435 435 435 435 440 440 440 440 440 445 445 445 445 445 450 450 450 450 450 450 450 460 460 460 465 465 465 470 470 472 475 475 475 480 480 480 480 485 490 490 490 500 500 500 500 510 510 515 525 525 525 535 549 550 570 570 575 575 580 590 600 600 600 600 615 615
  • 8. Measures of Location: Using Excel 8
  • 9. Measures of Location: Percentiles • The median is also known as the 50th percentile • The p th percentile of a data set is a value such that p percent of the items take on this value or less, and (100 - p) percent of the items take on this value or more – Arrange the data of ‘n’ items in ascending order – Compute index i, the position of the pth percentile – If i is not an integer, round up. The pth percentile is the value in the ith position. – If i is an integer, the pth percentile is the average of the values in positions i and i+1 • I is the position of the p percentile 9 € i = p 100 " # $ % & 'n
  • 10. Measures of Location: 90th Percentile • i = (p/100)n = (90/100)*70 = 63 • It will be the 63rd value • Averaging the 63rd and 64th data values • 90th Percentile = 585 10 425 430 430 435 435 435 435 435 440 440 440 440 440 445 445 445 445 445 450 450 450 450 450 450 450 460 460 460 465 465 465 470 470 472 475 475 475 480 480 480 480 485 490 490 490 500 500 500 500 510 510 515 525 525 525 535 549 550 570 570 575 575 580 590 600 600 600 600 615 615
  • 11. Measures of Location: 25th Percentile • i = (p/100)n = (25/100)*70 = 17.5 • It will be the 18th value • 25th Percentile = 445 11 425 430 430 435 435 435 435 435 440 440 440 440 440 445 445 445 445 445 450 450 450 450 450 450 450 460 460 460 465 465 465 470 470 472 475 475 475 480 480 480 480 485 490 490 490 500 500 500 500 510 510 515 525 525 525 535 549 550 570 570 575 575 580 590 600 600 600 600 615 615
  • 12. GDP of UN member countries 12 $267,015 $18,171 0 50,000 100,000 150,000 200,000 250,000 300,000 Mean Median GDP On average, how many legs do English people have?
  • 13. Measures of Dispersion • in choosing supplier A or supplier B we should consider not only the average delivery time for each, but also the variability in delivery time for each A Mean = 0 B Mean = 0 Frequency 13
  • 14. Measures of Dispersion: Range • The range of a data set is the difference between the largest and smallest data values • Range = largest value - smallest value • Range = 615 - 425 = 190 14 425 430 430 435 435 435 435 435 440 440 440 440 440 445 445 445 445 445 450 450 450 450 450 450 450 460 460 460 465 465 465 470 470 472 475 475 475 480 480 480 480 485 490 490 490 500 500 500 500 510 510 515 525 525 525 535 549 550 570 570 575 575 580 590 600 600 600 600 615 615
  • 15. Measures of Dispersion: Interquartile range • The interquartile range of a data set is the difference between the first and third quartiles • Q1 = 25th percentile = 445 (from before) • Q3 = 75th percentile = 525 • Interquartile range = 525 - 445 = 80 15 425 430 430 435 435 435 435 435 440 440 440 440 440 445 445 445 445 445 450 450 450 450 450 450 450 460 460 460 465 465 465 470 470 472 475 475 475 480 480 480 480 485 490 490 490 500 500 500 500 510 510 515 525 525 525 535 549 550 570 570 575 575 580 590 600 600 600 600 615 615 € i = p 100 " # $ % & 'n
  • 16. Measures of Dispersion: Variance • The variance is a measure of variability that utilizes all the data. • It is based on the difference between the value of each observation (xi) and the mean (x BAR for a sample, and m for a population) • The variance is the average of the squared differences between each data value and the mean. s 2 = (xi−x )2 ∑ n−1 σ2 = (xi −µ)2 ∑ N 16 SamplePopulation
  • 17. Measures of Dispersion: Standard Deviation • The standard deviation of a data set is the square root of the variance • It is more easily comparable to the mean than the variance: standard deviation measures the spread about the mean using the original (not squared) scale • It ties into the Normal Distribution • Sample: $s = \sqrt{s^2} = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n-1}}$ • Population: $\sigma = \sqrt{\sigma^2}$
  • 18. What is the standard deviation? • Worked calculation with x̄ = 490.80 (rows 11 to 64 are not shown on the slide):
       i     xi    xi - x̄    (xi - x̄)²
       1    425     -65.8      4329.64
       2    440     -50.8      2580.64
       3    450     -40.8      1664.64
       4    465     -25.8       665.64
       5    480     -10.8       116.64
       6    510      19.2       368.64
       7    575      84.2      7089.64
       8    430     -60.8      3696.64
       9    440     -50.8      2580.64
      10    450     -40.8      1664.64
      ...
      65    450     -40.8      1664.64
      66    465     -25.8       665.64
      67    480     -10.8       116.64
      68    510      19.2       368.64
      69    570      79.2      6272.64
      70    615     124.2     15425.64
      Sum of squared deviations:        206,735.20
      Divided by (n - 1) = 69:            2,996.16
      Square root (standard deviation):      54.74
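The three summary rows of the table can be reproduced step by step. This sketch uses the ordered data rather than the original row order (the totals do not depend on order); the variable names are illustrative.

```python
import math

# Ordered weekly rents from the West Hampstead data set (70 observations).
RENTS = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

n = len(RENTS)
mean = sum(RENTS) / n

# Column 4 of the table: squared deviation of each observation from the mean.
squared_deviations = [(x - mean) ** 2 for x in RENTS]

sum_sq = sum(squared_deviations)         # "sum" row
sample_variance = sum_sq / (n - 1)       # "/(n-1)" row
sample_sd = math.sqrt(sample_variance)   # "sqrt" row

print(f"mean = {mean:.2f}")
print(f"sum of squared deviations = {sum_sq:,.2f}")
print(f"variance = {sample_variance:,.2f}")
print(f"standard deviation = {sample_sd:.2f}")
```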
  • 19. Measures of Dispersion: Examples • Variance: $s^2 = \frac{\sum (x_i - \bar{x})^2}{n-1} = 2{,}996$ • Standard Deviation: $s = \sqrt{s^2} = \sqrt{2{,}996} = 54.74$
  • 20. Measures of Dispersion: z Score • The z-score is the standardised value • It denotes the number of standard deviations a data value $x_i$ is from the mean • A data value less than the sample mean will have a z-score less than zero • A data value greater than the sample mean will have a z-score greater than zero • $z_i = \frac{x_i - \bar{x}}{s}$
  • 21. Measures of Dispersion: z Score for smallest value • z-Score of Smallest Value (425) • Standardized value for the apartment rents: $z = \frac{x_i - \bar{x}}{s} = \frac{425 - 490.80}{54.74} = -1.20$
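The same calculation as a short sketch, plugging in the mean (490.80) and standard deviation (54.74) reported on the earlier slides. The function name `z_score` is an illustrative choice.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd


z_min = z_score(425, mean=490.80, sd=54.74)
print(round(z_min, 2))
```

The negative sign confirms that the smallest rent lies below the mean, a little over one standard deviation away.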
  • 22. Measures of Dispersion: Outliers • An outlier is an unusually small or unusually large value in a data set – It might be an incorrectly recorded data value – It might be a data value that was incorrectly included in the data set – It might be a correctly recorded data value that belongs in the data set! • A data value with a z-score less than -3 or greater than +3 might be considered an outlier
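Applying the ±3 rule of thumb to the apartment rents is a one-line filter once the z-scores are available. A minimal sketch using the standard library (variable names are illustrative; `stdev` is the sample standard deviation):

```python
from statistics import mean, stdev

# Ordered weekly rents from the West Hampstead data set (70 observations).
RENTS = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

rent_mean = mean(RENTS)
rent_sd = stdev(RENTS)

z_scores = [(x - rent_mean) / rent_sd for x in RENTS]

# Flag any rent more than three standard deviations from the mean.
outliers = [x for x, z in zip(RENTS, z_scores) if abs(z) > 3]

print(f"z-scores run from {min(z_scores):.2f} to {max(z_scores):.2f}")
print("Possible outliers:", outliers)
```

For this data set no value crosses the threshold, so none of the rents would be flagged. A flagged value would still need checking by hand, since it may be a legitimate observation.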
  • 23. Summary • There are two main ways to get a feel for a set of numbers (a distribution): location and dispersion • The mean and the standard deviation are the most frequently used measures of location and dispersion, but it’s important to understand the alternatives
  • 25. Measures of Location: Mean • The mean of a data set is the sum of all individual values, divided by the number of observations • Sample: $\bar{x} = \frac{\sum x_i}{n}$ • Population: $\mu = \frac{\sum x_i}{N}$ • For the apartment rents: $\bar{x} = \frac{\sum x_i}{n} = \frac{34{,}356}{70} = 490.80$ • Notice the use of Greek/upper case for populations and Latin/lower case for samples
  • 26. Measures of Location: Median • The median of a data set is the value that divides the lower half of the distribution from the higher half • The median is the middle observation – i.e. the (n+1)/2th observation – In this case, 71/2 = 35.5th observation • If there are an even number of observations, take the mean of both middle values (here the 35th and 36th values, both 475) • Median = 475
  • 27. Measures of Location: Mode • The mode of a data set is the value that occurs with the greatest frequency • If the data have exactly two modes, the data are bimodal • If the data have more than two modes, the data are multimodal • Mode = 450
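Counting frequencies makes the mode easy to verify. The sketch below uses `collections.Counter` and `statistics.multimode` (Python 3.8+), which returns every value tied for the highest frequency and so also covers the bimodal and multimodal cases. Variable names are illustrative.

```python
from collections import Counter
from statistics import multimode

# Ordered weekly rents from the West Hampstead data set (70 observations).
RENTS = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

counts = Counter(RENTS)    # frequency of each distinct rent
modes = multimode(RENTS)   # list of all values with the top frequency

print("Modes:", modes)
print("Frequency of the mode:", counts[modes[0]])
```

A list with one entry means the data are unimodal; two entries would indicate bimodal data.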
  • 28. Measures of Location: 90th Percentile • i = (p/100)n = (90/100) × 70 = 63 • Because i is an integer, average the 63rd and 64th data values (580 and 590) • 90th Percentile = 585
  • 33. This presentation forms part of a free, online course on analytics • http://econ.anthonyjevans.com/courses/analytics/