Standard deviation and variance
1. REPRESENTATION AND SUMMARY OF DATA 37
VARIABILITY OF DATA
Each of these sets of numbers has a mean of 7 but the spread of each set is different:
(a) 7, 7, 7, 7, 7
(b) 4, 6, 6.5, 7.2, 11.3
(c) -193, -46, 28, 69, 177
There is no variability in set (a), but the numbers in set (c) are obviously much more spread
out than those in set (b).
There are various ways of measuring the variability or spread of a distribution, two of which
are described here.
The range
The range is based entirely on the extreme values of the distribution.
Range = highest value - lowest value
In (a) the range = 7 - 7 = 0
In (b) the range = 11.3 - 4 = 7.3
In (c) the range = 177 - (-193) = 370
Note that there are also ranges based on particular observations within the data and these
percentile and quartile ranges are considered on page 68.
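The range calculation for the three sets can be checked with a few lines of Python. This is only a sketch; the helper name `data_range` is chosen here for illustration and is not part of the text.

```python
def data_range(values):
    """Return the range: highest value minus lowest value."""
    return max(values) - min(values)


# The three sets from the text, each with mean 7.
set_a = [7, 7, 7, 7, 7]
set_b = [4, 6, 6.5, 7.2, 11.3]
set_c = [-193, -46, 28, 69, 177]

for name, data in (("a", set_a), ("b", set_b), ("c", set_c)):
    # Rounded to 1 d.p. to hide floating-point noise in set (b).
    print(name, round(data_range(data), 1))
```

Because only the two extreme values are used, a single unusual reading changes the range completely, which is one reason to prefer a measure based on every value.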
THE STANDARD DEVIATION, $s$, AND THE VARIANCE, $s^2$
The standard deviation, $s$, is a very important and useful measure of spread. It gives a measure
of the deviations of the readings from the mean, $\bar{x}$. It is calculated using all the values in the
distribution. To calculate $s$:
• for each reading $x$, calculate $x - \bar{x}$, its deviation from the mean,
• square this deviation to give $(x - \bar{x})^2$ and note that, irrespective of whether the deviation
was positive or negative, this is never negative,
• find $\sum (x - \bar{x})^2$, the sum of all these values,
• find the average by dividing the sum by $n$, the number of readings; this gives
$\frac{\sum (x - \bar{x})^2}{n}$ and is known as the variance,
• finally take the positive square root of the variance to obtain the standard deviation, $s$.
The standard deviation, $s$, of a set of $n$ numbers, with mean $\bar{x}$, is given by
$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}$$
Each of the three sets of numbers given earlier has mean 7, i.e. $\bar{x} = 7$.
(a) For the set 7, 7, 7, 7, 7
Since $x - \bar{x} = 7 - 7 = 0$ for every reading, $s = 0$, indicating that there is no deviation from
the mean.
A CONCISE COURSE IN A-LEVEL STATISTICS
(b) For the set 4, 6, 6.5, 7.2, 11.3
$\sum (x - \bar{x})^2 = (4 - 7)^2 + (6 - 7)^2 + (6.5 - 7)^2 + (7.2 - 7)^2 + (11.3 - 7)^2 = 28.78$
$s = \sqrt{\frac{28.78}{5}} = 2.4$ (1 d.p.)
(c) For the set -193, -46, 28, 69, 177
$\sum (x - \bar{x})^2 = (-193 - 7)^2 + (-46 - 7)^2 + (28 - 7)^2 + (69 - 7)^2 + (177 - 7)^2 = 75\,994$
$s = \sqrt{\frac{75\,994}{5}} = 123.3$ (1 d.p.)
Notice that set (c) has a much higher standard deviation than set (b), confirming that it is
much more spread about the mean.
Remember that
Standard deviation = $\sqrt{\text{variance}}$
Variance = (standard deviation)$^2$
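The steps for calculating $s$ translate directly into code. The following is a minimal sketch, assuming the divisor $n$ used throughout this section; the function names `variance` and `standard_deviation` are illustrative choices, not taken from the text.

```python
import math


def variance(values):
    """Mean of the squared deviations from the mean (divisor n)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


def standard_deviation(values):
    """Positive square root of the variance."""
    return math.sqrt(variance(values))


set_a = [7, 7, 7, 7, 7]
set_b = [4, 6, 6.5, 7.2, 11.3]
set_c = [-193, -46, 28, 69, 177]

for name, data in (("a", set_a), ("b", set_b), ("c", set_c)):
    print(name, round(standard_deviation(data), 1))
```

Python's standard library offers `statistics.pvariance` and `statistics.pstdev`, which also divide by $n$ and should agree with these functions.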
NOTE:
• The standard deviation gives an indication of the lowest and highest values of the data as
follows. In most distributions, the bulk of the distribution lies within two standard
deviations of the mean, i.e. within the interval $\bar{x} \pm 2s$ or $(\bar{x} - 2s, \bar{x} + 2s)$. This helps to give
an idea of the spread of the data.
• The units of standard deviation are the same as the units of the data.
• Standard deviations are useful when comparing sets of data; the higher the standard
deviation, the greater the variability in the data.
Example 1.22
Two machines, A and B, are used to pack biscuits. A random sample of ten packets was taken
from each machine and the mass of each packet was measured to the nearest gram and noted.
Find the standard deviation of the masses of the packets taken in the sample from each
machine. Comment on your answer.
Machine A (mass in g): 196, 198, 198, 199, 200, 200, 201, 201, 202, 205
Machine B (mass in g): 192, 194, 195, 198, 200, 201, 203, 204, 206, 207
Solution 1.22
Machine A: $\bar{x} = \frac{\sum x}{n} = \frac{2000}{10} = 200$
Machine B: $\bar{x} = \frac{\sum x}{n} = \frac{2000}{10} = 200$
Since the mean mass for each machine is 200, $x - \bar{x} = x - 200$.
To calculate $s$, put the data into a table and total the $(x - 200)^2$ column for each machine.
Machine A: $s^2 = \frac{\sum (x - 200)^2}{10} = 5.6$, so $s = \sqrt{5.6} = 2.37$ (2 d.p.)
Machine B: $s^2 = \frac{\sum (x - 200)^2}{10} = 24$, so $s = \sqrt{24} = 4.90$ (2 d.p.)
Machine A: s.d. = 2.37 g (2 d.p.) Machine B: s.d. = 4.90 g (2 d.p.)
Machine A has less variation, indicating that it is more reliable than machine B.
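The comparison of the two machines can be sketched in Python as follows. The masses listed are an assumption: they are a reading of the partly illegible data table, chosen to agree with the totals quoted in the solution ($\sum x = 2000$, $s^2 = 5.6$ and $s^2 = 24$). The helper name `mean_and_sd` is also illustrative.

```python
import math


def mean_and_sd(values):
    """Return (mean, standard deviation) using divisor n."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in values) / n)
    return mean, sd


# Assumed sample masses in grams (see the note above).
machine_a = [196, 198, 198, 199, 200, 200, 201, 201, 202, 205]
machine_b = [192, 194, 195, 198, 200, 201, 203, 204, 206, 207]

for name, masses in (("A", machine_a), ("B", machine_b)):
    mean, sd = mean_and_sd(masses)
    print(f"Machine {name}: mean = {mean:.0f} g, s.d. = {sd:.2f} g")
```

Both samples have the same mean, so the standard deviation is the only thing here that separates the machines.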
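Alternative form of the formula for standard deviation
The formula given above is sometimes difficult to use, especially when $\bar{x}$ is not an integer, so
an alternative form is often used. This is derived as follows:
$$
\begin{aligned}
s^2 &= \frac{1}{n}\sum (x - \bar{x})^2 \\
    &= \frac{1}{n}\sum (x^2 - 2\bar{x}x + \bar{x}^2) \\
    &= \frac{1}{n}\left(\sum x^2 - 2\bar{x}\sum x + \sum \bar{x}^2\right) \\
    &= \frac{\sum x^2}{n} - 2\bar{x}\,\frac{\sum x}{n} + \frac{n\bar{x}^2}{n} \\
    &= \frac{\sum x^2}{n} - 2\bar{x}(\bar{x}) + \bar{x}^2 \quad \text{since } \frac{\sum x}{n} = \bar{x} \\
    &= \frac{\sum x^2}{n} - \bar{x}^2
\end{aligned}
$$
Alternative format for standard deviation
$$s = \sqrt{\frac{\sum x^2}{n} - \bar{x}^2}, \quad \text{where } \bar{x} = \frac{\sum x}{n}$$
NOTE: It is useful to remember that $\frac{\sum x^2}{n} - \bar{x}^2$ can be thought of as
'the mean of the squares minus the square of the mean'.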
Example 1.23
The mean of the five numbers 2, 3, 5, 6, 8 is 4.8. Calculate the standard deviation.
Solution 1.23
Method 1 using $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}$
x     $x - \bar{x}$     $(x - \bar{x})^2$
2     -2.8      7.84
3     -1.8      3.24
5      0.2      0.04
6      1.2      1.44
8      3.2     10.24
$\sum (x - \bar{x})^2 = 22.80$
$s^2 = \frac{22.80}{5} = 4.56$
$s = \sqrt{4.56} = 2.14$ (2 d.p.)
Method 2 using $s = \sqrt{\frac{\sum x^2}{n} - \bar{x}^2}$
x     $x^2$
2      4
3      9
5     25
6     36
8     64
$\sum x^2 = 138$
$s^2 = \frac{138}{5} - (4.8)^2 = 4.56$
$s = \sqrt{4.56} = 2.14$ (2 d.p.)
The working for method 2 is less involved.
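The two methods of Example 1.23 can be compared numerically. This is a sketch with illustrative function names (`sd_from_deviations`, `sd_from_squares`), both assuming the divisor $n$.

```python
import math


def sd_from_deviations(values):
    """Method 1: root of the mean squared deviation from the mean."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / n)


def sd_from_squares(values):
    """Method 2: root of (mean of the squares minus square of the mean)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum(x * x for x in values) / n - mean ** 2)


numbers = [2, 3, 5, 6, 8]
print(round(sd_from_deviations(numbers), 2))
print(round(sd_from_squares(numbers), 2))
```

Method 2 needs less working by hand. In floating-point arithmetic, though, it subtracts two nearly equal quantities when the values are large compared with their spread, so Method 1 tends to be the safer choice in software.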
Using the calculator to find the standard deviation
The standard deviation can be found directly using the calculator in SD mode. The numbers
are entered in the same way as when you are finding the mean.
To find the standard deviation of the five numbers 2, 3, 5, 6, 8 used in Example 1.23:
[Calculator key sequences: set SD mode; clear memories; input data, entering each of 2, 3, 5, 6, 8 followed by the DATA key; obtain $s = 2.135\ldots$; clear SD mode. You can check $\bar{x} = 4.8$, $\sum x = 24$, $\sum x^2 = 138$ and $n = 5$ using the red letters on the third row of the calculator.]
When data are in the form of a frequency distribution, the formula for $s$ is
$$s = \sqrt{\frac{\sum f(x - \bar{x})^2}{\sum f}}$$
or in the alternative form
$$s = \sqrt{\frac{\sum fx^2}{\sum f} - \bar{x}^2}, \quad \text{where } \bar{x} \text{ is the mean.}$$
Consider again the data given in Example 1.19, on page 32, which shows the number of
children in 20 families. The mean is 2.9.
Number of children per family, x     1   2   3   4   5
Frequency, f                         3   4   8   2   3
You could use one of these three methods for finding the standard deviation. Method 2 is
more popular than Method 1.
Method 1 - using $s = \sqrt{\frac{\sum f(x - \bar{x})^2}{\sum f}}$
Totalling the $f$ and $f(x - 2.9)^2$ columns of the working table gives $\sum f = 20$ and $\sum f(x - 2.9)^2 = 29.80$.
$s^2 = \frac{\sum f(x - 2.9)^2}{\sum f} = \frac{29.80}{20} = 1.49$
$s = \sqrt{1.49} = 1.22$ (2 d.p.)
The standard deviation of the number of children per family is 1.22 (2 d.p.).
Method 2 - using $s = \sqrt{\frac{\sum fx^2}{\sum f} - \bar{x}^2}$
$\sum f = 20$, $\sum fx^2 = 198$
$s^2 = \frac{\sum fx^2}{\sum f} - (2.9)^2 = \frac{198}{20} - (2.9)^2 = 1.49$
$s = \sqrt{1.49} = 1.22$ (2 d.p.)
The standard deviation is 1.22 (2 d.p.), as before.
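Method 2 for a frequency distribution can be sketched in Python as below. The frequency table used is an assumption: it is one set of frequencies consistent with the totals quoted in the text ($\sum f = 20$, mean 2.9, $\sum fx^2 = 198$). The function name `frequency_mean_and_sd` is illustrative.

```python
import math


def frequency_mean_and_sd(freq_table):
    """freq_table maps each value x to its frequency f.

    Returns (mean, standard deviation) using
    s^2 = (sum of f*x^2) / (sum of f) - mean^2.
    """
    total_f = sum(freq_table.values())
    mean = sum(x * f for x, f in freq_table.items()) / total_f
    mean_of_squares = sum(f * x * x for x, f in freq_table.items()) / total_f
    return mean, math.sqrt(mean_of_squares - mean ** 2)


# Assumed table: number of children per family -> number of families.
children = {1: 3, 2: 4, 3: 8, 4: 2, 5: 3}

mean, sd = frequency_mean_and_sd(children)
print(round(mean, 1), round(sd, 2))
```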
Method 3 - using the calculator in SD mode.
This time you need to take account of the frequencies, and this is done in exactly the same
way as when finding the mean:
[Table of calculator key sequences for Casio and Sharp models: set SD mode; clear memories; input data, doing this in the order $x \times f$; read off the standard deviation using the red letters on the third row of the calculator; clear SD mode.]
Therefore the standard deviation is 1.22 (2 d.p.), as before.
In a grouped frequency distribution, the mid-interval value is taken as representative of the
interval, as in the following example.
Example 1.24
[Diagram: histogram of the times taken to complete the test, with frequency density on the vertical axis.]
An intelligence test was taken by 115 candidates. For each candidate the time taken to
complete the test was recorded, and the times were summarised in a histogram (see diagram).
Write down the frequency for each of the class intervals 0-1, 1-2, 2-3, 3-5 and 5-10
minutes.
Calculate estimates of the mean and standard deviation of the times taken to complete the
test. (C)
Solution 1.24
Frequency = frequency density × interval width. Note that the interval 2-3, for example,
represents 2 ≤ time < 3.
To calculate estimates for the mean and standard deviation, use mid-interval values, $x$.
$$\bar{x} = \frac{\sum fx}{\sum f} = \frac{437.5}{115} = 3.8 \text{ (2 s.f.)}$$
$$s = \sqrt{\frac{\sum fx^2}{\sum f} - \bar{x}^2} = \sqrt{\frac{2238.75}{115} - \left(\frac{437.5}{115}\right)^2} = 2.2 \text{ (2 s.f.)}$$
The mean time is 3.8 minutes and the standard deviation is 2.2 minutes.
[You could have calculated these directly using the calculator in SD mode. Check them
yourself.]
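The grouped-data estimate can be sketched in Python as follows. The histogram itself is not reproduced here, so the frequency densities below are an assumption: they are values chosen to be consistent with the totals in the solution ($\sum f = 115$, $\sum fx = 437.5$, $\sum fx^2 = 2238.75$). The function name `grouped_estimates` is illustrative.

```python
import math


def grouped_estimates(classes):
    """classes is a list of (lower, upper, frequency_density) tuples.

    Each frequency is density * interval width, and each class is
    represented by its mid-interval value.
    Returns (frequencies, estimated mean, estimated standard deviation).
    """
    freqs = [(upper - lower) * density for lower, upper, density in classes]
    mids = [(lower + upper) / 2 for lower, upper, _ in classes]
    total_f = sum(freqs)
    mean = sum(f * x for f, x in zip(freqs, mids)) / total_f
    mean_of_squares = sum(f * x * x for f, x in zip(freqs, mids)) / total_f
    return freqs, mean, math.sqrt(mean_of_squares - mean ** 2)


# Assumed frequency densities for the intervals 0-1, 1-2, 2-3, 3-5, 5-10.
test_times = [(0, 1, 10), (1, 2, 15), (2, 3, 25), (3, 5, 20), (5, 10, 5)]

freqs, mean, sd = grouped_estimates(test_times)
print(freqs, round(mean, 1), round(sd, 1))
```

These are only estimates, because every time in a class is treated as if it fell at the mid-interval value.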
If you are given summary information, rather than the raw data or frequency distribution, you
cannot use the calculator in SD mode. You will have to use the formulae to calculate the mean
and standard deviation, as in the following example.
Example 1.25
(a) Cartons of orange juice are advertised as containing 1 litre. A random sample of
100 cartons gave the following results for the volume, $x$.
$\sum x = 101.4$, $\sum x^2 = 102.83$
Calculate the mean and the standard deviation of the volume of orange juice in these
100 cartons.
(b) A machine is supposed to cut lengths of rod 50 cm long.
A sample of 20 rods gave the following results for the length, $x$.
$\sum fx = 997$, $\sum fx^2 = 49\,711$
(i) Calculate the mean length of the 20 rods.
(ii) Calculate the variance of the lengths of the 20 rods.
State the units of the variance in your answer.
Solution 1.25
(a) $\sum x = 101.4$, $\sum x^2 = 102.83$, $n = 100$
$\bar{x} = \frac{\sum x}{n} = \frac{101.4}{100} = 1.014$
The mean volume is 1.014 litres.
$s = \sqrt{\frac{\sum x^2}{n} - \bar{x}^2} = \sqrt{\frac{102.83}{100} - 1.014^2} = 0.0101\ldots$
The standard deviation of the volume is 0.010 litres (2 s.f.)
(b) $\sum fx = 997$, $\sum fx^2 = 49\,711$, $\sum f = 20$
(i) $\bar{x} = \frac{\sum fx}{\sum f} = \frac{997}{20} = 49.85$
The mean length of the rods is 49.85 cm.
(ii) Variance $= \frac{\sum fx^2}{\sum f} - \bar{x}^2 = \frac{49\,711}{20} - 49.85^2 = 0.5275$
The variance is 0.5275 cm$^2$.
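When only summary totals are available, the alternative form of the formula is all that is needed. A minimal sketch follows; the function name `from_summary` and its parameter names are illustrative, and the divisor $n$ is assumed as elsewhere in this section.

```python
import math


def from_summary(n, sum_x, sum_x2):
    """Return (mean, variance, standard deviation) from n, sum of x, sum of x^2."""
    mean = sum_x / n
    variance = sum_x2 / n - mean ** 2
    return mean, variance, math.sqrt(variance)


# (a) 100 cartons of orange juice, volumes in litres.
carton_mean, carton_var, carton_sd = from_summary(100, 101.4, 102.83)
print(round(carton_mean, 3), round(carton_sd, 3))

# (b) 20 rods, lengths in cm (variance is in cm^2).
rod_mean, rod_var, rod_sd = from_summary(20, 997, 49711)
print(round(rod_mean, 2), round(rod_var, 4))
```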
Exercise 1f Mean and standard deviation
1. Do not use the statistical program on your
calculator for this question.
(i) For each of the following sets of numbers,
calculate the mean and the standard
deviation. Try using both forms of the
formula for the standard deviation in parts
(a) to (c). In parts (d) to (f) choose one of
the methods.
(a) 2, 4, 5, 6, 8
(b) 6, 8, 9, 11
(c) 17, 14, 17, 23, 29
(d) 5, 13, 7, 9, 16, 15
(e) 4.6, 2.7, 3.1, 0.5, 6.2
(f) 200, 203, 206, 207, 209
(ii) Now check your answers using your
calculator in SD (STAT) mode.
2. The table shows the weekly wages in £ of each of
100 factory workers.
(a) Draw a histogram to illustrate this
information.
(b) Calculate the mean wage and the standard
deviation.
3. Do this question
(a) without using SD mode,
(b) using SD mode on your calculator.
The score for a round of golf for each of 50 club
members was noted. Find the mean score for a
round and the standard deviation.
Score, x     Frequency, f
66            2
67            5
68           10
69           12
70            9
71            6
72            4
73            2
4. The scores in an IQ test for 50 candidates are
shown in the table. Find the mean score and the
standard deviation.
Score        Frequency
100-106      [illegible]
107-113      13
114-120      24
121-127      11
128-134       4
5. The stemplot shows the times, recorded to the
nearest second, of 12 people in a race.
Calculate the mean time and the standard
deviation.
[Stemplot; legible rows (stem | leaf): 1 | 2 3, 1 | 5 5 6, 1 | 7 9 9, 2 | 0 1; key: 1 | 5 means 15 seconds]
6. A vertical line graph for a set of data is shown
below. Calculate the mean and standard
deviation of the data.
[Vertical line graph]
7. The following table shows the duration of
40 telephone calls from an office via the
switchboard.
(a) Obtain an estimate of the mean length of a
telephone call and the standard deviation.
(b) Illustrate the data graphically.
[Table: duration in minutes against number of calls; entries not fully legible] (O & C)
8. For a set of ten numbers $\sum x = 290$ and
$\sum x^2 = 8469$. Find the mean and the variance.
9. For a set of nine numbers $\sum (x - \bar{x})^2 = 234$. Find
the standard deviation of the numbers.
10. For a set of nine numbers $\sum (x - \bar{x})^2 = 60$ and
$\sum x^2 = 285$. Find the mean of the numbers.
11. A group of 20 people played a game. The table
below shows the frequency distribution of their
scores.
[Frequency table of scores; the legible score row reads 1, 7, 4, x and the frequency row is missing]
Given that the mean score is 5, find
(a) the value of x,
(b) the variance of the distribution.
(C Additional)
12. From the information given about each of the
following sets of data, work out the missing
values in the table:
[Table with columns $n$, $\sum x$, $\sum x^2$, $\bar{x}$, $s$ and rows (a) to (d); entries not legible]
13. At a bird observatory, migrating willow warblers
are caught, measured and ringed before being
released. The histogram below illustrates the
lengths, in millimetres, of the willow warblers
caught during one migration season.
[Histogram: frequency density against Length (mm)]
(a) Explain how the histogram shows that the
total number of willow warblers caught at
the observatory during the migration season
is 118.
(b) State briefly how it may be deduced from
the histogram (without any calculation) that
an estimate of the mean length is 111 mm.
Explain briefly why this value may not be
the true mean length of the willow warblers
caught.
(c) Given that the lengths, x mm, of the willow
warblers caught during this migration
season were such that $\sum x = 13\,099$ and
$\sum x^2 = 1\,455\,506$, calculate the standard
deviation of the lengths. (C)
14. For a particular set of observations
$\sum f = 20$, [remaining summary values illegible].
Find the values of the [illegible].
15. For a given frequency distribution
$\sum f(x - \bar{x})^2 = 182.3$, $\sum fx^2 = 1025$, $\sum f = 30$.
Find the mean of the distribution.
16. The speeds of cars passing a speed camera are shown in the histogram.
Calculate estimates of the mean speed and the standard deviation.
[Histogram: frequency density against Speed (m.p.h.)]
Calculations involving the mean and standard deviation
Example 1.26
(a) Calculate the mean and the standard deviation of the four numbers 2, 3, 6, 9.
(b) Two numbers, a and b, are to be added to this set of four numbers, such that the mean is
increased by 1 and the variance is increased by 2.5. Find a and b. (C Additional)