1. CHAPTER 1
Descriptive Statistics
Objectives:
1. To study the introductory concepts of statistics, including the
branches of statistics, the basic terms of statistics, and types of
variables.
2. To be able to use graphical and numerical methods to describe a
data set.
3. To be able to find the mean, median, mode and standard deviation
for grouped and ungrouped data.
www.itarosley@blogspot.com
4. CHAPTER 1
Descriptive Statistics
Grouped Data
- Measurement of Central Tendency: Mean, Median, Mode
- Measurement of Dispersion: Variance, Std Deviation
5. CHAPTER 1
Descriptive Statistics
Definition of basic terms
a) Population consists of all items or elements of interest for a
particular decision or investigation. E.g.: All married staff over
the age of 25 in UTHM.
b) Sample is a certain number of elements that have been chosen
from a population; that is, a subset of the population. E.g.: a list
of married staff over the age of 25 in the Registrar’s Office would
be a sample from the population of all married staff over the
age of 25 in UTHM.
c) Random sample is a sample drawn in such a way that each
element of the population has a chance of being selected.
d) Simple random sample implies that any particular sample of a
specified sample size has the same chance of being selected as
any other sample.
6. CHAPTER 1
Descriptive Statistics
e) Element / member is a specific subject or individual about which
the information is collected.
f) Variable is a characteristic of the individuals within the sample or
population.
g) Observation / measurement is the value of a variable for an
element.
h) Data set is a collection of values of one or more variables.
i) Ungrouped data set contains information on each member of a
sample or population.
j) Grouped data set is a collection of data which are grouped in
classes.
k) Raw data is data recorded in the sequence in which they are
collected and before they are processed or ranked.
7. CHAPTER 1
Descriptive Statistics
l) Population parameter is a descriptive measure computed from
population data.
m) Sample statistic is a descriptive measure computed from
sample data.
n) Outliers / Extreme Values are values that are very small or very
large relative to the majority of the values in a data set.
8. CHAPTER 1
Descriptive Statistics
Example
1. The following table gives the number of sales of A4 paper in 8
shops in Melaka.

Shop    Number of A4 Paper (in reams)
1       2000
2       2500
3       3000
4       5000
5       7000
6       5000
7       4000
8       5500

In this table, the shops are the elements or members, the number of A4
paper is the variable, and each value in the second column is an
observation or measurement.
9. CHAPTER 1
Descriptive Statistics
Measures of central tendency are statistical measures
which describe the position (centre) of a distribution.
They are also called statistics of location, and are the
complement of statistics of dispersion, which provide
information concerning the spread of the
observations.
In the univariate context, the mean, median and mode are
the most commonly used measures of central tendency.
10. CHAPTER 1
Descriptive Statistics
Mean
- The average of the data values
Median
- Middle value in a ranked list
- Data must be arranged in increasing or decreasing order.
- Defined for both ungrouped data and grouped data
Mode
- Value that occurs most frequently
13. CHAPTER 1
Descriptive Statistics
Median for Ungrouped Data
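Median, M:
\[
M =
\begin{cases}
x_{(n+1)/2}, & \text{when } n \text{ is odd}, \\[4pt]
\dfrac{x_{n/2} + x_{n/2+1}}{2}, & \text{when } n \text{ is even},
\end{cases}
\]
where x_k is the k-th value in the ranked list.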
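The odd/even rule for the ungrouped median can be sketched in Python as follows. The function name `median_ungrouped` is illustrative only (it does not come from the slides), and the scores are the examination scores used in this chapter's median example.

```python
def median_ungrouped(values):
    """Median of raw (ungrouped) data: rank the values, take the middle one(s)."""
    ranked = sorted(values)            # the data must be in increasing order
    n = len(ranked)
    if n == 0:
        raise ValueError("the median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        # n odd: the ((n + 1) / 2)-th ranked value, i.e. zero-based index n // 2
        return ranked[mid]
    # n even: average of the (n / 2)-th and (n / 2 + 1)-th ranked values
    return (ranked[mid - 1] + ranked[mid]) / 2


scores = [80, 56, 34, 67, 55, 91, 82, 47, 75, 31, 90]   # n = 11 (odd)
print(median_ungrouped(scores))         # 67
print(median_ungrouped([1, 3, 5, 7]))   # 4.0 (n even, hypothetical data)
```

Sorting first is what makes the index arithmetic valid; applying the same indices to unsorted data would not give the median.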
14. CHAPTER 1
Descriptive Statistics
Mode for Ungrouped Data
Find the frequency of each value in the data set.
• If no value occurs more than once, then the data set has no
mode.
• Otherwise, any value that occurs with the greatest frequency
is a mode of the data set.
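The two rules above translate almost directly into code. This is a minimal sketch; the function name `modes` is an assumption made for illustration, and apart from the pen prices (taken from the first mean exercise in this chapter) the data are hypothetical.

```python
from collections import Counter


def modes(values):
    """Return every mode of an ungrouped data set, or [] if there is no mode."""
    counts = Counter(values)             # frequency of each value
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:                     # no value occurs more than once
        return []
    # every value that attains the greatest frequency is a mode
    return sorted(v for v, c in counts.items() if c == highest)


prices = [2.00, 2.50, 3.00, 3.50, 2.50]
print(modes(prices))             # [2.5]
print(modes([1, 2, 3]))          # []  (no mode)
print(modes([4, 4, 7, 7, 9]))    # [4, 7]  (two modes)
```

Returning a list rather than a single number keeps the "any value with the greatest frequency" wording intact when several values tie.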
15. CHAPTER 1
Descriptive Statistics
Exercise
1. Find the mean for the price of pen (in RM) below:
2.00 2.50 3.00 3.50 2.50
2. A sample of six students in UTHM is selected and their heights are
measured, resulting in the following data:
150.2 cm
1.592 m
149.4 cm
152.7 cm
1.533 m
1.510 m
Find the sample mean.
3. Calculate the mean for the following data:
a) 14, 11, -10, 8, 8, -16
b) 23, 14, 6, -7, -2, 9, 16
16. CHAPTER 1
Descriptive Statistics
Example
1. Find the median of the following examination scores:
80, 56, 34, 67, 55, 91, 82, 47, 75, 31, 90
2. The following data represent the number of home runs hit by
all teams in the Indian League in 2004.
157 133 189 215 208 139 152 167 202 197 124 239 191 169
Find the median of this data set.
3. The data below represent the length (in seconds) of a random
sample of songs released in the 90’s.
198 255 287 207 176 224 215 208 241
Find the median of the data given.
20. CHAPTER 1
Descriptive Statistics
The variance of the n observations is
\[
s^2 = \frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n-1}
    = \frac{(y_1 - \bar{y})^2 + \dots + (y_n - \bar{y})^2}{n-1}
\]
The standard deviation s is the square root of the variance,
\[
s = \sqrt{s^2}
\]
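As a rough illustration of the sample variance and standard deviation, the sketch below applies the definition term by term. The function name `sample_variance` is illustrative; the six values are one of the small data sets used in this chapter's variance examples.

```python
import math


def sample_variance(ys):
    """Sample variance: sum of squared deviations from the mean, over n - 1."""
    n = len(ys)
    if n < 2:
        raise ValueError("the sample variance needs at least two observations")
    y_bar = sum(ys) / n
    return sum((y - y_bar) ** 2 for y in ys) / (n - 1)


data = [5, 2, 1, 7, 6, 9]
s2 = sample_variance(data)       # mean is 5; squared deviations sum to 46
s = math.sqrt(s2)                # standard deviation = square root of variance
print(s2)                        # 9.2
print(round(s, 3))               # 3.033
```

Python's standard library offers `statistics.variance` and `statistics.stdev` for the same quantities; the explicit version is shown only to mirror the formula.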
22. CHAPTER 1
Descriptive Statistics
Example:
Find the sample variance for the given data:
6.1   5.7   5.8   6.0   5.8   6.3
Find the variance and std deviation of the following data:
5   2   1   7   6   9
24. Organizing Data
Variable: a characteristic that varies from one person or thing to another
- Quantitative: a numerically valued variable
  - Discrete: a quantitative variable whose possible values can be listed
  - Continuous: a quantitative variable whose possible values form some interval of numbers
- Qualitative: a non-numerically valued variable
25. Organizing Data
Grouped frequency distribution
- Is obtained by giving classes or intervals together with the
number of data values in each class.
Cumulative frequency
- Is the frequency of a class that includes all values in a data set
that fall below the upper boundary of that class
Class midpoint or mark
- Is the number halfway between the lower and upper class limits
of a class:
\[
\text{class midpoint} = \frac{\text{lower limit} + \text{upper limit}}{2}
\]
Class width
- Upper boundary − lower boundary
26. Organizing Data
Example:
Given the data below:
Construct the frequency distribution table with class limits 42 – 45, 46 –
49, 50 – 53 and so on.
27. Organizing Data
Construct a frequency distribution table and find the class midpoint and class
width.
The ages of the employees in a company

Age        No. of Employees
20 – 29    30
30 – 39    35
40 – 49    20
50 – 59    10
60 – 69    5
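The class midpoint, class width and cumulative frequency definitions can be tried out on the employee-age table with a short Python sketch. The helper names are illustrative, and the sketch assumes ages are recorded in whole years, so each class boundary lies 0.5 beyond the corresponding class limit.

```python
from itertools import accumulate

# Class limits and frequencies from the employee-age table.
limits = [(20, 29), (30, 39), (40, 49), (50, 59), (60, 69)]
freqs = [30, 35, 20, 10, 5]

# Assumption: whole-year ages, so adjacent classes are 1 unit apart.
GAP = 1


def midpoint(lower, upper):
    """Class midpoint: halfway between the lower and upper class limits."""
    return (lower + upper) / 2


def boundaries(lower, upper, gap=GAP):
    """Class boundaries: each limit pushed out by half the gap between classes."""
    return lower - gap / 2, upper + gap / 2


def width(lower, upper, gap=GAP):
    """Class width: upper boundary minus lower boundary."""
    lo_b, up_b = boundaries(lower, upper, gap)
    return up_b - lo_b


cumulative = list(accumulate(freqs))     # running total of the frequencies

for (lo, up), f, cf in zip(limits, freqs, cumulative):
    print(f"{lo}-{up}: midpoint={midpoint(lo, up)}, "
          f"width={width(lo, up)}, f={f}, cumulative f={cf}")
```

Under these assumptions the first class has midpoint 24.5, boundaries 19.5 and 29.5, and width 10.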
28. The Ministry of Health Malaysia publishes data
on weights and heights by age and sex in Vital and Health Statistics.
The weights shown in Table 6a, given to the nearest tenth of a pound, were
obtained from a sample of 18–24-year-old males.
Construct a grouped data table for these weights. Use a class width of
20 and a first cutpoint of 120.
Table 6a: Weights of 37 males, aged 18-24 years
129.2  155.2  167.3  191.1  161.7  278.8  146.4  149.9
185.3  170.0  161.0  150.7  170.1  175.6  209.1  158.6
218.1  151.3  178.7  187.0  165.8  188.7  175.4  182.5
187.5  165.0  173.7  214.6  132.1  182.0  142.8  145.6
172.5  178.2  136.7  158.5  173.6
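Grouping by cutpoints can be sketched in Python as below. The function name `group_by_cutpoints` is an assumption for illustration, and the sketch assumes each class runs from its lower cutpoint up to, but not including, the next cutpoint, and that no value falls below the first cutpoint.

```python
weights = [
    129.2, 155.2, 167.3, 191.1, 161.7, 278.8, 146.4, 149.9,
    185.3, 170.0, 161.0, 150.7, 170.1, 175.6, 209.1, 158.6,
    218.1, 151.3, 178.7, 187.0, 165.8, 188.7, 175.4, 182.5,
    187.5, 165.0, 173.7, 214.6, 132.1, 182.0, 142.8, 145.6,
    172.5, 178.2, 136.7, 158.5, 173.6,
]


def group_by_cutpoints(values, first_cutpoint, width):
    """Count values per class [cutpoint, cutpoint + width), keyed by lower cutpoint."""
    counts = {}
    for v in values:
        k = int((v - first_cutpoint) // width)    # index of the class holding v
        lower = first_cutpoint + k * width        # lower cutpoint of that class
        counts[lower] = counts.get(lower, 0) + 1
    return counts


table = group_by_cutpoints(weights, first_cutpoint=120, width=20)

# Print every class from the first cutpoint up to the one holding the maximum,
# including empty classes.
for lower in range(120, max(table) + 20, 20):
    print(f"{lower} - under {lower + 20}: {table.get(lower, 0)}")
```

Empty classes are printed with frequency 0 so that a gap between the bulk of the data and an extreme value stays visible in the table.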
29. Grouped Data
Sample Mean
• The sample mean of grouped data is:
\[
\bar{x} = \frac{\sum_{i=1}^{n} f_i x_i}{\sum_{i=1}^{n} f_i}
\]
30. Grouped Data
The following data show the number of mistakes that
Redza made when he typed 100 pages. Find the mean.

No. of mistake/s    0    1    2    3    4    5
No. of pages        60   21   10   5    3    1
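A minimal Python sketch of the grouped mean, using the typing-mistakes table. The function name `grouped_mean` is illustrative; the values and frequencies are read from the table, with each number of mistakes playing the role of x_i and the number of pages the role of f_i.

```python
def grouped_mean(xs, fs):
    """Mean of grouped data: sum of f_i * x_i divided by the sum of f_i."""
    total_f = sum(fs)
    if total_f == 0:
        raise ValueError("the frequencies must not all be zero")
    return sum(f * x for x, f in zip(xs, fs)) / total_f


mistakes = [0, 1, 2, 3, 4, 5]        # x_i
pages = [60, 21, 10, 5, 3, 1]        # f_i, totalling 100 pages
print(grouped_mean(mistakes, pages))   # 0.73
```

For a table with class intervals instead of single values, the same function applies with the class midpoints passed as `xs`.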
31. Grouped Data
Find the mean for the data below that refers to the
number of bicycles owned by 27 families at Taman
Permata.
No. of bicycles    0    1    2    3    4
No. of families    2    6    13   4    2
32. Mean , M
Mean is the average of the data values.
Ungrouped : The sample mean for raw data: let x1,
x2, ..., xn be a sample of size n.
Grouped : The sample mean for grouped data:
suppose we have a sample of size n grouped into m
groups or cells.
33. Mean , M
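Mean of sample data is
a) Ungrouped data
\[
\bar{x} = \frac{\sum x_i}{n}
\]
b) Grouped data
\[
\bar{x} = \frac{\sum f_i x_i}{\sum f_i}
\]
Mean of population data is
a) Ungrouped data
\[
\mu = \frac{\sum x_i}{N}
\]
b) Grouped data
\[
\mu = \frac{\sum f_i x_i}{\sum f_i}
\]
where
xi = class midpoint / mark = (lower limit + upper limit) / 2
fi = frequency of xi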
34. Median, M
Median is the middle value in a ranked list.
The data must be arranged in increasing or decreasing order. There are two types
of median: median for ungrouped data and median for grouped data.
Ungrouped data:
a) when n is odd: the median is the value of the ((n + 1)/2)th term in the ranked
list.
b) when n is even: the median is the average of the values of the two middle
terms.
Median of sample data is
a) Ungrouped data
\[
M =
\begin{cases}
x_{(n+1)/2}, & n \text{ odd}, \\[4pt]
\dfrac{x_{n/2} + x_{n/2+1}}{2}, & n \text{ even}
\end{cases}
\]
b) Grouped data
\[
M = L_M + \left( \frac{\frac{n}{2} - F}{f_m} \right) C
\]
where
LM = lower boundary of the median class, C = class size / width,
F = cumulative frequency of the classes below the median class,
fm = frequency of the median class,
n = number of data values
35. Median
• The median for grouped data is:
\[
M = L_M + C \left( \frac{\frac{n}{2} - F}{f_m} \right)
\]
36. A study of sulphur oxide production within 80
days produced the distribution of the following
table. Find the median.
Sulphur oxide (tonne)    Frequency
5.0 – 8.9                3
9.0 – 12.9               10
13.0 – 16.9              14
17.0 – 20.9              25
21.0 – 24.9              17
25.0 – 28.9              9
29.0 – 32.9              2
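The grouped-median formula M = L_M + C(n/2 − F)/f_m can be sketched in Python and applied to the sulphur oxide table. The function name `grouped_median` is illustrative. The sketch assumes equal class widths and that, because the class limits are given to one decimal place, each lower boundary lies 0.05 below the lower limit.

```python
def grouped_median(lower_boundaries, width, freqs):
    """Median of grouped data: L_M + ((n / 2 - F) / f_m) * C."""
    n = sum(freqs)
    if n == 0:
        raise ValueError("the frequencies must not all be zero")
    cumulative = 0                        # F: frequency of the classes passed so far
    for lower, f in zip(lower_boundaries, freqs):
        if f > 0 and cumulative + f >= n / 2:
            # this is the median class: interpolate inside it
            return lower + (n / 2 - cumulative) / f * width
        cumulative += f
    raise ValueError("median class not found")


# Assumed lower class boundaries (lower limit - 0.05) and the class width C.
lower_boundaries = [4.95, 8.95, 12.95, 16.95, 20.95, 24.95, 28.95]
freqs = [3, 10, 14, 25, 17, 9, 2]         # n = 80
C = 4.0

print(round(grouped_median(lower_boundaries, C, freqs), 2))   # 19.03
```

Here n/2 = 40 falls in the class 17.0 – 20.9 (cumulative frequency 27 before it, 52 after it), so L_M = 16.95, F = 27 and f_m = 25.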
37. Find the median for the data below that shows
the number of visits to the library made by all
the 100 international students in one year.
Number of visits    No. of students
0 – 4               17
5 – 9               41
10 – 14             22
15 – 19             11
20 – 24             8
25 – 29             1
38. Mode is the value that occurs most frequently (highest frequency in a
data set)
Mode for grouped data:
\[
M_o = L_M + \left( \frac{d_b}{d_b + d_a} \right) C
\]
Note (grouped data):
1) Data with two modes are known as bimodal, and data with more than two modes are multimodal.
40. A Global Warming Awareness Exhibition was held by a state
government. The table below records the number of visitors
who visited the exhibition and the number of days having
those numbers of visitors. Find the mode of number of
visitors.

Number of visitors    Number of days
0 – 99                10
100 – 199             23
200 – 299             167
300 – 399             224
400 – 499             211
500 – 599             107
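A hedged Python sketch of the grouped-mode formula Mo = L + (d_b / (d_b + d_a)) C, applied to the exhibition data. The slides do not define d_b and d_a, so the sketch assumes the common textbook reading: d_b is the modal class frequency minus the frequency of the class before it, and d_a is the modal class frequency minus the frequency of the class after it. It also assumes whole-number visitor counts, so each lower boundary lies 0.5 below the lower limit. The function name `grouped_mode` is illustrative.

```python
def grouped_mode(lower_boundaries, width, freqs):
    """Mode of grouped data: L + d_b / (d_b + d_a) * C for the modal class."""
    i = max(range(len(freqs)), key=lambda k: freqs[k])    # first modal class
    before = freqs[i - 1] if i > 0 else 0                 # class before (0 if none)
    after = freqs[i + 1] if i < len(freqs) - 1 else 0     # class after (0 if none)
    d_b = freqs[i] - before
    d_a = freqs[i] - after
    if d_b + d_a == 0:
        raise ValueError("mode is not defined by this formula for flat frequencies")
    return lower_boundaries[i] + d_b / (d_b + d_a) * width


# Assumed lower class boundaries (lower limit - 0.5) and the class width C.
lower_boundaries = [-0.5, 99.5, 199.5, 299.5, 399.5, 499.5]
days = [10, 23, 167, 224, 211, 107]
C = 100

print(round(grouped_mode(lower_boundaries, C, days), 2))   # 380.93
```

With these assumptions the modal class is 300 – 399, d_b = 224 − 167 = 57 and d_a = 224 − 211 = 13. If several classes tie for the highest frequency, this sketch reports only the first one.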
41. Find the mean, median and mode for the following data:
Age        Number of people
17 – 21    2
22 – 26    3
27 – 31    5
32 – 36    6
37 – 41    8
42 – 46    7
47 – 51    2
52 – 56    3
42. Sample Variance
for Grouped Data
The formula for the sample variance for
grouped data is:
\[
S^2 = \frac{1}{\sum f - 1}
      \left[ \sum f_i x_i^2 - \frac{\left( \sum f_i x_i \right)^2}{\sum f} \right]
\]
where f is the class frequency and x is the class midpoint.
43. Find the variance and std deviation
Class        2    3    4    5    6    7
Frequency    6    10   15   8    3    10
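The grouped sample variance S² = [Σ f x² − (Σ f x)² / Σ f] / (Σ f − 1) can be sketched in Python using the class/frequency table above. The function name `grouped_sample_variance` is illustrative, and the sketch assumes each class value in the table is used directly as x (for interval classes, the class midpoints would be used instead).

```python
import math


def grouped_sample_variance(xs, fs):
    """Sample variance of grouped data via the computational formula."""
    n = sum(fs)                                       # sum of f
    if n < 2:
        raise ValueError("need a total frequency of at least 2")
    sum_fx = sum(f * x for x, f in zip(xs, fs))       # sum of f * x
    sum_fx2 = sum(f * x * x for x, f in zip(xs, fs))  # sum of f * x^2
    return (sum_fx2 - sum_fx ** 2 / n) / (n - 1)


xs = [2, 3, 4, 5, 6, 7]
fs = [6, 10, 15, 8, 3, 10]

s2 = grouped_sample_variance(xs, fs)
print(round(s2, 3))               # 2.641
print(round(math.sqrt(s2), 3))    # 1.625
```

For this table Σ f = 52, Σ f x = 230 and Σ f x² = 1152, which gives the values shown.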
44. Find the variance and std deviation
xi    3.0 – 3.4    3.4 – 3.8    3.8 – 4.2    4.2 – 4.6    4.6 – 5.0
fi    4            8            11           9            6
45. Population variance, σ²
The formula for the population variance is:
\[
\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}
         = \frac{N \sum_{i=1}^{N} x_i^2 - \left( \sum_{i=1}^{N} x_i \right)^2}{N^2}
\]
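The two forms of the population variance, the definition Σ(x − μ)²/N and the computational shortcut [N Σx² − (Σx)²]/N², can be checked against each other with a short Python sketch. The function names are illustrative, and the six values are treated as a complete population purely for the sake of the example.

```python
def population_variance(xs):
    """Definition: mean squared deviation from the population mean."""
    N = len(xs)
    if N == 0:
        raise ValueError("the population must not be empty")
    mu = sum(xs) / N
    return sum((x - mu) ** 2 for x in xs) / N


def population_variance_shortcut(xs):
    """Computational form: (N * sum(x^2) - (sum x)^2) / N^2."""
    N = len(xs)
    if N == 0:
        raise ValueError("the population must not be empty")
    return (N * sum(x * x for x in xs) - sum(xs) ** 2) / N ** 2


population = [5, 2, 1, 7, 6, 9]      # treated as a whole population (assumption)
print(round(population_variance(population), 4))            # 7.6667
print(round(population_variance_shortcut(population), 4))   # 7.6667
```

Note the divisor N here, in contrast to n − 1 for the sample variance.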
46. Given the data below:
23.3  12.4  58.1  38.2  14.0  58.2  75.4  23.9  23.9  18.3
22.0  37.1  31.4   8.5   1.0  15.5   6.9   5.2  28.7  26.3
13.9  25.9  26.8  26.9  16.8  37.7  10.6  21.9  31.6  30.1
42.4  16.5  21.1  32.9   8.8  10.6  28.6  40.7  12.9  13.8
a) Construct the frequency distribution table with class boundaries -0.5 –
9.5, 9.5 – 19.5, 19.5 – 29.5, and so on.
b) Find
i) Mean
ii) Median
iii) Mode
iv) Standard deviation
47. Find the mean, median, mode and standard deviation.

Class limit    f
20 – 29        30
30 – 39        35
40 – 49        20
50 – 59        10
60 – 69        5