ED 104 (Statistics with Computer Applications)
University of Antique
Sibalom, Antique - 15 February 2021
Chapter 1- The Nature of Statistics
Statistics is used in almost all fields of human endeavor. In sports, a statistician may keep
records of the number of free throws each player makes in a basketball game, or the number of hits a
baseball player gets in a league. In other areas, such as public health, an administrator might
be concerned with the number of residents who contract a new strain of flu virus during a
certain year, just as the Department of Health is doing right now. In
education, a researcher might want to know whether new methods of teaching are better than the
traditional ones.
Lesson 1. Descriptive and Inferential Statistics
Statistics is a set of ideas and techniques that enable the user to collect data efficiently. It
is also an aid in decision making, such as controlling manufacturing processes and
measuring the success of those processes. It is used as well to calculate premiums on insurance
policies, identify criminals, find statistical relationships between two or more
variables, formulate economic policy, and make decisions about trading stocks and
bonds. In short, statistics is an essential science: no matter what your career is, an
understanding of statistical methods will help you make decisions effectively.
To further develop our understanding of statistics, at the end of the lesson you should be
able to:
1. Define statistics
2. Differentiate the types of statistics
Statistics is the science of conducting studies to collect, organize, summarize, analyze,
and draw conclusions from data (Bluman, 2012).
PA 501 - DR. MAGBANUA 1
Data can be used in different ways. Statistics is divided into two main areas
depending on how data are used: Descriptive Statistics and Inferential Statistics.
Descriptive statistics consists of the collection, organization, summarization, and
presentation of data. In descriptive statistics, the statistician tries to describe a situation by using
Measures of Central Tendency (mean, median, mode); Measures of Variability (range, MAD,
SD, coefficient of variation); and Measures of Position (quartiles, quintiles, deciles, percentiles).
For example, taking a census of the population, organizing the data gathered using
graphs, and summarizing what the data show about that population (Bluman, 2012; Jabilles et al., 2018).
Examples of Descriptive Statistics
1. The weekly mean sales of TV sets in a certain store
2. Alcohol is the most frequently used disinfectant against COVID-19.
3. A discount of at least 5% is deducted from online sales.
4. Rice importation last year was double the rice importation of two
years ago.
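The descriptive measures named above can be computed with Python's standard `statistics` module. The following is only a minimal sketch: the sales figures are hypothetical, and "MAD" is read here as the mean absolute deviation, which is an assumption since the module does not spell the abbreviation out.

```python
import statistics

# Hypothetical weekly TV-set sales for one store (illustrative values only).
weekly_sales = [12, 15, 11, 15, 18, 14, 15, 20]

def describe(values):
    """Return common descriptive measures for a list of numbers."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)                    # sample standard deviation
    q1, q2, q3 = statistics.quantiles(values, n=4)   # quartiles (position)
    return {
        # Measures of central tendency
        "mean": mean,
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        # Measures of variability
        "range": max(values) - min(values),
        "mad": sum(abs(x - mean) for x in values) / len(values),  # assumed: mean absolute deviation
        "sd": sd,
        "cv_percent": 100 * sd / mean,               # coefficient of variation
        # Measures of position
        "quartiles": (q1, q2, q3),
    }

summary = describe(weekly_sales)
for name, value in summary.items():
    print(name, value)
```

Each entry only summarizes the eight values that were actually collected; nothing is inferred about weeks that were not observed, which is what keeps this within descriptive statistics.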
Inferential statistics consists of generalizing from samples to populations, performing
estimations and hypothesis tests, determining relationships among variables, and making
predictions. In this area, the statistician tries to infer the characteristics of the population from the
sample. For example, predicting the life span of a newly released automobile model
compared with that of last year's model. The prediction depends on the descriptive tools
used.
Examples of Inferential Statistics
1. Salary predicts the life satisfaction of businessmen in Antique.
2. Productivity of crops is a factor in determining the choice of students to go into
farming.
3. Awareness of COVID-19 symptoms is directly related to resiliency of the residents
living in Cebu City.
4. Number of received calls predicts the number of orders in a flower shop.
Exercise 01. Classify whether each given situation belongs to the area of descriptive
statistics or inferential statistics.
1. The overall average of a student's grades in his seven subjects.
2. Effect of music in the production of chickens.
3. To find the more effective method between the lecture type and cooperative learning.
4. The median sale for the month of January.
5. At least 5% of the crops were destroyed by insects.
6. Assuming that 10% of the crops were destroyed by rain this rainy season, we would
expect an increase in the prices of rice by the end of the year.
7. Fifty percent of patients reported side effects from the use of a certain drug.
8. The chance that a person will be robbed in a certain city is 15%.
9. The average number of students in a class at the University of Antique is 45.
10. A recent study showed that eating garlic can lower blood pressure.
Lesson 2. Variables and Types of Data
Data are everywhere, and large data sets call for techniques to collect and
analyze them. Today, several computer applications can be used to generate information
from the data collected. Collecting useful data, of course, is not easy, and how to do so is one of the major
things that we will be learning in this module.
In this lesson, you will learn to:
1. Identify the types of variables as to their nature and measure
2. Categorize and classify each variable collected as to its nature and measure
3. Identify the level of measurement of variables
Statisticians collect information for variables that describe the situation. A variable is a
characteristic or attribute that can assume different values. Examples include age,
municipality, sex, nationality, and so on, which can be found mostly in forms we fill in.
Data are values (measurements or observations) that the variable can assume. Say for
age you fill in 25, and for municipality, Sibalom. Both 25 and Sibalom are called data.
A collection of data values forms a data set. Each value in the data set is called a data
value or a datum.
Kinds of Variables
There are basically two kinds of variables: QUALITATIVE and QUANTITATIVE
variables.
Qualitative variables result in information that describes or categorizes an element of a
population. For example, a sample of four hair-salon customers was surveyed for their “hair
color”, “hometown”, and “level of satisfaction” with the results of their salon treatment.
Arithmetic operations are not meaningful for data that result from a qualitative variable.
Quantitative variables result in information that quantifies an element of a population.
For example, the “total cost” of textbooks purchased by each student. Other examples are weight,
height, body temperature, time, and many more. Arithmetic operations are meaningful for
data that result from a quantitative variable.
Types of Quantitative Variables as to Measure
Quantitative variables can be further categorized as to their measure: discrete and
continuous.
Discrete variables can be assigned values such as 0, 1, 2, 3, and are said to be
countable (taken as a whole). Examples are the number of flowering pots in a garden, the
number of students in a classroom, and the number of calls received.
Continuous variables, by comparison, can assume an infinite number of values
in an interval between any two specific values (taken as part of a whole). An example is
temperature, since the variable can assume an infinite number of values between any two
given temperatures. The classification of variables can be summarized as follows: variables
are either qualitative or quantitative, and quantitative variables are either discrete or
continuous.
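To make the distinction concrete, here is a small Python sketch, assuming a hypothetical survey of four hair-salon customers; the field names and values are invented for illustration. It shows that a qualitative variable can only be counted by category, while quantitative variables support arithmetic.

```python
from collections import Counter
from statistics import mean

# Hypothetical survey of four hair-salon customers (values are illustrative).
customers = [
    {"hair_color": "black", "visits": 3, "height_cm": 160.5},
    {"hair_color": "brown", "visits": 1, "height_cm": 172.0},
    {"hair_color": "black", "visits": 2, "height_cm": 158.25},
    {"hair_color": "blonde", "visits": 5, "height_cm": 165.75},
]

# Qualitative variable: categories can be counted, but not averaged.
color_counts = Counter(c["hair_color"] for c in customers)

# Quantitative and discrete: whole-number counts.
total_visits = sum(c["visits"] for c in customers)

# Quantitative and continuous: any value within an interval is possible.
average_height = mean(c["height_cm"] for c in customers)

print(color_counts)
print(total_visits, average_height)
```

Asking for the "average hair color" has no meaning, which is why the qualitative column is summarized with counts only.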
Levels of Measurement
In addition, qualitative and quantitative variables can be further classified using four
measurement scales: nominal, ordinal, interval, and ratio.
The first level of measurement is called the nominal level of measurement. This
classifies data into categories that have no logical order and no particular relationship.
This is also the most basic or lowest level of measurement, in which the numerals assigned to
each category stand for the name of the category but have no implied order or value.
Examples are college instructors classified according to subject taught (English,
History, Math, Psychology, or Filipino), people classified as to sex (male or female), and residents
classified according to zip codes. Even though numbers are assigned as zip codes, there is no
meaningful order or ranking.
The next level of measurement is called the ordinal level. Data measured at this level can be
placed into categories that can be ordered, or ranked.
For example, from student evaluations, faculty members might be ranked according to their
performance as excellent, very satisfactory, satisfactory, fair, or poor. Note that precise
measurement of differences does not exist at this level. For instance, when people are
classified according to their build (small, medium, or large), a large variation exists among
the individuals in each class.
The third level of measurement is called the interval level. This level ranks data, and
precise differences between units of measure exist; however, there is no meaningful zero.
For example, standardized psychological tests yield values measured on an interval scale,
such as IQ. There is a meaningful difference of 1 point between an IQ of 109 and an IQ of
110. Temperature is another example of interval measurement. One property is lacking in the
interval scale: there is no true zero. For example, IQ tests do not measure people who have no
intelligence. Likewise, a temperature of 0°C does not mean no heat at all.
The highest level of measurement is called the ratio level. This scale possesses all the
characteristics of interval measurement, and there exists a true zero.
Examples are the scales used to measure height, weight, area, and number of phone calls
received. On a ratio scale, true ratios exist: when one person can lift 100 kilograms and another
can lift 50 kilograms, the ratio between them is 2 to 1, that is, the first person can lift twice as
much as the second person.
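The practical difference between the interval and ratio levels can be checked with a few lines of arithmetic. The sketch below is illustrative only; the Kelvin conversion is not part of this module and is brought in as an outside fact (the Kelvin scale has a true zero), and the two temperatures are arbitrary.

```python
def celsius_to_kelvin(celsius):
    """Convert Celsius to Kelvin, a temperature scale with a true zero."""
    return celsius + 273.15

# Interval scale (Celsius): differences are meaningful, ratios are not.
warm_c, cool_c = 20.0, 10.0
difference_c = warm_c - cool_c          # a 10-degree difference is meaningful
naive_ratio = warm_c / cool_c           # 2.0, but 20 C is NOT "twice as hot"
kelvin_ratio = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cool_c)

# Ratio scale (weight lifted): a true zero makes ratios meaningful.
lift_a_kg, lift_b_kg = 100, 50
lift_ratio = lift_a_kg / lift_b_kg      # the first person lifts twice as much

print(naive_ratio, round(kelvin_ratio, 4), lift_ratio)
```

The Celsius ratio changes once the same temperatures are expressed on a scale with a true zero, whereas the lifting ratio stays 2 to 1 in any unit of weight.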
The following table gives the summary for the types of measurements (Lopez, 2002):

Question | Nominal | Ordinal | Interval | Ratio
What does the scale indicate? | Quality | Relative quantity | Quantity | Quantity
Is there an equal unit of measurement? | No | No | Yes | Yes
Is there a true zero? | No | No | No | Yes
How might the scale be used in research? | To identify males and females as 1 and 2 | To judge who is 1st, 2nd, etc., in aggressiveness | To convey the results of intelligence and personality tests | To state the number of correct answers on a test
Guide to determine the type of scale | Identifying name or membership; can be classified and then counted without order | Connotes rank, degree, inequalities | Measuring actual amount by using a specific unit or scale of measurement | Includes all actual measurements
Exercises 02. Categorize each of the following data collected by filling in the table that
follows.

Variables/Data collected | Is the variable qualitative or quantitative? | If quantitative, is it discrete or continuous? | Level of measurement of the variable
1. Zip codes | | |
2. Student evaluation (excellent, very good, good, poor) | | |
3. Salaries of cashiers in a grocery store | | |
4. Pages of books in the library | | |
5. Major field of specialization | | |
6. Diagnosis | | |
7. Income category | | |
8. Street number | | |
9. Enlisted military rank in a certain compound | | |
10. Species of grass that grow in the yard | | |
Lesson 3. Data Collection and Sampling Techniques
One of the most important parts of research work is how data are collected and
how samples are drawn from the population so that inferences can be made from them. In this lesson, you will
learn to:
1. Define different terms and identify methods of collecting data
2. Categorize data according to their source
3. Identify different sampling techniques and how to use them
Methods of Collecting Data
Data can be collected in a variety of ways. One of the most common methods is through
the use of surveys.
1. Survey. A survey is an investigation of one or more characteristics of a population. Most often,
surveys are done by asking questions, either through self-administered questionnaires or
personal interviews. Nowadays, with the advent of technology, online surveys have become
prevalent.
A census is a kind of survey that gathers facts of interest or pertinent data on every unit
of the population. A sample survey is another method, wherein data from a small but
representative cross-section of the population are scientifically collected and analyzed.
2. Observation. Observation makes possible the recording of behaviour, but only at the
time of occurrence. It is employed when the subjects cannot talk or write. In doing
observation, the researchers make use of their senses and observe the condition in its natural
state rather than communicating with their respondents.
3. Existing Records. Data are taken from published materials such as reports, personal files, and
historical records.
4. Simulation. A simulation is the use of a mathematical or physical model to reproduce
the conditions of a situation or process. Simulations allow you to study situations that are
impractical or even dangerous to create in real life.
5. Experiment. In performing an experiment, a treatment is applied to part of a
population and responses are observed. Data are obtained under controlled conditions.
Classification of Data as to Source
Based on how the data were collected, they can be categorized as either primary or secondary
data.
Primary data are data that are collected directly from the subjects or objects of the study.
Surveys using questionnaires or interviews are examples of primary data.
Primary data may be gathered by direct observation or measurement; by interview, using a set of questions
(questionnaires) or rating scales as guides in collecting objective and measurable data;
by reporting forms sent via mail, courier services, e-mail, and fax to reach
distant data providers; by experimentation to find out the cause and effect of a certain
phenomenon; and by registration, such as the registry of births, deaths, and marriages.
Data from records, computer files, microfilms, or the internet are examples of
secondary data. Secondary data are data that were previously collected and are found in
publications of both government and non-government institutions, research papers, books,
periodicals, pamphlets, and so on.
Sampling
In statistics, it is also important to distinguish between a sample and a population. A
population consists of all subjects (human or otherwise) that are being studied. Most of the
time, because of expense, time, size of the population, medical concerns, and other reasons, it is not
possible to use the entire population for a statistical study; therefore, researchers use samples.
Examples of populations are the scores of all students at the secondary level, or all children of any age who
have older or younger siblings.
A sample is a group of subjects selected from a population. If the subjects of a sample
are properly selected, most of the time they should possess the same or similar characteristics
as the subjects in the population, a point taken up in the next module.
Examples of samples are the scores of the students in one class, or the 40 children who actually participated
in one specific study about siblings.
A parameter is a numerical description of a population.
A statistic is a numerical description of a sample.
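The difference between a parameter and a statistic can be illustrated with a short Python sketch. Everything here is an assumption made for illustration: the population is synthetic (randomly generated exam scores), and the seed and sizes are arbitrary.

```python
import random
from statistics import mean

random.seed(104)  # fixed seed so the sketch is repeatable

# Hypothetical population: exam scores of 1,000 students (synthetic values).
population = [random.randint(50, 100) for _ in range(1000)]

# A sample of 40 subjects drawn at random, without replacement.
sample = random.sample(population, 40)

parameter = mean(population)   # numerical description of the population
statistic = mean(sample)       # numerical description of the sample

print(f"population mean (parameter): {parameter:.2f}")
print(f"sample mean (statistic):     {statistic:.2f}")
```

In a real study the parameter is usually unknown, and the statistic computed from a properly selected sample is used to estimate it.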
Determining the Sample Size
There is no absolute formula for determining the sample size. However, there are
guidelines for identifying the appropriate sample size, depending on the known information
about the population under investigation and the type of research design used. Here are some of
the guidelines.
A. Slovin’s Formula. This formula is primarily used in descriptive studies where the
population size is known and the margin of error is pre-identified. This is usually done when
using the Stratified Sampling Technique. The formula is

$$n = \frac{N}{1 + Ne^{2}}$$

where: n = sample size
N = population size
e = desired margin of error (percent allowance for non-precision because of
the use of the sample instead of the population)
B. According to Gay and Mills (2016), the larger the population size, the smaller the
percentage of the population required to get a representative sample.
• For smaller populations, say, N=100 or fewer, there is little point in sampling; survey
the entire population.
• If the population size is around 500 (give or take 100), 50%
should be sampled.
• If the population size is around 1,500, 20% should be sampled.
• Beyond a certain point (about N=5,000), the population size is almost irrelevant and a
sample size of 400 will be adequate.
C. According to Fraenkel and Wallen (2011), the guidelines in selecting a sample size are
as follows:
• For descriptive studies, 100 would be enough;
• For correlational studies, 50 samples can establish a relationship if there is any;
• For experimental or causal-comparative studies, 30 per group is advised, but 15 per group is
allowed if the variables are tightly controlled.
Sampling Techniques
In collecting data and information about a particular variable from a large population,
most researchers use a sample, since sampling saves time and money and in some cases
enables the researcher to get more detailed information about a particular subject. However,
samples cannot be selected in haphazard ways because the information obtained might
be biased.
In drawing samples, there are two major sampling designs researchers can choose from:
the random (probability) and non-random (non-probability) sampling designs.
In a random (probability) sampling design, samples are selected by using chance
methods. Each element is drawn on the basis of probability, that is, it has a certain
probability of being selected as part of the sample.
A non-random (non-probability) sampling design, on the other hand, is a design
wherein samples are selected on the basis of being judged as “typical”. The person selecting
the sample chooses items that he or she thinks are representative of the population.
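Before a sampling technique is applied, the sample size itself has to be fixed. Slovin's formula from the sample-size guidelines earlier in this lesson, n = N / (1 + Ne²), can be evaluated with a short Python function. This is a minimal sketch: the function name is hypothetical, and rounding the result up to the next whole subject is an assumption, since the module does not state a rounding rule.

```python
import math

def slovin_sample_size(population_size, margin_of_error):
    """Return n = N / (1 + N * e**2), rounded up to a whole subject.

    margin_of_error is given as a proportion, e.g. 0.05 for 5%.
    Rounding up is an assumption; the formula itself gives a decimal.
    """
    if population_size <= 0:
        raise ValueError("population_size must be positive")
    if not 0 < margin_of_error < 1:
        raise ValueError("margin_of_error must be a proportion between 0 and 1")
    n = population_size / (1 + population_size * margin_of_error ** 2)
    return math.ceil(n)

# Example: a population of 1,000 with a 5% margin of error.
n = slovin_sample_size(1000, 0.05)
print(n)
```

For N = 1000 and e = 0.05, the formula gives 1000 / 3.5, about 285.7, so 286 respondents after rounding up.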
Random Sampling (Probability) Techniques
There are different ways to select samples in each of these designs. As mentioned
earlier, to avoid bias one should use random sampling techniques, which include simple
random, systematic, stratified, and cluster sampling techniques.
1. Simple Random Sampling (single-stage sampling). A sample in this technique is selected
using a table of random numbers or the fish-bowl (lottery) method.
• Fish-bowl or lottery method. This sampling technique can be done by numbering the
subjects, placing the numbered cards in a bowl, mixing them thoroughly, and selecting as
many cards as needed.
• Table of random numbers. This can be done by using a table of random
numbers generated with a computer. To select a random sample of, say, 15 subjects out of 85 subjects,
it is necessary to number each subject from 01 to 85. Then, select a starting number by
closing your eyes and placing your finger on a number in the table; say your finger
landed on 12 in the second column. Next, proceed downward until you have selected 15
different numbers between 01 and 85. When you reach the bottom of the column, go to
the top of the next column. If you select a number greater than 85, the number 00, or a
duplicate number, just omit it.
2. Stratified Random Sampling is a process in which certain subgroups, or strata, are
selected for the sample in the same proportion as they exist in the population. Samples
within strata should be randomly selected. The individuals within each stratum should be
homogeneous (or similar) in some way (Blay, 2013). Two ways of using stratified random
sampling are as follows.
• Proportional Allocation can be done by choosing sample sizes proportional to the sizes
of the different subgroups or strata (characteristic). For example, in a company there are 1000
employees. You decide to stratify the group as to sex: male and female. The data
you collected show that there are 658 male employees and 342 female employees in
the company. Assume that the sample size you need for the study is 100. To
compute the sample needed in each stratum, first get the relative weight of each stratum in
the total population; then multiply each weight by the desired
sample size, as shown below.

Variable (sex) | N | Relative weight | Sample needed (n)
Male | 658 | 658/1000 = 0.658 | 0.658 * 100 = 65.8 ≈ 66
Female | 342 | 342/1000 = 0.342 | 0.342 * 100 = 34.2 ≈ 34
Total | 1000 | 1.00 | 100

You will randomly select 66 male employees out of the 658 male employees
and 34 female employees out of the 342 female employees, and they will serve as the
respondents of the study you will be conducting.
• Equal Allocation can be done by choosing the same number of samples from each
group regardless of its size. Say the student government president would like to see if
the opinions of the first-year students differ from those of the second-year students. If
the required sample size is 100 students, then each year level should have 50
respondents who are randomly selected.
3. Cluster Sampling. This is the selection of groups, or clusters, of subjects rather than
individuals (the sampling unit is a group rather than an individual). This can be done by
dividing the population into groups called clusters by some means, such as geographic area.
Then the researcher randomly selects some of these clusters and uses all members of the
selected clusters as the subjects of the sample.
4. Systematic Sampling. Samples are selected by using every kth subject, where the first
subject is selected at random between 1 and k.
This can be done by dividing the population size by the sample size and rounding the
result to the nearest whole number, k. Then, use a table of random numbers or a similar
process to obtain a random number between 1 and k. Finally, starting from the member in that
position, select for the sample every kth member of the population after it.
Say you need n=100 respondents out of N=1000 in the population; then $k = \frac{1000}{100} = 10$.
Every 10th person on the list will be a respondent of the study. However, the first respondent should
be randomly selected from 1 to 10 on the list. If you randomly selected the 5th on the list,
then the other respondents are the 15th, 25th, 35th, and so on, until the desired sample
size is met.
5. Multi-stage Sampling. The sample is randomly selected through two or more steps
or stages.
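The random sampling techniques described above can be sketched in Python with the standard `random` module. This is an illustrative sketch rather than a prescribed procedure: the function names, the seed, and the cluster names are hypothetical, and `random.sample` stands in for the fish-bowl method or a printed table of random numbers.

```python
import random

def simple_random_sample(subjects, n, rng):
    """Lottery method: draw n distinct subjects purely by chance."""
    return rng.sample(subjects, n)

def proportional_allocation(strata_sizes, n):
    """Sample size per stratum, proportional to its share of the population.

    Rounded shares may not always add up to exactly n.
    """
    total = sum(strata_sizes.values())
    return {name: round(size / total * n) for name, size in strata_sizes.items()}

def systematic_sample(subjects, n, rng):
    """Pick a random start among the first k subjects, then every k-th one."""
    k = max(1, round(len(subjects) / n))
    start = rng.randint(1, k)                  # random position from 1 to k
    positions = range(start, len(subjects) + 1, k)
    return [subjects[p - 1] for p in positions][:n]

def cluster_sample(clusters, how_many, rng):
    """Randomly choose whole clusters and keep every member of each."""
    chosen = rng.sample(sorted(clusters), how_many)
    return [member for name in chosen for member in clusters[name]]

rng = random.Random(2021)                      # seeded for repeatability

# Simple random: 15 subjects out of 85, numbered 1 to 85.
lottery = simple_random_sample(list(range(1, 86)), 15, rng)

# Stratified (proportional allocation): 658 male and 342 female employees.
allocation = proportional_allocation({"male": 658, "female": 342}, 100)

# Systematic: n = 100 out of N = 1000, so k = 10.
employees = list(range(1, 1001))
every_kth = systematic_sample(employees, 100, rng)

# Cluster: interview all teachers from two randomly chosen buildings.
clusters = {
    "Building A": ["a1", "a2"],
    "Building B": ["b1", "b2", "b3"],
    "Building C": ["c1"],
}
teachers = cluster_sample(clusters, 2, rng)

print(allocation)
print(every_kth[:4])
```

With the employee figures from the proportional allocation example, the allocation comes out as 66 male and 34 female respondents, and the systematic sample contains 100 subjects spaced 10 positions apart.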
Non-Random (Non-Probability) Sampling Techniques
These are some of the commonly used techniques in a non-random sampling design:
1. Convenience Sampling. The sample is selected from elements of a population that are easily
accessible to the investigator, or from a group of individuals who are available for study.
For example, a researcher may interview subjects entering a local mall to determine the nature
of their visit or perhaps what stores they will be patronizing.
2. Snowball Sampling. A special method used when the desired sample characteristic is
rare. This relies on referrals from initial subjects to generate additional subjects.
3. Purposive Sampling. Samples are selected based on criteria which the investigator
thinks should be in the study. Investigators use their knowledge of the population to judge
whether or not a particular sample will be representative.
4. Accidental Sampling. Samples are picked as they come to the researcher. Many fields,
such as archaeology, history, and medicine, pick samples this way; however, it is
often incorrectly assumed that the samples picked are typical of the population they come
from.
However, these non-random sampling techniques are not usually used in research
studies unless there are only a few people or observations in the identified population.
Exercise 03. Classify each of the following situations as simple random, systematic,
stratified, or cluster.
1. In a large school district, all teachers from two buildings are interviewed to
determine whether they believe the students have less homework to do now than in
previous years.
2. Every seventh customer entering a shopping mall is asked to select her or his
favorite store.
3. Nursing supervisors are selected using random numbers to determine annual
salaries.
4. Mail carriers of a large city are divided into four groups according to sex (male or
female) and according to whether they walk or ride on their routes. Then 30 are selected
from each group and interviewed to determine whether they have been bitten by a dog in
the last year.