Data and its types
• Data is a set of values of subjects with respect to qualitative or
quantitative variables.
• Data is raw, unorganized facts that need to be processed. Data can be
something simple and seemingly random and useless until it is
organized.
• When data is processed, organized, structured or presented in a given
context so as to make it useful, it is called information.
• The information necessary for research activities is obtained in
different forms.
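The step from raw data to information can be illustrated with a minimal Python sketch. The survey records, field meanings, and the `summarize` helper are hypothetical and only serve to show unorganized facts being organized into something usable.

```python
from collections import Counter
from statistics import mean

# Hypothetical raw data: unorganized (age, answer) facts from a survey.
raw_data = [
    (34, "satisfied"),
    (29, "dissatisfied"),
    (41, "satisfied"),
    (25, "satisfied"),
]


def summarize(records):
    """Organize raw records into a small summary, i.e. information."""
    ages = [age for age, _ in records]
    answers = Counter(answer for _, answer in records)
    return {
        "respondents": len(records),
        "mean_age": mean(ages),
        "answers": dict(answers),
    }


info = summarize(raw_data)
print(info)
```

The raw tuples say little on their own; the summary presents the same facts in a context (how many respondents, their average age, how they answered) that a researcher can act on.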
Major Approaches
• There are two major approaches to gathering information about a
situation, person, problem or phenomenon.
• When you undertake a research study, in most situations, you need to
collect the required information; however, sometimes the
information required is already available and need only be extracted.
• Based on these broad approaches to information gathering, data can
be categorized as:
• Primary data
• Secondary data.
Primary Data
• Primary data is original data collected directly by the researcher
from a source, according to the researcher's requirements.
• It is the data collected by the investigator himself or herself for a
specific purpose.
• Data gathered by finding out first-hand the attitudes of a community
towards health services, ascertaining the health needs of a
community, determining the job satisfaction of the employees of an
organization, and ascertaining the quality of service provided by a
worker are examples of primary data.
Advantages of using Primary data
• The investigator collects data specific to the problem under study.
• The investigator knows first-hand the quality of the data collected.
• If required, it may be possible to obtain additional data during the
study period.
Disadvantages of using Primary data
1. The investigator has to contend with all the hassles of data collection:
• deciding why, what, how and when to collect
• getting the data collected (personally or through others)
• getting funding and dealing with funding agencies
• ethical considerations (consent, permissions, etc.)
2. The investigator must ensure the data collected is of a high standard:
• all desired data is obtained accurately and in the required format
• there is no fake or fabricated data
• unnecessary or useless data has not been included
3. The cost of obtaining the data is often the major expense in a study.
Secondary Data
• Secondary data refers to the data which has already been collected
for a certain purpose and documented somewhere else.
• Data collected by someone else for some other purpose (but being
utilized by the investigator for another purpose) is secondary data.
• Using census data to obtain information on the age-sex structure of a
population, using hospital records to find out the morbidity and
mortality patterns of a community, using an organization’s records to
ascertain its activities, and collecting data from sources such as
articles, journals, magazines, books and periodicals to obtain
historical and other types of information are examples of using
secondary data.
Advantages of using Secondary data
• The data is already there: no hassles of data collection
• It is less expensive
• The investigator is not personally responsible for the quality of data
Disadvantages of using Secondary data
• The investigator cannot decide what is collected (if specific data about
something is required, for instance).
• One can only hope that the data is of good quality
• Obtaining additional data (or even clarification) about something is
usually not possible
Difference between Quantitative and Qualitative Data
• Qualitative data is data concerned with descriptions, which can be
observed but cannot be computed.
• In contrast, quantitative data focuses on numbers and mathematical
calculations and can be calculated and computed.
• So, for the collection and measurement of data, either of the two
types discussed above can be used.
• Both have their merits and demerits: qualitative data lacks
reliability, while quantitative data lacks description.
• Both are often used in conjunction to reduce errors in the data
gathered.
• Further, both can be acquired from the same data unit; only their
variables of interest differ: numerical for quantitative data and
categorical for qualitative data.
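As a rough illustration of the last point, the Python sketch below takes one data unit and separates its variables into quantitative (numerical) and qualitative (categorical or descriptive) ones. The respondent record, its field names, and the `split_variables` helper are all hypothetical, and classifying by Python value type is a simplifying assumption rather than a general rule.

```python
from numbers import Number

# Hypothetical data unit: a single survey respondent.
respondent = {
    "age": 34,
    "monthly_income": 2500.0,
    "gender": "female",
    "opinion": "health services are too far away",
}


def split_variables(unit):
    """Split one data unit into quantitative and qualitative variables."""
    quantitative = {}
    qualitative = {}
    for name, value in unit.items():
        # bool is a numeric subtype in Python, so treat it as categorical.
        if isinstance(value, Number) and not isinstance(value, bool):
            quantitative[name] = value
        else:
            qualitative[name] = value
    return quantitative, qualitative


quantitative, qualitative = split_variables(respondent)
print("Quantitative:", quantitative)
print("Qualitative:", qualitative)
```

The same respondent yields both kinds of data: age and income can be computed on, while gender and opinion can only be described or categorized.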