This document discusses various methods of processing raw data collected during research. It defines key terms like editing, coding, classification, and tabulation. Editing involves examining raw data to detect and correct errors. Coding assigns numerals or symbols to categorize responses for analysis. Classification groups data into common categories based on attributes or class intervals. Tabulation summarizes data in an organized table format for further analysis. Proper data processing through these methods is important for obtaining reliable results from statistical analysis.
2. UNIT IV
Data presentation and Analysis – Data Processing –
Methods of Statistical analysis and interpretation of
Data – Testing of Hypothesis and theory of inference.
3. INTRODUCTION
The data, after collection, has to be prepared
for analysis. The collected data is raw and
must be converted to a form that is suitable
for the required analysis. The results of the
analysis are strongly affected by the form of the
data, so proper data preparation is a must to
get reliable results.
4. MEANING OF PROCESSING
• Processing refers to subjecting the collected data to a
process in which the accuracy, completeness, uniformity
of entries and consistency of the information gathered are
examined.
• It is a very important stage before the data is analyzed.
6. Editing
• Editing means to rectify, to set in order, to correct or to establish
sequence.
• Editing of data is the process of examining the collected raw data
(especially in surveys) to detect errors and omissions and to correct
these when possible.
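The idea of detecting errors and omissions can be sketched in a few lines of Python. This is only an illustration: the records, the field names and the plausibility limits are all assumptions made for the example, not part of any actual survey.

```python
# Hypothetical survey records; None marks an unanswered question.
records = [
    {"id": 1, "age": 34, "income": 42000},
    {"id": 2, "age": None, "income": 38000},   # omission
    {"id": 3, "age": 212, "income": 51000},    # implausible age
]

# Assumed plausibility limits for each field (illustrative only).
valid_ranges = {"age": (0, 120), "income": (0, 10_000_000)}


def edit_check(record, ranges):
    """Return a list of (field, problem) pairs found in one record."""
    problems = []
    for field, (low, high) in ranges.items():
        value = record.get(field)
        if value is None:
            problems.append((field, "omission"))
        elif not low <= value <= high:
            problems.append((field, "out of range"))
    return problems


# Keep only the records that have at least one problem.
flagged = {}
for record in records:
    problems = edit_check(record, valid_ranges)
    if problems:
        flagged[record["id"]] = problems

print(flagged)
```

A check like this only flags entries for the editor. Whether a flagged value can be corrected still depends on the questionnaire or on contacting the respondent.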
7. FIELD EDITING
• Field editing refers to editing carried out immediately in the
field where the data is collected. That is, the data is edited as
soon as the investigator collects it.
• The advantage is that the data can be corrected at the stage when it
is collected.
• The nature of editing will depend upon the method of data
collection.
8. IN-HOUSE/CENTRALIZED EDITING
• Editing is done by a person or a team after all the recorded questionnaires/schedules are
collected.
• In such editing, normally the instructions regarding editing are printed and circulated to the
person or the team doing the editing. This is only to ensure that there is uniformity in
editing.
• All the corrections are carried out at one stretch in all the questionnaires or schedules.
Sometimes, the respondents may have to be contacted for clarifying certain points.
9. Guidelines for Editing
i. The editor should follow the editing instructions accurately.
ii. The editing process should be free from any personal bias.
iii. The editor’s knowledge of the subject matter will help to perform the job better, so background
information about the data should be provided to the editor.
iv. Wrong entries must be marked and scored off, but not defaced.
v. Editorial notes on the work done are important, as they indicate the basis for rectifying wrong entries.
vi. Colour pencil or pen should be used for editing.
vii. Editor’s signature and date of editing should be clearly noted.
10. CODING
• Coding refers to the process of assigning numerals or other symbols to
answers so that responses can be put into a limited number of
categories or classes.
• Coding is necessary for efficient analysis and through it the several
replies may be reduced to a small number of classes which contain the
critical information required for analysis.
• Coding decisions should usually be taken at the designing stage of the
questionnaire or any other collection tool.
• This makes it possible to pre-code the questionnaire choices, which in
turn is helpful for computer tabulation, as one can key-punch
straight from the original questionnaires.
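As a rough sketch of coding, the snippet below maps verbal answers to numerals with a small codebook. The answer categories, the numeric codes and the code 9 for "no response" are assumptions chosen for illustration.

```python
# Hypothetical codebook for one pre-coded question (assumed categories).
codebook = {
    "strongly disagree": 1,
    "disagree": 2,
    "neutral": 3,
    "agree": 4,
    "strongly agree": 5,
}
NO_RESPONSE = 9  # assumed code for missing or unrecognised answers


def code_response(answer, codebook, missing_code=NO_RESPONSE):
    """Return the numeric code for one answer."""
    if answer is None:
        return missing_code
    # Normalise spacing and case so "Agree " and "agree" get the same code.
    key = answer.strip().lower()
    return codebook.get(key, missing_code)


responses = ["Agree", "neutral ", None, "Strongly Agree", "agree"]
coded = [code_response(answer, codebook) for answer in responses]
print(coded)
```

Five different replies are reduced here to a small set of numeric classes, which is what makes later counting and tabulation straightforward.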
12. CLASSIFICATION
• Classification of the data implies that the collected raw data is
categorized into groups having common features.
• Data having common characteristics are placed in a common group.
• The entire data collected is categorized into various groups or classes,
which convey a meaning to the researcher.
• Classification is done in two ways:
Classification according to attributes.
Classification according to the class intervals.
13. CLASSIFICATION ACCORDING TO THE ATTRIBUTES:
• Data are classified on the basis of common characteristics, which
can either be descriptive (like literacy, sex, honesty, marital status
etc.) or numerical (like weight, height, income etc.).
• Descriptive features are qualitative in nature and cannot be
measured quantitatively, but they are still considered while making an
analysis.
• Analysis used for such classified data is known as statistics of
attributes and the classification is known as the classification
according to the attributes.
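Classification according to attributes amounts to counting how many units fall into each descriptive category. A minimal sketch, using invented respondents and the attributes marital status and literacy:

```python
from collections import Counter

# Hypothetical respondents described by two attributes.
respondents = [
    {"marital_status": "married", "literate": True},
    {"marital_status": "single", "literate": True},
    {"marital_status": "married", "literate": False},
    {"marital_status": "single", "literate": True},
    {"marital_status": "married", "literate": True},
]

# Classification by a single attribute.
by_status = Counter(r["marital_status"] for r in respondents)

# Classification by two attributes at once.
by_status_and_literacy = Counter(
    (r["marital_status"], r["literate"]) for r in respondents
)

print(by_status)
print(by_status_and_literacy)
```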
14. CLASSIFICATION ACCORDING TO THE CLASS
INTERVAL:
• The numerical features of data can be measured quantitatively
and analyzed with the help of some statistical unit. Data
relating to income, production, age, weight etc. come under
this category. This type of data is known as statistics of
variables, and the data is classified by way of class intervals.
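Classification by class intervals can be sketched as a simple frequency count. The income figures, the class width of 1000 and the convention that each class includes its lower limit but excludes its upper limit are assumptions made for this example.

```python
# Hypothetical monthly incomes.
incomes = [1200, 1850, 2300, 2999, 3000, 4100, 4750, 5200]

# Assumed classes: 1000-2000, 2000-3000, ..., 5000-6000.
# Each class includes its lower limit and excludes its upper limit.
lower_limit = 1000
class_width = 1000
number_of_classes = 5


def class_index(value, lower, width):
    """Return the zero-based index of the class that contains value."""
    return (value - lower) // width


frequencies = [0] * number_of_classes
for income in incomes:
    index = class_index(income, lower_limit, class_width)
    if not 0 <= index < number_of_classes:
        raise ValueError(f"{income} falls outside the class intervals")
    frequencies[index] += 1

for i, count in enumerate(frequencies):
    start = lower_limit + i * class_width
    print(f"{start}-{start + class_width}: {count}")
```

Under this convention an income of exactly 3000 falls in the class 3000-4000, not in 2000-3000.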
16. TABULATION
• Tabulation is the process of summarizing raw data and displaying it in
compact form (i.e., in the form of statistical tables) for further analysis.
• According to L.R. Connor, “Tabulation is the orderly and systematic presentation of
numerical data in a form designed to elucidate the problem under consideration”.
• In a broader sense, tabulation is an orderly arrangement of data in columns and
rows.
• Tabulation is essential for the following reasons:
• It conserves space and reduces explanatory and descriptive statements to a
minimum.
• It facilitates the process of comparison.
• It facilitates the summation of items and the detection of errors and omissions.
• It provides a basis for various statistical computations.
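A small two-way table shows how tabulation arranges classified data in rows and columns and makes totals easy to check. The observations below are invented, and the helper `build_table` is a hypothetical name used only for this sketch.

```python
from collections import Counter

# Hypothetical observations: (sex, literacy) for each respondent.
observations = [
    ("female", "literate"), ("female", "literate"), ("female", "literate"),
    ("female", "illiterate"),
    ("male", "literate"), ("male", "literate"),
    ("male", "illiterate"), ("male", "illiterate"),
]

counts = Counter(observations)
rows = sorted({sex for sex, _ in observations})
cols = sorted({literacy for _, literacy in observations})


def build_table(counts, rows, cols):
    """Return the table body, with a total column and a total row."""
    table = []
    for row in rows:
        cells = [counts[(row, col)] for col in cols]
        table.append([row] + cells + [sum(cells)])
    col_totals = [sum(counts[(row, col)] for row in rows) for col in cols]
    table.append(["Total"] + col_totals + [sum(col_totals)])
    return table


table = build_table(counts, rows, cols)
header = ["Sex"] + cols + ["Total"]

# Print with right-aligned columns so the figures line up.
for line in [header] + table:
    print("".join(f"{str(cell):>12}" for cell in line))
```

The grand total in the bottom-right cell should equal the number of observations, which gives a quick check for errors and omissions.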
17. PRINCIPLES OF TABULATION
1) Every table should have a clear, concise and adequate title and this title should always be placed
just above the body of the table.
2) Every table should be given a distinct number to facilitate easy reference.
3) The column headings (captions) and the row headings (stubs) of the table should be clear and
brief.
4) The units of measurement under each heading or sub-heading must always be indicated.
5) Explanatory footnotes, if any, concerning the table should be placed directly beneath the table,
along with the reference symbols used in the table.
6) Source or sources from where the data in the table have been obtained must be indicated just
below the table.
7) Usually the columns are separated from one another by lines which make the table more readable
and attractive.
8) Those columns whose data are to be compared should be kept side by side. Similarly,
percentages and/or averages must also be kept close to the data.
9) It is generally considered better to approximate figures before tabulation as the same would
reduce unnecessary details in the table itself.
10) It is important that all column figures be properly aligned. Decimal points and (+) or (–) signs
should be in perfect alignment.
11) Abbreviations should be avoided to the extent possible and ditto marks should not be used in
the table.
12) Miscellaneous and exceptional items, if any, should usually be placed in the last row of the
table.
13) The arrangement of the categories in a table may be chronological, geographical, alphabetical
or according to magnitude to facilitate comparison.