2. INTRODUCTION
Correlation is a statistical tool that helps to measure and analyze the degree of relationship
between two variables.
Correlation analysis deals with the association between two or more variables.
The measure of correlation is called correlation coefficient.
The degree of relationship is expressed by a coefficient that ranges from -1 to +1 (-1 ≤ r ≤ +1).
The direction of change is indicated by the sign of the coefficient.
The correlation analysis enables us to have an idea about the degree and direction of
relationship between the two variables under study.
3. DEFINITION
Correlation, or co-variation, is defined as the statistical study of the relationship between two or
more variables; correlation analysis deals with the association between two or more
variables.
The correlation coefficient measures the extent or degree of relationship
between the variables.
Examples: height and weight, price and demand, age and blood
pressure, etc. are said to be correlated.
When they are correlated they are said to be interconnected. But correlation does not
indicate a cause-effect relationship between two variables.
5. I.PERFECT POSITIVE CORRELATION
The two variables will be denoted by letters X and Y.
They are directly proportional and fully related with each other.
The correlation coefficient r = +1.
Both variables rise and fall in the same proportion. This will not be
found in nature.
But an approaching example is height and weight up to a certain
age.
The graph forms a straight line rising from the lower ends of both the x and
y axes.
FIRST TABLE: X and Y are decreasing proportionally
X   90   60   30
Y   60   40   20
SECOND TABLE: X and Y are increasing proportionally
X   30   60   90
Y   20   40   60
6. II.PERFECT NEGATIVE CORRELATION
The two variables will be denoted by letters X and Y.
They are inversely related to each other.
The correlation coefficient r = -1.
When one variable rises the other variable falls in the same proportion. This
is also not seen in nature.
But some approaching examples are mean weekly temperature and number
of colds in winter, pressure and volume of gas at a certain temperature.
The graph forms a straight line falling from the upper left to the lower right
of the diagram.
X 160 80 40
Y 1 2 3
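As a rough check on the two sections above, the following sketch computes r for the X/Y tables using the deviation form r = ΣXY / sqrt(ΣX² ΣY²). The helper name `pearson_r` and the third data set (Y = 1, 3, 4) are illustrative assumptions and do not come from the slides.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r from deviations about the means (illustrative helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dev_x = [x - mean_x for x in xs]          # X = x - mean of x
    dev_y = [y - mean_y for y in ys]          # Y = y - mean of y
    sum_xy = sum(a * b for a, b in zip(dev_x, dev_y))
    sum_x2 = sum(a * a for a in dev_x)
    sum_y2 = sum(b * b for b in dev_y)
    return sum_xy / sqrt(sum_x2 * sum_y2)

# Table from the perfect positive slide: points lie exactly on a rising line.
r_positive = pearson_r([30, 60, 90], [20, 40, 60])

# Table from the perfect negative slide, exactly as printed.
r_negative_table = pearson_r([160, 80, 40], [1, 2, 3])

# Hypothetical Y values chosen so the points lie exactly on a falling line.
r_negative_line = pearson_r([160, 80, 40], [1, 3, 4])

print(r_positive, round(r_negative_table, 3), round(r_negative_line, 3))
```

Under these assumptions the positive table gives r = +1. The negative table as printed gives roughly -0.98 rather than exactly -1, because its three points do not fall on one straight line; the hypothetical values Y = 1, 3, 4 do, and give -1.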
7. III.MODERATELY/PARTIALLY POSITIVE CORRELATION
In this case the non-zero value of the coefficient ‘r’ lies
between 0 and +1, i.e. 0 < r < 1.
Ex: IMR and overcrowding, filaria incidence and period of
incidence, age of husband and wife.
In such moderately positive correlation, the points are
scattered around an imaginary mean line rising from the lower
extreme.
8. IV.MODERATELY/PARTIALLY NEGATIVE CORRELATION
In this case the non-zero value of the coefficient ‘r’ lies
between -1 and 0, i.e. -1 < r < 0.
Ex: Age and vital capacity in adults, income and IMR, etc.
In such moderately negative correlation, the scatter will be
of the same type, but the imaginary mean line runs downward from the
extreme values of one variable.
9. V. ABSOLUTELY NO CORRELATION
Here the value of the correlation coefficient is zero, indicating that no
linear relationship exists between the two variables.
There is no imaginary mean line indicating a scatter or trend of
correlation.
‘x’ is completely independent of ‘y’.
Ex: height and pulse rate.
The magnitude of the CC, positive or negative, is indicated by the
closeness of the data to an imaginary line indicating the scatter or trend of correlation.
When no such line can be drawn, the correlation is zero.
10. OTHER TYPES OF CORRELATION
SIMPLE AND MULTIPLE
PARTIAL AND TOTAL
LINEAR AND NON LINEAR
11. SIMPLE AND MULTIPLE CORRELATION
When we study two variables, the relationship is
described as simple correlation.
Ex: Age and weight
If we study more than two variables
simultaneously, we call it multiple
correlation.
Ex: diet, age and weight
12. PARTIAL AND TOTAL CORRELATION
o The study of two variables excluding some other
variables is called partial correlation.
◦ Ex: we study age and weight of the body while excluding other factors
◦ In total correlation all the factors are taken into
account
13. LINEAR AND NONLINEAR/CURVILINEAR CORRELATION
o If the ratio of change between two variables is uniform,
then there will be linear correlation.
o The graph will be a straight line.
o In a non-linear correlation the amount of change in
one variable does not bear a constant ratio to the
amount of change in the other variable.
o The graph will be a curve.
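To illustrate why the linear/curvilinear distinction matters, here is a minimal sketch with hypothetical data in which y depends entirely on x (y = x²), yet the linear coefficient r comes out as zero. The helper name `pearson_r` and the data are assumptions made for this example only.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r from deviations about the means (illustrative helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_x2 = sum((x - mean_x) ** 2 for x in xs)
    sum_y2 = sum((y - mean_y) ** 2 for y in ys)
    return sum_xy / sqrt(sum_x2 * sum_y2)

# Hypothetical curvilinear data: the graph is a U-shaped curve, not a line.
xs = [-2, -1, 0, 1, 2]
ys = [x * x for x in xs]          # 4, 1, 0, 1, 4

r_curve = pearson_r(xs, ys)
print(r_curve)
```

The positive and negative products of deviations cancel out, so r is zero even though the two variables are clearly related. A value of r near zero therefore rules out only a linear relationship.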
14. METHODS OF STUDYING CORRELATION
Methods of correlation:
• Scatter diagram method
• Graphic method
• Mathematical method
o Karl Pearson’s coefficient of correlation
o Spearman’s rank correlation
15. I. SCATTER DIAGRAM METHOD
The simplest device to ascertain whether two variables are related is to prepare a dot chart called a scatter diagram.
In this method the data are plotted on graph paper in the form of dots, i.e. for each pair of ‘x’ and ‘y’ values we put a dot and
obtain as many points as the number of observations.
By looking at the scatter of the various points we can form an idea as to whether the variables are related or not.
The greater the scatter of the plotted points on the chart, the weaker the relationship between the two variables, and vice
versa.
If all the points lie on a straight line rising from the lower left-hand corner to the upper right-hand corner, the correlation is perfectly positive.
If all the points lie on a straight line falling from the upper left-hand corner to the lower right-hand corner of the diagram, the correlation is
perfectly negative.
16. I. SCATTER DIAGRAM METHOD
If the plotted points fall in a narrow band, there is a higher degree of correlation between the variables.
If the plotted points lie on a straight line parallel to the x-axis or in a haphazard manner, it shows absence of
relationship between the variables (r = 0).
MERITS:
• Simple and non-mathematical
• Easily understandable
• A rough idea can be quickly formed as to whether the variables are related.
LIMITATIONS:
• This method will tell the direction of correlation and also whether it is high or low, but cannot give the exact degree of
correlation.
18. II. GRAPHIC METHOD
In this method the individual values of the two variables are plotted on
graph paper. Thus we obtain two curves, one for ‘x’ and one for ‘y’.
By examining the direction and closeness of the two curves so drawn, we
can infer the relation.
If both the curves drawn on the graph are moving in the same direction,
correlation is said to be positive, and vice versa.
This method is normally used where we are given data over a period of
time.
In this method also we cannot get a numerical value describing the extent
to which the variables are related.
19. IIIA). MATHEMATICAL METHOD - KARL PEARSON'S CC
Most widely used
Denoted by ‘r’
Karl Pearson was a great biostatistician who developed correlation analysis.
$r = \dfrac{\sum XY}{\sqrt{\sum X^2 \cdot \sum Y^2}}$
Where:
$X = (x - \bar{x})$; ‘x’ is the independent variable
$Y = (y - \bar{y})$; ‘y’ is the dependent variable
$\bar{x}$ and $\bar{y}$ are the means
20. IIIA). MATHEMATICAL METHOD - KARL PEARSON'S CC
STEPS:
1. Take the deviations of the ‘x’ series from the mean of ‘x’ and denote these deviations by ‘X’
2. Square these deviations and obtain the total, i.e. $\sum X^2$
3. Take the deviations of the ‘y’ series from the mean of ‘y’ and denote these deviations by ‘Y’
4. Square these deviations and obtain the total, i.e. $\sum Y^2$
5. Multiply the deviations of the ‘X’ and ‘Y’ series and obtain the total, i.e. $\sum XY$
6. Substitute the values in the formula.
21. A) KARL PEARSON'S CC - PROBLEM-I
Calculate Karl Pearson's coefficient of correlation from the following data and comment on the value.
ROLL NO            1    2    3    4    5
MARKS IN ‘A’ (x)   48   35   17   23   47
MARKS IN ‘B’ (y)   45   20   40   25   45
22. KARL PEARSON'S CC - PROBLEM-I
ROLL NO   x     y     X = (x − x̄)       X²    Y = (y − ȳ)       Y²    XY
1         48    45    48 − 34 = 14      196   45 − 35 = 10      100   140
2         35    20    35 − 34 = 1       1     20 − 35 = −15     225   −15
3         17    40    17 − 34 = −17     289   40 − 35 = 5       25    −85
4         23    25    23 − 34 = −11     121   25 − 35 = −10     100   110
5         47    45    47 − 34 = 13      169   45 − 35 = 10      100   130
TOTAL     170   175                     776                     550   280
$\bar{x} = \dfrac{\sum x}{n} = \dfrac{170}{5} = 34$
$\bar{y} = \dfrac{\sum y}{n} = \dfrac{175}{5} = 35$
$r = \dfrac{\sum XY}{\sqrt{\sum X^2 \cdot \sum Y^2}} = \dfrac{280}{\sqrt{(776)(550)}} = \dfrac{280}{653.3} \approx 0.43$
Here the relationship is moderately positive, as r lies
between 0.4 and 0.8.
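The six steps and the worked table for Problem I can be reproduced with a short script. This is a minimal sketch: the variable names are illustrative, and the marks in ‘B’ are taken as they appear in the worked table (45, 20, 40, 25, 45).

```python
from math import sqrt

marks_a = [48, 35, 17, 23, 47]   # x series
marks_b = [45, 20, 40, 25, 45]   # y series, as used in the worked table

n = len(marks_a)
mean_x = sum(marks_a) / n        # 34.0
mean_y = sum(marks_b) / n        # 35.0

# Steps 1 and 3: deviations from the means.
X = [x - mean_x for x in marks_a]
Y = [y - mean_y for y in marks_b]

# Steps 2, 4 and 5: the three totals needed by the formula.
sum_x2 = sum(d * d for d in X)               # 776.0
sum_y2 = sum(d * d for d in Y)               # 550.0
sum_xy = sum(a * b for a, b in zip(X, Y))    # 280.0

# Step 6: substitute in r = sum(XY) / sqrt(sum(X^2) * sum(Y^2)).
r = sum_xy / sqrt(sum_x2 * sum_y2)
print(f"r = {r:.3f}")
```

Carried to three decimals the result is about 0.429, which agrees with the hand calculation up to rounding.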
23. KARL PEARSON'S CC - PROBLEM-II
On entry to a school, a new intelligence test is given to a small group of children. The results obtained in that test and
the marks scored in a subsequent examination are given below. Find the coefficient of correlation.
CHILD NUMBER              1    2    3    4    5
Intelligence Number (x)   6    4    6    8    6
Marks in exam (y)         6    6    8    14   6
24. KARL PEARSON'S CC - PROBLEM-II
CHILD NO   x    y    X = (x − x̄)   X²   Y = (y − ȳ)   Y²   XY
1          6    6    0             0    −2            4    0
2          4    6    −2            4    −2            4    4
3          6    8    0             0    0             0    0
4          8    14   2             4    6             36   12
5          6    6    0             0    −2            4    0
TOTAL      30   40                 8                  48   16
$\bar{x} = \dfrac{\sum x}{n} = \dfrac{30}{5} = 6$
$\bar{y} = \dfrac{\sum y}{n} = \dfrac{40}{5} = 8$
$r = \dfrac{\sum XY}{\sqrt{\sum X^2 \cdot \sum Y^2}} = \dfrac{16}{\sqrt{(8)(48)}} = \dfrac{16}{19.6} \approx 0.82$
Here the relationship is very high, as r lies
between 0.7 and 1.
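As a cross-check on Problem II, the same coefficient can be obtained without computing deviations, using the algebraically equivalent raw-score form r = (nΣxy − ΣxΣy) / sqrt((nΣx² − (Σx)²)(nΣy² − (Σy)²)). This form is not shown on the slides; it is included here only as an assumed alternative route to the same number.

```python
from math import sqrt

intelligence = [6, 4, 6, 8, 6]    # x series from Problem II
exam_marks = [6, 6, 8, 14, 6]     # y series from Problem II

n = len(intelligence)
sum_x = sum(intelligence)                                       # 30
sum_y = sum(exam_marks)                                         # 40
sum_xy = sum(x * y for x, y in zip(intelligence, exam_marks))   # 256
sum_x2 = sum(x * x for x in intelligence)                       # 188
sum_y2 = sum(y * y for y in exam_marks)                         # 368

# Raw-score form: no deviations from the mean are needed.
numerator = n * sum_xy - sum_x * sum_y                          # 80
denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))

r = numerator / denominator
print(f"r = {r:.3f}")
```

Both routes give r of about 0.82. Each term of the raw-score form is n times the corresponding deviation total (80 = 5 × 16, 40 = 5 × 8, 240 = 5 × 48), so the factor cancels.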
25. KARL PEARSON'S CC - MERITS AND DEMERITS
MERITS:
• It gives a precise figure for the relationship between variables, which can be easily interpreted
• It gives both the direction and degree of relationship
• It is a very popular measure for the study of correlation
DEMERITS:
• The calculation of the CC is time consuming
• Like the SD, the value of the correlation is unduly affected by extreme items
• The CC lies between -1 and +1 and needs very careful interpretation; otherwise it will be misinterpreted.
• It shows only the degree of correlation between two series and not the causes of the relationship.
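The demerit about extreme items can be seen in a small experiment. The data below are entirely hypothetical and the helper name `pearson_r` is an assumption; the sketch only shows how a single extreme pair can change the coefficient.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r from deviations about the means (illustrative helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_x2 = sum((x - mean_x) ** 2 for x in xs)
    sum_y2 = sum((y - mean_y) ** 2 for y in ys)
    return sum_xy / sqrt(sum_x2 * sum_y2)

# Hypothetical series with a clear positive trend.
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]
r_without = pearson_r(xs, ys)

# The same series with one extreme (hypothetical) pair appended.
r_with = pearson_r(xs + [20], ys + [0])

print(round(r_without, 2), round(r_with, 2))
```

With these numbers the coefficient is 0.8 for the five original pairs and becomes negative once the extreme pair is added, so one unusual observation is enough to reverse the apparent direction.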
26. IIIB) RANK CORRELATION CO-EFFICIENT
If the items in a series have definite (quantitative) values, we can always arrange them in ascending or descending order.
In certain cases, the quantitative measurement of certain values is difficult.
Example:
o Measurement of intelligence
o Female beauty
o Leadership ability
In such cases we have to use ranks rather than actual observations. It does not matter which way the items are ranked.
Charles Edward Spearman, an English psychologist and statistician, devised the method of obtaining the coefficient of
correlation by ranks in 1904. This method is based on ranks.
27. IIIB) RANK CORRELATION CO-EFFICIENT
Rank correlation is applicable only to individual observations. The result is only an approximate one because under
the ranking method the original values are not taken into account. It is denoted by the symbol ‘ρ’ (‘rho’).
The formula is $\rho = 1 - \dfrac{6 \sum d^2}{n(n^2 - 1)}$
ρ = rank correlation coefficient
d = difference of ranks between paired items
n = number of observations
$\sum d^2$ = sum of squares of the differences between the two ranks
The value of ρ lies between -1 and +1.
28. IIIB) RANK CORRELATION CO-EFFICIENT-STEPS
WHEN THE ACTUAL RANKS ARE GIVEN:
Compute the difference of the two ranks (d).
Square the differences ‘d’ and get $\sum d^2$.
Substitute the values in the formula.
WHEN RANKS ARE NOT GIVEN:
When no ranks are given but the actual data are given, we must assign ranks, starting with rank one for the highest value and
continuing down to the lowest, i.e. first rank, second rank, and so on.
29. RANK CORRELATION CO-EFFICIENT-I
Shown below are the data relating to measurements of diastolic BP and heart rate in response to the infusion of a
pressure. Use the coefficient of rank correlation to find whether these variables are statistically correlated.
OBSERVATION   1    2    3    4    5    6    7    8
BP            84   81   87   94   70   90   76   86
H.R           53   63   50   48   77   56   71   63
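No worked solution for this problem appears in the transcript, so the following is only a sketch of how the "ranks not given" procedure might be applied to the BP and heart-rate data. It assumes that tied values (heart rate 63 occurs twice) share the average of the ranks they occupy, and it uses the simple formula without any tie correction; the function names are illustrative.

```python
def average_ranks(values):
    """Rank 1 = highest value; tied values share the mean of their positions."""
    ordered = sorted(values, reverse=True)
    ranks = []
    for v in values:
        first = ordered.index(v) + 1             # first position occupied by v
        ties = ordered.count(v)                  # how many times v occurs
        ranks.append(first + (ties - 1) / 2)     # mean of the tied positions
    return ranks

def spearman_rho(xs, ys):
    """rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), without tie correction."""
    rank_x = average_ranks(xs)
    rank_y = average_ranks(ys)
    n = len(xs)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n * n - 1))

bp = [84, 81, 87, 94, 70, 90, 76, 86]
hr = [53, 63, 50, 48, 77, 56, 71, 63]

hr_ranks = average_ranks(hr)
rho = spearman_rho(bp, hr)
print(hr_ranks)
print(f"rho = {rho:.2f}")
```

Under these assumptions Σd² is 155.5 and ρ is about -0.85, which suggests a strong negative association: higher diastolic BP goes with lower heart rate. Whether this is statistically significant would need a separate test, which the sketch does not attempt.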
31. RANK CORRELATION CO-EFFICIENT-II
The marks of 7 candidates in intelligence and mathematics are as follows:
CANDIDATE           A    B    C    D    E    F    G
Intelligence test   30   52   60   62   45   32   41
Mathematics         41   62   70   78   53   45   57
32. RANK CORRELATION CO-EFFICIENT
CANDIDATE   RANK IN INTELLIGENCE   RANK IN MATHS   d    d²
A           7                      7               0    0
B           3                      3               0    0
C           2                      2               0    0
D           1                      1               0    0
E           4                      5               −1   1
F           6                      6               0    0
G           5                      4               1    1
TOTAL                                                   2
$\rho = 1 - \dfrac{6 \sum d^2}{n(n^2 - 1)} = 1 - \dfrac{6(2)}{7(48)} = 1 - \dfrac{12}{336} \approx 0.96$
Here the relationship is very high.
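For the case where the ranks are already given, the calculation reduces to the formula alone. The sketch below takes the ranks from the table above and evaluates 1 − 6Σd² / (n(n² − 1)); the dictionary names are illustrative.

```python
# Ranks taken from the table above (rank 1 = highest mark).
rank_intelligence = {"A": 7, "B": 3, "C": 2, "D": 1, "E": 4, "F": 6, "G": 5}
rank_maths = {"A": 7, "B": 3, "C": 2, "D": 1, "E": 5, "F": 6, "G": 4}

n = len(rank_intelligence)

# d = difference of ranks for each candidate; only E and G differ.
d = {c: rank_intelligence[c] - rank_maths[c] for c in rank_intelligence}
sum_d2 = sum(v * v for v in d.values())      # 2

rho = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))    # 1 - 12/336
print(f"rho = {rho:.3f}")
```

Evaluated numerically, 1 − 12/336 comes to about 0.964, i.e. ρ is roughly 0.96, a very high positive rank correlation.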