SlideShare una empresa de Scribd logo
1 de 40
Lecture Series on
  Biostatistics

                       No. Biostat - 5
                      Date: 18.01.2009


       MEASURES OF CENTRAL
           TENDENCY

             Dr. Bijaya Bhusan Nanda,
         M. Sc (Gold Medalist) Ph. D. (Stat.)
      Topper Orissa Statistics & Economics Services, 1988
              bijayabnanda@yahoo.com
CONTENTS
   Descriptive Measures
   Measure of Central Tendency (CT)
    Concept and Definition
    Mean
    Median
    Mode
   Uses of Different Measures of CT
   Advantages and Disadvantages of Different
    Measures of CT
Enabling Objectives
 To equip the trainee with skills to
  manipulate the data in the form of
  numbers that they encounter as a health
  sciences professional.
 The better able they are to manipulate

  such data, better understanding they will
  have of the environment and forces that
  generate these data.
Descriptive Statistics


 Classification and Tabulation of Data
 Presentation of Data

 Measure of Central Tendency (CT)

 Measure of Shape

 Measure of Dispersion
Measures of Central Tendency

Concept
 Data, in nature, has a tendency

  to cluster around a central Value.
 That central value condenses the

  large mass of data into a single
  representative figure.
 The Central Value can be obtained

  from sample values (Called
  Statistics) and Population
  observations (Called Parameters)
Measures of Central Tendency
Definition
 A measure of central tendency is a
 typical value around which other
 figures congregate.
                   Simpson and Kofka
                   (Statistician)
 Average is an attempt to find an
 single figure to describe a group of
 figures.
              Clark and Schakade
                    (Statistician)
Different Measure of Central Tendency (MCT)

  1.   Mathematical Average
           Arithmetic Mean Simply Mean

           Geometric Mean

           Harmonic Mean

  1.   Positional Average
           Median

           Mode

  1.   Mean, Median and Mode are the
       Most Commonly used MCT in
       Health Science
Characteristics of an Ideal MCT
1.   It should be rigidly defined so that different persons may
     not interpret it differently.
2.   It should be easy to understand and easy to calculate.
3.   It should be based on all the observations of the data.
4.   It should be easily subjected to further mathematical
     treatment.
5.   It should be least affected by the sampling fluctuation .
6.   It should not be unduly affected by the extreme values.
Characteristics of Mean
   Most representative figure for the entire mass of data
   Tells the point about which items have a tendency to
    cluster
   Unduly affected by extreme items (very sensitive to
    extreme values)
   The positive deviations must balance the negative
    deviations (The sum of the deviations of individual
    values of observation from the mean will always add
    up to zero)
   The sum of squares of the deviations about the
    mean is minimum
Arithmetic Mean (AM) or Mean

AM is obtained by summing up all the observations and
Individual data: X1, X2, X3, X4, …………Xn
(n = Total No. of observation)

                X1 + X2 + X3 +…..
     Mean =
                     n

  Example:
        Find the mean size of Tuberculin test of 10
  boys measured in millimeters.
                 3,5,7,7,8,8,9,10,11,12.

           3 + 5 + 7 +7 + 8 + 8 + 9 + 10 + 11 + 12
  Mean =
                           10
        = 80 / 10 = 8 mm
Discrete frequency distribution
      Let x1,x2,x3…… xn be the variate values and f1,f2,f3….fn be their
      corresponding frequencies, then their mean is given by
                    x1.f1 + x2.f2 +x3.f3+……+xn.fn
         Mean =
                         f1+f2+f3+….+fn
Example:
  Find mean days of Confinement after delivery in the following series.
      Days of         No. of       Total days of confinement of
 Confinement ( X)   patients (f)         each group. Xf           Ans. Mean
                                                                  days of
         6               5                     30
                                                                  confinement
         7               4                     28                 = 137 / 18
         8               4                     32                 = 7.61
         9               3                     27
        10               2                     20
       Total             18                    137
Grouped frequency distribution
Let m1, m2,…….mn be the mid-point of the class and
f1, f2,……….fn be the corresponding frequency

Direct method:        Σ fi mi
 Mean: X =
                       Σ fi
, Where mi refers to the mid point of the ith class fi
frequency of the ith class and Σfi = N

 Shortcut method:
                     Σfidi
  Mean = A +
                      N
       Where, A is the assumed mean and di = mi-A.
Example:
  Calculate overall fatality rate in smallpox
  from the age wise fatality rate given below.


 Age Gr in      0-1     2-4     5-9     Above 9
     yr
No. of Cases    150     304     421      170
Fatality rate   35.33   21.38   16.86    14.17
Solution :
          Mean fatality rate= ∑fx = 21305.98 =20.39
                              ∑f      1045


Age Gr. in No. of Cases Fatality Rate                  fx
  year            (f)          (x)

   0-1              150             35.33             5299.5
   2-4              304             21.38         6499.52
   5-9              421             16.86         7098.06
 Above 9            170             14.17             2408.9
  Total             1045            87.74        21305.98
Problem:
      Find the mean weight of 470 infants born in a
hospital in one year from following table.


 Weight 2.0-2.4 2.5-2.9 3.0-3.4 3.5-3.9 4.0-4.4   4.5+
 in Kg
 No. of   17      97      187     135     28          6
 infant
Solution :
Weight Continuo        Mid  di = mi-    f     fd     Sum of
in Kg.  us CI in      Value
                               A                      fd
   X      kg.

2.0-2.4   1.95-2.45    2.2    -1.0     17    -17     -65.5
2.5-2.9   2.45-2.95    2.7    -0.5     97    -48.5
3.0-3.4   2.95-3.45    3.2      0      187     0
3.5-3.9   3.45-3.85    3.7     0.5     135   67.5    +104.5
4.0-4.4   3.95-4.45    4.2      1      28     28
  4.5+       4.45+     4.7     1.5      6      9
Total                                  470            +39

   Mean Weight = w + (Σ fx / Σ f )
   Mean Weight = 3.2 +( 39 / 470 ) = 3.28 Kg
Step deviation method
Ex:- Calculation of the mean from a frequency distribution of
weights of 265 male students at the university of Washington
   CI-Wt)       Cont. CI        f    d    fd         X-A
                                                  d=
    90 - 99      89.5 – 99.5    1    -5    -5         h
   100 – 109    99.5 – 109.5    1    -4    -4
                                                             Σ fd
   110 – 119   109.5 – 119.5    9    -3   -27   X= A +                Xh
   120 – 129   119.5 – 129.5   30    -2   -60                 N
   130 – 139   129.5 – 139.5   42    -1   -42                 99
                                                X= 145 +            x 10
   140 – 149   139.5 – 149.5   66    0      0                265
   150 – 159   149.5 – 159.5   47    1    47      = 145 + (0.3736)( 10 )
   160 – 169   159.5 – 169.5   39    2    78
   170 – 179   169.5 – 179.5   15    3    45      = 145 +0     3.74
   180 – 189   179.5 – 189.5   11    4    44
                                                  = 148.74
   190 -199    189.5 -199.5     1    5      5
   200 – 209   199.5 – 209.5    3    6    18
     Total                     265        99
    N=265
Merits, Demerits and Uses of Mean
                     Merits of Mean:
1.   It can be easily calculated.
2.   Its calculation is based on all the observations.
3.   It is easy to understand.
4.   It is rigidly defined by the mathematical
     formula.
5.   It is least affected by sampling fluctuations.
6.   It is the best measure to compare two or more
     series of data.
7.   It does not depend upon any position.
Demerits of Mean :
1.   It may not be represented in actual data so it
     is theoretical.
2.   It is affected by extreme values.
3.   It can not be calculated if all the
     observations are not known.
4.   It can not be used for qualitative data i.e.
     love, beauty , honesty, etc.
5.   It may lead to fallacious conditions in the
     absence of original observations.
Uses of Mean :
1. It is extremely used in medical statistics.
2. Estimates are always obtained by mean.
Median
Definition:
     Median is defined as the middle most or
 the central value of the variable in a set of
 observations, when the observations are
 arranged either in ascending or in
 descending order of their magnitudes.
     It divides the series into two equal parts.
 It is a positional average, whereas the mean
 is the calculated average.
Characteristic of Median
 A positional value of the variable which
  divides the distribution into two equal
  parts, i.e., the median is a value that
  divides the set of observations into two
  halves so that one half of observations are
  less than or equal to it and the other half
  are greater than or equal to it.
 Extreme items do not affect median and is

  specially useful in open ended frequencies.
 For discrete data, mean and median do not

  change if all the measurements are
  multiplied by the same positive number
  and the result divided later by the same
Computation of Median

For Individual data series:
 Arrange the values of X in ascending or

  descending order.
 When the number of observations N, is odd,

  the middle most value –i.e. the (N+1)/2th
  value in the arrangement will be the median.
 When N is even, the A.M of N/2th and

  (N+1)/2th values of the variable will give the
  median.
Example:
 ESRs of 7 subjects are 3,4,5,6,4,7,5. Find the
 median.
Ans: Let us arrange the values in ascending order.
 3,4,4,5,5,6,7. The 4th observation i.e. 5 is the
 median in this series.
Example:
 ESRs of 8 subjects are 3,4,5,6,4,7,6,7. Find the
 median.
Ans: Let us arrange the values in ascending order.
 3,4,4,5,6,6,7,7. In this series the median is the
 a.m of 4th and 5th Observations
Discrete Data Series

   Arrange the value of the variable in
    ascending or descending order of
    magnitude. Find out the cumulative
    frequencies (c.f.). Since median is the size of
    (N+1)/2th. Item, look at the cumulative
    frequency column find that c.f. which is
    either equal to (N+1)/2 or next higher to
    that and value of the variable corresponding
    to it is the median.
Grouped frequency distribution
   N/2 is used as the rank of the median instead
    of (N+1)/2. The formula for calculating median
    is given as:
       Median = L+ [(N/2-c.f)/f] × I
    Where L = Lower limit of the median class i.e.
    the class in which the middle item of the
    distribution lies.
    c.f = Cumulative frequency of the class
    preceding the media n class
    f = Frequency of the median class. I= class
    interval of the median class.
Median for grouped or interval data
Example : Calculation of the median (x). data          N/2-Cf
represent weights of 265 male students at the   X= L +                ×I
university of Washington                                 f
Class – interval   f    Cumulative frequency               132.5-83
                                                  X= 140 +         x 10
   ( Weight)               f “less than”                     66
    90 - 99         1             1                          49.5
                                                  = 140 +          x 10
   100 – 109        1             2                          66
   110 – 119        9             11
                                                  = 140 + (0.750) ( 10 )
   120 – 129       30             41
   130 – 139       42             83              = 140 +    7.50
   140 – 149       66            149
                                                  = 147.5
   150 – 159       47            196
   160 – 169       39            235
   170 – 179       15            250
   180 – 189       11            261
   190 -199         1            262
   200 - 209        3            265
               N = 265 N/2=132.5
Mode
Definition:
 Mode is defined as the most frequently
 occurring measurement in a set of
 observations, or a measurement of relatively
 great concentration, for some frequency
 distributions may have more than one such
 point of concentration, even though these
 concentration might not contain precisely
 the same frequencies.
Characteristics of Mode
   Mode of a categorical or a discrete numerical variable is that value of
    the variable which occurs maximum number of times and for a
    continuous variable it is the value around which the series has
    maximum concentration.
   The mode does not necessarily describe the ‘most’( for example,
    more than 50 %) of the cases
   Like median, mode is also a positional average. Hence mode is useful
    in eliminating the effect of extreme variations and to study popular
    (highest occurring) case (used in qualitative data)
   The mode is usually a good indicator of the centre of the data only if
    there is one dominating frequency.
   Mode not amenable for algebraic treatment (like median or mean)
   Median lies between mean & mode.
   For normal distribution, mean, median and mode are equal (one and
    the same)
Discrete Data Series
   In case of discrete series, quite often the
    mode can be determined by closely looking
    at the data.
Example. Find the mode -
       Quantity of glucose (mg%) in   First we arrange this data
       blood of 25 students           set in the ascending order
       70   88   95 101       106      and find the frequency.

       79   93   96 101       107

       83   93   97 103       108

       86   95   97 103       112

       87   95   98 106       115
Quantity of       Frequency     Quantity of      Frequency
  glucose (mg%)                   glucose (mg%) in
  in blood                        blood
         70               1               97                2
         79               1               98                1
         83               1               101               2
         86               1               103               2
         87               1               106               2
         88               1               107               1
         93               3               108               1
         95               2               112               1
         96               1               115               1

This data set contains 25 observations. We see that, the value of 93 is
repeated most often. Therefore, the mode of the data set is 93.
Grouped frequency distribution
Calculation of Mode:
               {(f1 – f0)
 Mode= L +                               x I
              [(f1 – f0) + (f1 – f2 )]
      L = lower limit of the modal class i.e.- the class
             containing mode
      f1 = Frequency of modal class
      f0 = Frequency of the pre-modal class
      f2 =Frequency of the post modal class
      I = Class length of the modal class
Example: Find the value of the mode from the data given
                         below
Weight in Kg Exclusive            No of          Weight in        No of
                 CI               students           Kg           students
   93-97         92.5-97.5           2               113-117       14
   98-102       97.5-102.5           5               118-122        6
  103-107       102.5-107.5          12              123-127        3
  108-112       107.5-112.5          17              128-131        1
  Solution: By inspection mode lies in the class 108-112. But the real
  limit of this class is 107.5-112.5
                     {(f1 – f0)
  Mode= L +                                    x I     = 110.63
                    [(f1 – f0) + (f1 – fx )]

     Where, L = 107.5,f1 = 17, f0 = 12, f2 = 14, I = 5
Merits, demerits and use of mode

Merits:
   Mode is readily comprehensible and easy to
    calculate.
   Mode is not at all affected by extreme values,
   Mode can be conveniently located even if the
    frequency distribution has class-intervals of
    unequal magnitude provided the modal class and
    the classes preceding and succeeding it are of the
    same magnitude,
Demerits:
    Mode is ill-defined. Not always possible to find a
     clearly defined mode.
    It is not based upon all the observations,
    It is not amenable to further mathematical
     treatment
    As compared with mean, mode is affected to a
     great extent by fluctuations of sampling,
    When data sets contain two, three, or many modes,
     they are difficult to interpret and compare.
                       Uses
Mode is useful for qualitative data.
Comparison of mean, median and mode
    The mean is rigidly defined. So is the median, and so is the
     mode except when there are more than one value with the
     highest frequency density.
    The computation of the three measures of central tendency
     involves equal amount of labour. But in case of continuous
     distribution, the calculation of exact mode is impossible when
     few observations are given. Even though a large number of
     observations grouped into a frequency distribution are
     available, the mode is difficult to determine. The present
     formula discussed here for computation of mode suffers from
     the drawback that it assumes uniform distribution of frequency
     in the modal class. As regards median, when the number of
     observations are even it is determined approximately as the
     midpoint of two middle items. The value of the mode as well
     as the median can be determined graphically where as the A.M
     cannot be determined graphically.
    The A.M is based upon all the observations. But the median
     and, so also, mode are not based on all the observations
Contd……



   When the A.Ms for two or more series of the variables and the
    number of observations in them are available, the A.M of
    combined series can be computed directly from the A.M of the
    individual series. But the median or mode of the combined series
    cannot be computed from the median or mode of the individual
    series.
   The A.M median and mode are easily comprehensible.
   When the data contains few extreme values i.e. very high or small
    values, the A.M is not representative of the series. In such case the
    median is more appropriate.
   Of the three measures, the mean is generally the one that is least
    affected by sampling fluctuations, although in some particular
    situations the median or the mode may be superior in this respect.
   In case of open end distribution there is no difficulty in calculating
    the median or mode. But, mean cannot be computed in this case.
   Thus it is evident from the above comparison, by and large the
    A.M is the best measure of central tendency.
REFERENCE
   Applied Biostatistical Analysis, W.W.
    Daniel
   Biostatistical Analysis, J.S. Zar
   Mathematical Statistics and data
    analysis, John. A. Rice
   Fundamentals of Applied Statistics,
    S.C Gupta and V. K Kapoor.
   Fundamental Mathematical Statistics,
    S.C. Gupta, V.K. Kapoor
THANK YOU
Outlier
   An observation (or measurement) that is
    unusually large or small relative to the
    other values in a data set is called an
    outlier. Outliers typically are attributable
    to one of the following causes:
   The measurement is observed, recorded,
    or entered into the computer incorrectly.
   The measurements come from a different
    population.
   The measurement is correct, but
    represents a rare event.

Más contenido relacionado

La actualidad más candente (20)

Measurement of central tendency
Measurement of central tendencyMeasurement of central tendency
Measurement of central tendency
 
Standard deviation
Standard deviationStandard deviation
Standard deviation
 
Median
MedianMedian
Median
 
Classification & tabulation of data
Classification & tabulation of dataClassification & tabulation of data
Classification & tabulation of data
 
Chi -square test
Chi -square testChi -square test
Chi -square test
 
Mode
ModeMode
Mode
 
descriptive and inferential statistics
descriptive and inferential statisticsdescriptive and inferential statistics
descriptive and inferential statistics
 
Skewness & Kurtosis
Skewness & KurtosisSkewness & Kurtosis
Skewness & Kurtosis
 
Sampling and its types
Sampling and its typesSampling and its types
Sampling and its types
 
mean median mode
 mean median mode mean median mode
mean median mode
 
Chi squared test
Chi squared testChi squared test
Chi squared test
 
Tabular and Graphical Representation of Data
Tabular and Graphical Representation of Data Tabular and Graphical Representation of Data
Tabular and Graphical Representation of Data
 
01 parametric and non parametric statistics
01 parametric and non parametric statistics01 parametric and non parametric statistics
01 parametric and non parametric statistics
 
Standard deviation
Standard deviationStandard deviation
Standard deviation
 
Classification of data
Classification of dataClassification of data
Classification of data
 
Standard error
Standard error Standard error
Standard error
 
Measures of Dispersion
Measures of DispersionMeasures of Dispersion
Measures of Dispersion
 
Skewness and kurtosis ppt
Skewness and kurtosis pptSkewness and kurtosis ppt
Skewness and kurtosis ppt
 
Correlation ppt...
Correlation ppt...Correlation ppt...
Correlation ppt...
 
Central tendency
Central tendencyCentral tendency
Central tendency
 

Similar a Measures of central tendency

Similar a Measures of central tendency (20)

Measures of central tendency
Measures of central tendencyMeasures of central tendency
Measures of central tendency
 
Basic statistics
Basic statistics Basic statistics
Basic statistics
 
Measures of Central Tendency.pptx
Measures of Central Tendency.pptxMeasures of Central Tendency.pptx
Measures of Central Tendency.pptx
 
Measures of central tendency mean
Measures of central tendency meanMeasures of central tendency mean
Measures of central tendency mean
 
Statistical analysis by iswar
Statistical analysis by iswarStatistical analysis by iswar
Statistical analysis by iswar
 
Mean, median, and mode ug
Mean, median, and mode ugMean, median, and mode ug
Mean, median, and mode ug
 
Kwoledge of calculation of mean,median and mode
Kwoledge of calculation of mean,median and modeKwoledge of calculation of mean,median and mode
Kwoledge of calculation of mean,median and mode
 
Measures of dispersion
Measures  of  dispersionMeasures  of  dispersion
Measures of dispersion
 
Statistics 3, 4
Statistics 3, 4Statistics 3, 4
Statistics 3, 4
 
Central Tendency.pptx
Central Tendency.pptxCentral Tendency.pptx
Central Tendency.pptx
 
Ch 5 CENTRAL TENDENCY.doc
Ch 5 CENTRAL TENDENCY.docCh 5 CENTRAL TENDENCY.doc
Ch 5 CENTRAL TENDENCY.doc
 
Unit 1 - Measures of Dispersion - 18MAB303T - PPT - Part 2.pdf
Unit 1 - Measures of Dispersion - 18MAB303T - PPT - Part 2.pdfUnit 1 - Measures of Dispersion - 18MAB303T - PPT - Part 2.pdf
Unit 1 - Measures of Dispersion - 18MAB303T - PPT - Part 2.pdf
 
Measures of central tendency.pptx
Measures of central tendency.pptxMeasures of central tendency.pptx
Measures of central tendency.pptx
 
Mean
MeanMean
Mean
 
f and t test
f and t testf and t test
f and t test
 
Statistics For Entrepreneurs
Statistics For  EntrepreneursStatistics For  Entrepreneurs
Statistics For Entrepreneurs
 
Measures of central tendency and dispersion
Measures of central tendency and dispersionMeasures of central tendency and dispersion
Measures of central tendency and dispersion
 
central tendency.pptx
central tendency.pptxcentral tendency.pptx
central tendency.pptx
 
Central tendency
Central tendencyCentral tendency
Central tendency
 
Anova.ppt
Anova.pptAnova.ppt
Anova.ppt
 

Más de Southern Range, Berhampur, Odisha

Understanding Stress, Stressors, Signs and Symptoms of Stress
Understanding Stress, Stressors, Signs and Symptoms of StressUnderstanding Stress, Stressors, Signs and Symptoms of Stress
Understanding Stress, Stressors, Signs and Symptoms of StressSouthern Range, Berhampur, Odisha
 

Más de Southern Range, Berhampur, Odisha (20)

Growth and Instability in Food Grain production in Oisha
Growth and Instability in Food Grain production  in OishaGrowth and Instability in Food Grain production  in Oisha
Growth and Instability in Food Grain production in Oisha
 
Growth and Instability of OIlseeds Production in Odisha
Growth and Instability of OIlseeds Production in OdishaGrowth and Instability of OIlseeds Production in Odisha
Growth and Instability of OIlseeds Production in Odisha
 
Statistical thinking and development planning
Statistical thinking and development planningStatistical thinking and development planning
Statistical thinking and development planning
 
CAN PLEASURE BE A JOURNEY RATHER THAN A STOP?
CAN PLEASURE BE A JOURNEY RATHER THAN A STOP?CAN PLEASURE BE A JOURNEY RATHER THAN A STOP?
CAN PLEASURE BE A JOURNEY RATHER THAN A STOP?
 
Monitoring and Evaluation
Monitoring and EvaluationMonitoring and Evaluation
Monitoring and Evaluation
 
Anger Management
Anger ManagementAnger Management
Anger Management
 
A Simple Promise for Happiness
A Simple Promise for HappinessA Simple Promise for Happiness
A Simple Promise for Happiness
 
Measures of mortality
Measures of mortalityMeasures of mortality
Measures of mortality
 
Understanding data through presentation_contd
Understanding data through presentation_contdUnderstanding data through presentation_contd
Understanding data through presentation_contd
 
Nonparametric and Distribution- Free Statistics _contd
Nonparametric and Distribution- Free Statistics _contdNonparametric and Distribution- Free Statistics _contd
Nonparametric and Distribution- Free Statistics _contd
 
Nonparametric and Distribution- Free Statistics
Nonparametric and Distribution- Free Statistics Nonparametric and Distribution- Free Statistics
Nonparametric and Distribution- Free Statistics
 
Simple and Powerful promise for Peace and Happiness
Simple and Powerful promise for Peace and HappinessSimple and Powerful promise for Peace and Happiness
Simple and Powerful promise for Peace and Happiness
 
Understanding Stress, Stressors, Signs and Symptoms of Stress
Understanding Stress, Stressors, Signs and Symptoms of StressUnderstanding Stress, Stressors, Signs and Symptoms of Stress
Understanding Stress, Stressors, Signs and Symptoms of Stress
 
Chi sqr
Chi sqrChi sqr
Chi sqr
 
Simple linear regressionn and Correlation
Simple linear regressionn and CorrelationSimple linear regressionn and Correlation
Simple linear regressionn and Correlation
 
Hypothesis Testing
Hypothesis TestingHypothesis Testing
Hypothesis Testing
 
Inferential statistics-estimation
Inferential statistics-estimationInferential statistics-estimation
Inferential statistics-estimation
 
Sampling methods in medical research
Sampling methods in medical researchSampling methods in medical research
Sampling methods in medical research
 
Probability concept and Probability distribution_Contd
Probability concept and Probability distribution_ContdProbability concept and Probability distribution_Contd
Probability concept and Probability distribution_Contd
 
Probability concept and Probability distribution
Probability concept and Probability distributionProbability concept and Probability distribution
Probability concept and Probability distribution
 

Último

Beauty Amidst the Bytes_ Unearthing Unexpected Advantages of the Digital Wast...
Beauty Amidst the Bytes_ Unearthing Unexpected Advantages of the Digital Wast...Beauty Amidst the Bytes_ Unearthing Unexpected Advantages of the Digital Wast...
Beauty Amidst the Bytes_ Unearthing Unexpected Advantages of the Digital Wast...DhatriParmar
 
Narcotic and Non Narcotic Analgesic..pdf
Narcotic and Non Narcotic Analgesic..pdfNarcotic and Non Narcotic Analgesic..pdf
Narcotic and Non Narcotic Analgesic..pdfPrerana Jadhav
 
Scientific Writing :Research Discourse
Scientific  Writing :Research  DiscourseScientific  Writing :Research  Discourse
Scientific Writing :Research DiscourseAnita GoswamiGiri
 
Unraveling Hypertext_ Analyzing Postmodern Elements in Literature.pptx
Unraveling Hypertext_ Analyzing  Postmodern Elements in  Literature.pptxUnraveling Hypertext_ Analyzing  Postmodern Elements in  Literature.pptx
Unraveling Hypertext_ Analyzing Postmodern Elements in Literature.pptxDhatriParmar
 
Q-Factor HISPOL Quiz-6th April 2024, Quiz Club NITW
Q-Factor HISPOL Quiz-6th April 2024, Quiz Club NITWQ-Factor HISPOL Quiz-6th April 2024, Quiz Club NITW
Q-Factor HISPOL Quiz-6th April 2024, Quiz Club NITWQuiz Club NITW
 
Grade Three -ELLNA-REVIEWER-ENGLISH.pptx
Grade Three -ELLNA-REVIEWER-ENGLISH.pptxGrade Three -ELLNA-REVIEWER-ENGLISH.pptx
Grade Three -ELLNA-REVIEWER-ENGLISH.pptxkarenfajardo43
 
BIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptx
BIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptxBIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptx
BIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptxSayali Powar
 
Using Grammatical Signals Suitable to Patterns of Idea Development
Using Grammatical Signals Suitable to Patterns of Idea DevelopmentUsing Grammatical Signals Suitable to Patterns of Idea Development
Using Grammatical Signals Suitable to Patterns of Idea Developmentchesterberbo7
 
4.11.24 Poverty and Inequality in America.pptx
4.11.24 Poverty and Inequality in America.pptx4.11.24 Poverty and Inequality in America.pptx
4.11.24 Poverty and Inequality in America.pptxmary850239
 
Concurrency Control in Database Management system
Concurrency Control in Database Management systemConcurrency Control in Database Management system
Concurrency Control in Database Management systemChristalin Nelson
 
MS4 level being good citizen -imperative- (1) (1).pdf
MS4 level   being good citizen -imperative- (1) (1).pdfMS4 level   being good citizen -imperative- (1) (1).pdf
MS4 level being good citizen -imperative- (1) (1).pdfMr Bounab Samir
 
Reading and Writing Skills 11 quarter 4 melc 1
Reading and Writing Skills 11 quarter 4 melc 1Reading and Writing Skills 11 quarter 4 melc 1
Reading and Writing Skills 11 quarter 4 melc 1GloryAnnCastre1
 
ICS 2208 Lecture Slide Notes for Topic 6
ICS 2208 Lecture Slide Notes for Topic 6ICS 2208 Lecture Slide Notes for Topic 6
ICS 2208 Lecture Slide Notes for Topic 6Vanessa Camilleri
 
Expanded definition: technical and operational
Expanded definition: technical and operationalExpanded definition: technical and operational
Expanded definition: technical and operationalssuser3e220a
 
ICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdfICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdfVanessa Camilleri
 
Tree View Decoration Attribute in the Odoo 17
Tree View Decoration Attribute in the Odoo 17Tree View Decoration Attribute in the Odoo 17
Tree View Decoration Attribute in the Odoo 17Celine George
 
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfGrade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfJemuel Francisco
 

Último (20)

Beauty Amidst the Bytes_ Unearthing Unexpected Advantages of the Digital Wast...
Beauty Amidst the Bytes_ Unearthing Unexpected Advantages of the Digital Wast...Beauty Amidst the Bytes_ Unearthing Unexpected Advantages of the Digital Wast...
Beauty Amidst the Bytes_ Unearthing Unexpected Advantages of the Digital Wast...
 
prashanth updated resume 2024 for Teaching Profession
prashanth updated resume 2024 for Teaching Professionprashanth updated resume 2024 for Teaching Profession
prashanth updated resume 2024 for Teaching Profession
 
Narcotic and Non Narcotic Analgesic..pdf
Narcotic and Non Narcotic Analgesic..pdfNarcotic and Non Narcotic Analgesic..pdf
Narcotic and Non Narcotic Analgesic..pdf
 
Scientific Writing :Research Discourse
Scientific  Writing :Research  DiscourseScientific  Writing :Research  Discourse
Scientific Writing :Research Discourse
 
Unraveling Hypertext_ Analyzing Postmodern Elements in Literature.pptx
Unraveling Hypertext_ Analyzing  Postmodern Elements in  Literature.pptxUnraveling Hypertext_ Analyzing  Postmodern Elements in  Literature.pptx
Unraveling Hypertext_ Analyzing Postmodern Elements in Literature.pptx
 
Q-Factor HISPOL Quiz-6th April 2024, Quiz Club NITW
Q-Factor HISPOL Quiz-6th April 2024, Quiz Club NITWQ-Factor HISPOL Quiz-6th April 2024, Quiz Club NITW
Q-Factor HISPOL Quiz-6th April 2024, Quiz Club NITW
 
Grade Three -ELLNA-REVIEWER-ENGLISH.pptx
Grade Three -ELLNA-REVIEWER-ENGLISH.pptxGrade Three -ELLNA-REVIEWER-ENGLISH.pptx
Grade Three -ELLNA-REVIEWER-ENGLISH.pptx
 
BIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptx
BIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptxBIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptx
BIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptx
 
Using Grammatical Signals Suitable to Patterns of Idea Development
Using Grammatical Signals Suitable to Patterns of Idea DevelopmentUsing Grammatical Signals Suitable to Patterns of Idea Development
Using Grammatical Signals Suitable to Patterns of Idea Development
 
4.11.24 Poverty and Inequality in America.pptx
4.11.24 Poverty and Inequality in America.pptx4.11.24 Poverty and Inequality in America.pptx
4.11.24 Poverty and Inequality in America.pptx
 
Concurrency Control in Database Management system
Concurrency Control in Database Management systemConcurrency Control in Database Management system
Concurrency Control in Database Management system
 
Faculty Profile prashantha K EEE dept Sri Sairam college of Engineering
Faculty Profile prashantha K EEE dept Sri Sairam college of EngineeringFaculty Profile prashantha K EEE dept Sri Sairam college of Engineering
Faculty Profile prashantha K EEE dept Sri Sairam college of Engineering
 
MS4 level being good citizen -imperative- (1) (1).pdf
MS4 level   being good citizen -imperative- (1) (1).pdfMS4 level   being good citizen -imperative- (1) (1).pdf
MS4 level being good citizen -imperative- (1) (1).pdf
 
Reading and Writing Skills 11 quarter 4 melc 1
Reading and Writing Skills 11 quarter 4 melc 1Reading and Writing Skills 11 quarter 4 melc 1
Reading and Writing Skills 11 quarter 4 melc 1
 
ICS 2208 Lecture Slide Notes for Topic 6
ICS 2208 Lecture Slide Notes for Topic 6ICS 2208 Lecture Slide Notes for Topic 6
ICS 2208 Lecture Slide Notes for Topic 6
 
Expanded definition: technical and operational
Expanded definition: technical and operationalExpanded definition: technical and operational
Expanded definition: technical and operational
 
INCLUSIVE EDUCATION PRACTICES FOR TEACHERS AND TRAINERS.pptx
INCLUSIVE EDUCATION PRACTICES FOR TEACHERS AND TRAINERS.pptxINCLUSIVE EDUCATION PRACTICES FOR TEACHERS AND TRAINERS.pptx
INCLUSIVE EDUCATION PRACTICES FOR TEACHERS AND TRAINERS.pptx
 
ICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdfICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdf
 
Tree View Decoration Attribute in the Odoo 17
Tree View Decoration Attribute in the Odoo 17Tree View Decoration Attribute in the Odoo 17
Tree View Decoration Attribute in the Odoo 17
 
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfGrade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
 

Measures of central tendency

  • 1. Lecture Series on Biostatistics No. Biostat - 5 Date: 18.01.2009 MEASURES OF CENTRAL TENDENCY Dr. Bijaya Bhusan Nanda, M. Sc (Gold Medalist) Ph. D. (Stat.) Topper Orissa Statistics & Economics Services, 1988 bijayabnanda@yahoo.com
  • 2. CONTENTS  Descriptive Measures  Measure of Central Tendency (CT) Concept and Definition Mean Median Mode  Uses of Different Measures of CT  Advantages and Disadvantages of Different Measures of CT
  • 3. Enabling Objectives  To equip the trainee with skills to manipulate the data in the form of numbers that they encounter as a health sciences professional.  The better able they are to manipulate such data, better understanding they will have of the environment and forces that generate these data.
  • 4. Descriptive Statistics  Classification and Tabulation of Data  Presentation of Data  Measure of Central Tendency (CT)  Measure of Shape  Measure of Dispersion
  • 5. Measures of Central Tendency Concept  Data, in nature, has a tendency to cluster around a central Value.  That central value condenses the large mass of data into a single representative figure.  The Central Value can be obtained from sample values (Called Statistics) and Population observations (Called Parameters)
  • 6. Measures of Central Tendency Definition A measure of central tendency is a typical value around which other figures congregate. Simpson and Kofka (Statistician) Average is an attempt to find an single figure to describe a group of figures. Clark and Schakade (Statistician)
  • 7. Different Measure of Central Tendency (MCT) 1. Mathematical Average  Arithmetic Mean Simply Mean  Geometric Mean  Harmonic Mean 1. Positional Average  Median  Mode 1. Mean, Median and Mode are the Most Commonly used MCT in Health Science
  • 8. Characteristics of an Ideal MCT 1. It should be rigidly defined so that different persons may not interpret it differently. 2. It should be easy to understand and easy to calculate. 3. It should be based on all the observations of the data. 4. It should be easily subjected to further mathematical treatment. 5. It should be least affected by the sampling fluctuation . 6. It should not be unduly affected by the extreme values.
  • 9. Characteristics of Mean  Most representative figure for the entire mass of data  Tells the point about which items have a tendency to cluster  Unduly affected by extreme items (very sensitive to extreme values)  The positive deviations must balance the negative deviations (The sum of the deviations of individual values of observation from the mean will always add up to zero)  The sum of squares of the deviations about the mean is minimum
  • 10. Arithmetic Mean (AM) or Mean AM is obtained by summing up all the observations and
  • 11. Individual data: X1, X2, X3, X4, …………Xn (n = Total No. of observation) X1 + X2 + X3 +….. Mean = n Example: Find the mean size of Tuberculin test of 10 boys measured in millimeters. 3,5,7,7,8,8,9,10,11,12. 3 + 5 + 7 +7 + 8 + 8 + 9 + 10 + 11 + 12 Mean = 10 = 80 / 10 = 8 mm
  • 12. Discrete frequency distribution Let x1,x2,x3…… xn be the variate values and f1,f2,f3….fn be their corresponding frequencies, then their mean is given by x1.f1 + x2.f2 +x3.f3+……+xn.fn Mean = f1+f2+f3+….+fn Example: Find mean days of Confinement after delivery in the following series. Days of No. of Total days of confinement of Confinement ( X) patients (f) each group. Xf Ans. Mean days of 6 5 30 confinement 7 4 28 = 137 / 18 8 4 32 = 7.61 9 3 27 10 2 20 Total 18 137
  • 13. Grouped frequency distribution Let m1, m2,…….mn be the mid-point of the class and f1, f2,……….fn be the corresponding frequency Direct method: Σ fi mi Mean: X = Σ fi , Where mi refers to the mid point of the ith class fi frequency of the ith class and Σfi = N Shortcut method: Σfidi Mean = A + N Where, A is the assumed mean and di = mi-A.
  • 14. Example: Calculate overall fatality rate in smallpox from the age wise fatality rate given below. Age Gr in 0-1 2-4 5-9 Above 9 yr No. of Cases 150 304 421 170 Fatality rate 35.33 21.38 16.86 14.17
  • 15. Solution : Mean fatality rate= ∑fx = 21305.98 =20.39 ∑f 1045 Age Gr. in No. of Cases Fatality Rate fx year (f) (x) 0-1 150 35.33 5299.5 2-4 304 21.38 6499.52 5-9 421 16.86 7098.06 Above 9 170 14.17 2408.9 Total 1045 87.74 21305.98
  • 16. Problem: Find the mean weight of 470 infants born in a hospital in one year from following table. Weight 2.0-2.4 2.5-2.9 3.0-3.4 3.5-3.9 4.0-4.4 4.5+ in Kg No. of 17 97 187 135 28 6 infant
  • 17. Solution : Weight Continuo Mid di = mi- f fd Sum of in Kg. us CI in Value A fd X kg. 2.0-2.4 1.95-2.45 2.2 -1.0 17 -17 -65.5 2.5-2.9 2.45-2.95 2.7 -0.5 97 -48.5 3.0-3.4 2.95-3.45 3.2 0 187 0 3.5-3.9 3.45-3.85 3.7 0.5 135 67.5 +104.5 4.0-4.4 3.95-4.45 4.2 1 28 28 4.5+ 4.45+ 4.7 1.5 6 9 Total 470 +39 Mean Weight = w + (Σ fx / Σ f ) Mean Weight = 3.2 +( 39 / 470 ) = 3.28 Kg
  • 18. Step deviation method Ex:- Calculation of the mean from a frequency distribution of weights of 265 male students at the university of Washington CI-Wt) Cont. CI f d fd X-A d= 90 - 99 89.5 – 99.5 1 -5 -5 h 100 – 109 99.5 – 109.5 1 -4 -4 Σ fd 110 – 119 109.5 – 119.5 9 -3 -27 X= A + Xh 120 – 129 119.5 – 129.5 30 -2 -60 N 130 – 139 129.5 – 139.5 42 -1 -42 99 X= 145 + x 10 140 – 149 139.5 – 149.5 66 0 0 265 150 – 159 149.5 – 159.5 47 1 47 = 145 + (0.3736)( 10 ) 160 – 169 159.5 – 169.5 39 2 78 170 – 179 169.5 – 179.5 15 3 45 = 145 +0 3.74 180 – 189 179.5 – 189.5 11 4 44 = 148.74 190 -199 189.5 -199.5 1 5 5 200 – 209 199.5 – 209.5 3 6 18 Total 265 99 N=265
  • 19. Merits, Demerits and Uses of Mean Merits of Mean: 1. It can be easily calculated. 2. Its calculation is based on all the observations. 3. It is easy to understand. 4. It is rigidly defined by the mathematical formula. 5. It is least affected by sampling fluctuations. 6. It is the best measure to compare two or more series of data. 7. It does not depend upon any position.
  • 20. Demerits of Mean : 1. It may not be represented in actual data so it is theoretical. 2. It is affected by extreme values. 3. It can not be calculated if all the observations are not known. 4. It can not be used for qualitative data i.e. love, beauty , honesty, etc. 5. It may lead to fallacious conditions in the absence of original observations. Uses of Mean : 1. It is extremely used in medical statistics. 2. Estimates are always obtained by mean.
  • 21. Median Definition: Median is defined as the middle most or the central value of the variable in a set of observations, when the observations are arranged either in ascending or in descending order of their magnitudes. It divides the series into two equal parts. It is a positional average, whereas the mean is the calculated average.
  • 22. Characteristic of Median  A positional value of the variable which divides the distribution into two equal parts, i.e., the median is a value that divides the set of observations into two halves so that one half of observations are less than or equal to it and the other half are greater than or equal to it.  Extreme items do not affect median and is specially useful in open ended frequencies.  For discrete data, mean and median do not change if all the measurements are multiplied by the same positive number and the result divided later by the same
  • 23. Computation of Median For Individual data series:  Arrange the values of X in ascending or descending order.  When the number of observations N, is odd, the middle most value –i.e. the (N+1)/2th value in the arrangement will be the median.  When N is even, the A.M of N/2th and (N+1)/2th values of the variable will give the median.
  • 24. Example: ESRs of 7 subjects are 3,4,5,6,4,7,5. Find the median. Ans: Let us arrange the values in ascending order. 3,4,4,5,5,6,7. The 4th observation i.e. 5 is the median in this series. Example: ESRs of 8 subjects are 3,4,5,6,4,7,6,7. Find the median. Ans: Let us arrange the values in ascending order. 3,4,4,5,6,6,7,7. In this series the median is the a.m of 4th and 5th Observations
  • 25. Discrete Data Series  Arrange the value of the variable in ascending or descending order of magnitude. Find out the cumulative frequencies (c.f.). Since median is the size of (N+1)/2th. Item, look at the cumulative frequency column find that c.f. which is either equal to (N+1)/2 or next higher to that and value of the variable corresponding to it is the median.
  • 26. Grouped frequency distribution  N/2 is used as the rank of the median instead of (N+1)/2. The formula for calculating median is given as:  Median = L+ [(N/2-c.f)/f] × I Where L = Lower limit of the median class i.e. the class in which the middle item of the distribution lies. c.f = Cumulative frequency of the class preceding the media n class f = Frequency of the median class. I= class interval of the median class.
  • 27. Median for grouped or interval data Example : Calculation of the median (x). data N/2-Cf represent weights of 265 male students at the X= L + ×I university of Washington f Class – interval f Cumulative frequency 132.5-83 X= 140 + x 10 ( Weight) f “less than” 66 90 - 99 1 1 49.5 = 140 + x 10 100 – 109 1 2 66 110 – 119 9 11 = 140 + (0.750) ( 10 ) 120 – 129 30 41 130 – 139 42 83 = 140 + 7.50 140 – 149 66 149 = 147.5 150 – 159 47 196 160 – 169 39 235 170 – 179 15 250 180 – 189 11 261 190 -199 1 262 200 - 209 3 265 N = 265 N/2=132.5
  • 28. Mode Definition: Mode is defined as the most frequently occurring measurement in a set of observations, or a measurement of relatively great concentration, for some frequency distributions may have more than one such point of concentration, even though these concentration might not contain precisely the same frequencies.
  • 29. Characteristics of Mode  Mode of a categorical or a discrete numerical variable is that value of the variable which occurs maximum number of times and for a continuous variable it is the value around which the series has maximum concentration.  The mode does not necessarily describe the ‘most’( for example, more than 50 %) of the cases  Like median, mode is also a positional average. Hence mode is useful in eliminating the effect of extreme variations and to study popular (highest occurring) case (used in qualitative data)  The mode is usually a good indicator of the centre of the data only if there is one dominating frequency.  Mode not amenable for algebraic treatment (like median or mean)  Median lies between mean & mode.  For normal distribution, mean, median and mode are equal (one and the same)
  • 30. Discrete Data Series In case of discrete series, quite often the mode can be determined by closely looking at the data. Example. Find the mode - Quantity of glucose (mg%) in First we arrange this data blood of 25 students set in the ascending order 70 88 95 101 106 and find the frequency. 79 93 96 101 107 83 93 97 103 108 86 95 97 103 112 87 95 98 106 115
  • 31. Quantity of Frequency Quantity of Frequency glucose (mg%) glucose (mg%) in in blood blood 70 1 97 2 79 1 98 1 83 1 101 2 86 1 103 2 87 1 106 2 88 1 107 1 93 3 108 1 95 2 112 1 96 1 115 1 This data set contains 25 observations. We see that, the value of 93 is repeated most often. Therefore, the mode of the data set is 93.
  • 32. Grouped frequency distribution Calculation of Mode: {(f1 – f0) Mode= L + x I [(f1 – f0) + (f1 – f2 )] L = lower limit of the modal class i.e.- the class containing mode f1 = Frequency of modal class f0 = Frequency of the pre-modal class f2 =Frequency of the post modal class I = Class length of the modal class
  • 33. Example: Find the value of the mode from the data given below Weight in Kg Exclusive No of Weight in No of CI students Kg students 93-97 92.5-97.5 2 113-117 14 98-102 97.5-102.5 5 118-122 6 103-107 102.5-107.5 12 123-127 3 108-112 107.5-112.5 17 128-131 1 Solution: By inspection mode lies in the class 108-112. But the real limit of this class is 107.5-112.5 {(f1 – f0) Mode= L + x I = 110.63 [(f1 – f0) + (f1 – fx )] Where, L = 107.5,f1 = 17, f0 = 12, f2 = 14, I = 5
  • 34. Merits, demerits and use of mode Merits:  Mode is readily comprehensible and easy to calculate.  Mode is not at all affected by extreme values,  Mode can be conveniently located even if the frequency distribution has class-intervals of unequal magnitude provided the modal class and the classes preceding and succeeding it are of the same magnitude,
  • 35. Demerits:  Mode is ill-defined. Not always possible to find a clearly defined mode.  It is not based upon all the observations,  It is not amenable to further mathematical treatment  As compared with mean, mode is affected to a great extent by fluctuations of sampling,  When data sets contain two, three, or many modes, they are difficult to interpret and compare. Uses Mode is useful for qualitative data.
  • 36. Comparison of mean, median and mode  The mean is rigidly defined. So is the median, and so is the mode except when there are more than one value with the highest frequency density.  The computation of the three measures of central tendency involves equal amount of labour. But in case of continuous distribution, the calculation of exact mode is impossible when few observations are given. Even though a large number of observations grouped into a frequency distribution are available, the mode is difficult to determine. The present formula discussed here for computation of mode suffers from the drawback that it assumes uniform distribution of frequency in the modal class. As regards median, when the number of observations are even it is determined approximately as the midpoint of two middle items. The value of the mode as well as the median can be determined graphically where as the A.M cannot be determined graphically.  The A.M is based upon all the observations. But the median and, so also, mode are not based on all the observations
  • 37. Contd……  When the A.Ms for two or more series of the variables and the number of observations in them are available, the A.M of combined series can be computed directly from the A.M of the individual series. But the median or mode of the combined series cannot be computed from the median or mode of the individual series.  The A.M median and mode are easily comprehensible.  When the data contains few extreme values i.e. very high or small values, the A.M is not representative of the series. In such case the median is more appropriate.  Of the three measures, the mean is generally the one that is least affected by sampling fluctuations, although in some particular situations the median or the mode may be superior in this respect.  In case of open end distribution there is no difficulty in calculating the median or mode. But, mean cannot be computed in this case.  Thus it is evident from the above comparison, by and large the A.M is the best measure of central tendency.
  • 38. REFERENCE  Applied Biostatistical Analysis, W.W. Daniel  Biostatistical Analysis, J.S. Zar  Mathematical Statistics and data analysis, John. A. Rice  Fundamentals of Applied Statistics, S.C Gupta and V. K Kapoor.  Fundamental Mathematical Statistics, S.C. Gupta, V.K. Kapoor
  • 40. Outlier  An observation (or measurement) that is unusually large or small relative to the other values in a data set is called an outlier. Outliers typically are attributable to one of the following causes:  The measurement is observed, recorded, or entered into the computer incorrectly.  The measurements come from a different population.  The measurement is correct, but represents a rare event.