Tests of statistical significance : chi square and spss

TESTS OF STATISTICAL SIGNIFICANCE:
CHI SQUARE & SPSS
- DR. SNEHA
II YEAR POST GRADUATE
DEPARTMENT OF COMMUNITY MEDICINE

CONTENT
• P VALUE & HYPOTHESIS TESTING
• TESTS OF STATISTICAL SIGNIFICANCE
• WHAT IS CHI SQUARE?
• WHY TO USE CHI SQUARE?
• HOW TO CALCULATE MANUALLY?
• WHAT ARE THE VARIANTS OF CHI SQUARE?
• CHI SQUARE IN SPSS
• ANALYSIS IN CROSS SECTIONAL STUDIES
• SUMMARY

FIRST, LET'S LOOK AT RANDOM ERRORS ...

RANDOM ERRORS
• VARIABILITY IN A MEASURE DUE TO CHANCE.
• NOT DUE TO IMPROPER STUDY DESIGN OR DATA COLLECTION (THAT IS SYSTEMATIC ERROR).
• RANDOM ERROR = THREAT TO PRECISION.
• SYSTEMATIC ERRORS AFFECT THE VALIDITY OF THE STUDY.

SOURCES OF RANDOM ERROR
1. MEASUREMENT ERROR:
• Solution: Take the average of serial measurements instead of a single reading.

2. SAMPLING VARIATION
• Also known as sampling variability, sampling error or sampling fluctuation.
• Major source of random error.
• Occurs when the estimate differs from sample to sample, or between the sample and the population.
• Measured by the standard error (SE).
• Mainly occurs due to small sample size.
• A small sample is less likely to be representative of the population.
• Solution: increase the sample size.
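
To see why a larger sample reduces sampling variation, here is a minimal sketch of the standard error of a proportion, SE = sqrt(p(1-p)/n). The 12% prevalence and the sample sizes are illustrative assumptions, and the function name is hypothetical.

```python
import math

def standard_error_of_proportion(p, n):
    """Standard error of a sample proportion p estimated from n subjects."""
    return math.sqrt(p * (1 - p) / n)

# Assumed prevalence of 12%; sample sizes chosen only for illustration.
for n in (100, 400, 1600):
    se = standard_error_of_proportion(0.12, n)
    print(f"n = {n:5d}   SE = {se:.4f}")
```

Each fourfold increase in sample size halves the standard error, so the estimate becomes more precise but the random error never reaches zero.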

CHARACTERISTICS OF RANDOM ERROR
• Unpredictable
• Cannot be eliminated/ avoided
• CAN BE REDUCED !

HOW DO I DEAL WITH THE DIFFERENCE IN ESTIMATES?
ANS: CHECK THE SIGNIFICANCE OF THE DIFFERENCE IN THE ESTIMATES

TEST OF STATISTICAL SIGNIFICANCE
• Mathematical models by which the p value is found.
• P value: probability of observing a difference in estimates at least as large as the one seen if chance alone were operating (that is, if there were truly no difference).
• The cut-off for the p value (level of significance) is arbitrary.
• Usually fixed at 5% (0.05).

P value < 0.05
• The observed difference is unlikely to be due to chance alone; the difference is statistically significant.
P value > 0.05
• The observed difference could plausibly be due to chance alone; the difference is not statistically significant.

Eg: In a rural study, the prevalence of hypertension is 12% in males and 9% in females. Is the difference in prevalence significant?
Ans: Do a statistical test -> check the p value.
If the p value is <0.05, the difference in prevalence of hypertension between males and females is statistically significant: males have a higher prevalence of hypertension than females.

LEVELS OF SIGNIFICANCE
If the p value is:
• ≤0.05: significant
• <0.01: very significant
• <0.001: highly significant

NULL & ALTERNATE HYPOTHESIS
NULL HYPOTHESIS:
• H0
• No difference between two groups
• No association between two variables
ALTERNATE HYPOTHESIS:
• H1
• There is a difference between two groups
• There is an association between two variables

HOW TO TEST THE HYPOTHESIS?
• Using the p value (statistical tests)
• Either reject H0 or fail to reject H0 (loosely called "accepting" H0)
• Failing to reject H0: no difference shown
• Rejecting H0: difference present
• P value <0.05: reject H0 and accept H1
• P value >0.05: fail to reject H0

TWO/ ONE TAILED HYPOTHESIS
• Alternate hypothesis can be one or two tailed
• Eg: one tailed: hypertension is more common in males than in females
• Eg: two tailed: there is a difference in hypertension prevalence between males and females
• Usually two tailed is preferred

ERRORS WHILE TESTING THE HYPOTHESIS
• TYPE I : alpha error
• TYPE II: beta error

TYPE 1 ERROR
• Probability of finding an association when none really exists.
• Gives a false positive result.
• Occurs when H0 is rejected although it (H0) is actually true.
• False rejection of H0.
• Its probability is alpha, the level of significance against which the p value is compared.
• Accepted alpha error is 0.05 or 5%.
• SERIOUS ERROR

WHY IS TYPE 1 A SERIOUS ERROR?
• Let's assume a study on the effectiveness of metformin use in PCOS patients.
• H0= no association between metformin use and prognosis of PCOS
• H1= metformin usage improves the prognosis of PCOS.
• TYPE 1 ERROR: falsely rejecting H0

CONSEQUENCES OF TYPE 1 ERROR
• Increased burden for the patient.
• Unnecessary use of metformin when it actually has no effect on prognosis.
• Economic burden for patient
• Side effects of drugs.
• Major burden to health care system.

TYPE 2 ERROR
• Probability of not finding an association when one actually exists.
• Failing to reject H0, when it is actually false.
• Measured by beta level
• Accepted beta error is 0.2 or 20%

• (1-beta)= power of the study
• Power of the study is largely affected by sample size
• Inadequate sample size: inadequate power to detect real association.
• Whenever a study with a small sample size finds no association, the finding is inconclusive because the power is low.
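
The link between sample size and power can be sketched with a normal approximation for comparing two proportions. This is only an approximation under stated assumptions: equal group sizes, a two-sided alpha of 0.05, and prevalences of 12% and 9% borrowed from the hypertension example. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def approx_power_two_proportions(p1, p2, n_per_group, alpha=0.05):
    """Approximate power (1 - beta) of a two-sided z test for two proportions.

    Normal approximation that ignores the negligible far-tail rejection region.
    """
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    se = sqrt(p1 * (1 - p1) / n_per_group + p2 * (1 - p2) / n_per_group)
    return NormalDist().cdf(abs(p1 - p2) / se - z_crit)

# Assumed prevalences (12% vs 9%); group sizes chosen only for illustration.
for n in (100, 500, 2000):
    print(f"n per group = {n:5d}   power = {approx_power_two_proportions(0.12, 0.09, n):.2f}")
```

With 100 subjects per group the power falls well below the usual 80% target, so a "no association" result from such a study would be inconclusive.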

BOTH TYPE 1 AND TYPE 2 ERRORS ARE IMPORTANT, BUT EPIDEMIOLOGISTS ARE MORE CONCERNED WITH TYPE 1 ERROR.

TYPES OF STATISTICAL TESTS
PARAMETRIC
• Normal distribution of data
• Used only for quantitative data
• More powerful
NON PARAMETRIC
• Data not normally distributed
• Can be used for both qualitative & quantitative data
• Less powerful

QUALITATIVE VARIABLES
TO FIND THE ASSOCIATION BETWEEN TWO OR MORE GROUPS:
CHI SQUARE TEST

QUANTITATIVE VARIABLE
TWO GROUPS:
Normal distribution = unpaired t test
Skewed distribution = Mann-Whitney U test
BEFORE AFTER STUDY:
Normal distribution = paired t test
Skewed distribution = Wilcoxon signed-rank test

QUANTITATIVE CONTINUED
THREE OR MORE GROUPS:
Normal distribution = one way ANOVA
Skewed distribution = Kruskal-Wallis test
THREE OR MORE GROUPS (REPEATED MEASURES):
Normal distribution = repeated measures ANOVA
Skewed distribution = Friedman test

WHAT IS CHI SQUARE TEST ?
• Non parametric test
• Developed by Karl Pearson
• Used for qualitative data analysis
• Can be used even if the sample size is less than 30.

WHY TO USE CHI SQUARE TESTS?
1. To test the difference in proportions between 2/more groups.
2. To test the association between variables

STEPS OF CHI SQUARE TEST
Step 1
• Make a contingency table
• Calculate the degrees of freedom
• Calculate the expected frequency for each cell: E = (row total × column total) / grand total
Step 2
• Take the difference between the observed and expected value for each cell and square the difference
Step 3
• Divide the squared difference by the expected frequency of the respective cell.
• Add the values from all cells to get χ²
• Compare the χ² value with the table values for the given df
• Choose the corresponding p value

LET'S DO A CHI SQUARE TEST MANUALLY ...

Eg: 2 x 2 TABLE CONSTRUCTED
                ATTACKED BY DISEASE   NOT ATTACKED   TOTAL
VACCINATED      O=10                  O=90           100
UNVACCINATED    O=26                  O=74           100
TOTAL           36                    164            200

DEGREES OF FREEDOM CALCULATED
• df = (C-1) × (R-1), where C = number of columns and R = number of rows
• 2 x 2 table: df = (2-1) × (2-1) = 1

CHI SQUARE VALUE = 8.672
DF = 1
P VALUE ?
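
The manual steps can be checked with a short sketch that rebuilds the expected frequencies from the row and column totals of the vaccination table. The variable names are illustrative, not part of any library.

```python
# Observed counts: rows = vaccinated, unvaccinated; columns = attacked, not attacked.
observed = [[10, 90],
            [26, 74]]

row_totals = [sum(row) for row in observed]        # [100, 100]
col_totals = [sum(col) for col in zip(*observed)]  # [36, 164]
grand_total = sum(row_totals)                      # 200

# Degrees of freedom: (rows - 1) x (columns - 1)
df = (len(observed) - 1) * (len(observed[0]) - 1)

chi_square = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / grand_total  # E = RT * CT / GT
        chi_square += (o - e) ** 2 / e                   # (O - E)^2 / E

print(f"chi-square = {chi_square:.3f}, df = {df}")
```

The expected frequencies are 18 and 82 in each row, and the four cell contributions add up to a chi-square value of about 8.67 with df = 1.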

EASY FORMULA !
• Used only for 2x2 table
• Df must be 1.
• n = grand total

$$\chi^2 = \frac{n\,(ad - bc)^2}{(a+b)(c+d)(a+c)(b+d)}$$

              Disease +   Disease -
Exposure +    a           b
Exposure -    c           d
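
As a quick check of the shortcut formula, here is a minimal sketch applied to the vaccination table (a=10, b=90, c=26, d=74). The function name is hypothetical.

```python
def chi_square_2x2(a, b, c, d):
    """Shortcut chi-square for a 2x2 table: n(ad - bc)^2 / ((a+b)(c+d)(a+c)(b+d))."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator

print(f"{chi_square_2x2(10, 90, 26, 74):.3f}")
```

The result agrees with the cell-by-cell calculation (about 8.67), which is expected because the two formulas are algebraically equivalent for a 2x2 table.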

Common χ² values and p values with df 1
• Only for 2x2 table
χ² value      df   P value
0.00-3.83     1    >0.05
3.84-6.62     1    ≤0.05
6.63-10.82    1    ≤0.01
≥10.83        1    ≤0.001
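
The lookup table can be expressed as a small function, alongside an exact p value for df = 1. The exact value uses the identity that a chi-square variable with 1 degree of freedom is the square of a standard normal variable, so p = erfc(sqrt(χ²/2)). This is a sketch valid only for df = 1, and the function names are hypothetical.

```python
from math import erfc, sqrt

def p_value_df1(chi_square):
    """Exact upper-tail p value for a chi-square statistic with df = 1 only."""
    return erfc(sqrt(chi_square / 2))

def significance_band(chi_square):
    """Read the p value band from the df = 1 table above."""
    if chi_square >= 10.83:
        return "p <= 0.001"
    if chi_square >= 6.63:
        return "p <= 0.01"
    if chi_square >= 3.84:
        return "p <= 0.05"
    return "p > 0.05"

# Chi-square value from the vaccination example.
print(significance_band(8.67), f"(exact p = {p_value_df1(8.67):.4f})")
```

For the vaccination example the chi-square value of about 8.67 falls in the 6.63-10.82 band, so the difference is significant at the 1% level.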

EXCEPTIONS IN CHI SQUARE
1. YATES CORRECTION: applied when any cell has an expected value between 5 and 10. Applicable only for a 2x2 table.
2. FISHER EXACT TEST: used if any expected value is less than 5. Applicable only for a 2x2 table.
3. For tables larger than 2x2, if an expected value is less than 5, small frequencies can be pooled (combined) with those of the adjacent group or class in the table.
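
Both 2x2 alternatives can be sketched with the standard library. The Yates version subtracts n/2 from |ad - bc| before squaring; the Fisher version sums the hypergeometric probabilities of all tables that are no more likely than the observed one (one common two-sided convention, assumed here). Function names are hypothetical, and the small table in the usage lines is invented for illustration.

```python
from math import comb

def yates_chi_square_2x2(a, b, c, d):
    """Chi-square for a 2x2 table with Yates continuity correction."""
    n = a + b + c + d
    corrected = max(0.0, abs(a * d - b * c) - n / 2)
    return n * corrected ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p value for a 2x2 table with fixed margins."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Small tolerance guards against floating point ties.
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-9))

# Illustrative small table (invented counts): every expected value is 2.
print(f"Fisher p = {fisher_exact_two_sided(3, 1, 1, 3):.4f}")
# Yates correction on the vaccination table, for comparison only.
print(f"Yates chi-square = {yates_chi_square_2x2(10, 90, 26, 74):.3f}")
```

In the vaccination table the smallest expected value is 18, so by the rules above neither correction is required there; the Yates figure is shown only to illustrate how the correction lowers the statistic.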

LET'S DO CHI SQUARE IN SPSS ...

Step 5: Also select the "Expected" option (under Counts in the Cells dialog)

HOW TO INTERPRET THE OUTPUT?
• The proportion of cells with an expected count less than 5 should be below 20%.
• If it is more than 20% with df = 1, use the Fisher exact test.
• The software gives the p value for the Fisher exact test directly, without a χ² value.
• If it is more than 20% with df > 1, use the likelihood ratio result.
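
The reading rule can be written as a small decision helper. This is a sketch of the rule as stated on the slide, not of anything SPSS does internally; the function name is hypothetical, and treating exactly 20% as failing the condition is an assumption.

```python
def which_result_to_read(observed):
    """Pick the output row to read, following the 20% expected-count rule."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    expected = [rt * ct / grand_total for rt in row_totals for ct in col_totals]
    percent_small = 100 * sum(1 for e in expected if e < 5) / len(expected)
    df = (len(row_totals) - 1) * (len(col_totals) - 1)

    if percent_small < 20:  # assumption: exactly 20% counts as failing
        return "Pearson chi-square"
    if df == 1:
        return "Fisher exact test"
    return "Likelihood ratio"

print(which_result_to_read([[10, 90], [26, 74]]))      # vaccination table
print(which_result_to_read([[3, 1], [1, 3]]))          # invented small 2x2 table
print(which_result_to_read([[1, 2], [2, 1], [3, 3]]))  # invented small 3x2 table
```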

ANALYSIS FOR CROSS SECTIONAL STUDIES
1. PREVALENCE RATIO = Pe / Pue (prevalence in exposed / prevalence in unexposed)
2. PREVALENCE ODDS RATIO = ad / bc
3. CHI SQUARE
4. Z TEST FOR PROPORTIONS:
z = observed difference / SE of difference between 2 proportions
$$z = \frac{P_1 - P_2}{\sqrt{\frac{P_1 Q_1}{N_1} + \frac{P_2 Q_2}{N_2}}}, \quad Q = 1 - P$$
If z value is >1.96: significant at 5%
If z value is >2.58: significant at 1%
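
A minimal sketch of the three measures, using the a, b, c, d layout of the 2x2 table. The counts are borrowed from the vaccination example purely to have numbers to work with; that example is not presented as a cross sectional study, and the function names are hypothetical.

```python
from math import sqrt

def prevalence_ratio(a, b, c, d):
    """Prevalence in exposed (a / (a+b)) divided by prevalence in unexposed (c / (c+d))."""
    return (a / (a + b)) / (c / (c + d))

def prevalence_odds_ratio(a, b, c, d):
    """Cross-product ratio ad / bc."""
    return (a * d) / (b * c)

def z_test_two_proportions(p1, n1, p2, n2):
    """z = (P1 - P2) / sqrt(P1*Q1/N1 + P2*Q2/N2), with Q = 1 - P."""
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) / se

a, b, c, d = 10, 90, 26, 74  # illustrative counts only
pr = prevalence_ratio(a, b, c, d)
por = prevalence_odds_ratio(a, b, c, d)
z = z_test_two_proportions(a / (a + b), a + b, c / (c + d), c + d)

print(f"PR = {pr:.2f}, POR = {por:.2f}, z = {z:.2f}")
print("significant at 1%" if abs(z) > 2.58 else
      "significant at 5%" if abs(z) > 1.96 else "not significant")
```

The sign of z only shows which group has the higher proportion; the absolute value is what is compared with 1.96 and 2.58.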

SUMMARY
• Random errors are unavoidable but can be reduced
• Alpha error = type 1 error; the p value is compared against alpha, usually 5%
• Beta error = type 2 error= usually 20%
• Power = 1-beta error
• When association is seen -> check the p value
• When association is not seen -> check for the power of the study.
• The choice between parametric and non parametric tests depends on whether the data are normally distributed.
