ACHARYA NARENDRA DEVA UNIVERSITY OF AGRICULTURE &
TECHNOLOGY, KUMARGANJ, AYODHYA (U.P.) 224229
Assignment
on
Chi-square test
Course No : STAT-502 4(3+1)
Course name : Statistical methods for applied sciences
Presented to : Dr. Vishal Mehta, Assistant Professor,
Department of Agril. Statistics
Presented by : Vikas Yadav, Id. No. A-11153/19/22,
Ph. D. 1st Semester, Soil Science and Agril. Chemistry
Content:
• Introduction
• Properties of Chi-square test
• Limitations of Chi-square test
• Types of Chi-square test
1. Chi-square test for goodness of fit
2. Chi-square test for independence
• Chi-square test for goodness of fit: Example
• Chi-square test for independence: Example
• References
Introduction:
• The chi-square (χ²) test is a statistical method used to
determine whether there is a significant difference
between the observed and expected frequencies in one
or more categories. It is commonly used in fields such
as medical research, social sciences, and business to
test goodness of fit, independence, and association.
Properties
The chi-square distribution, on which the test is based, has the
following important properties:
1. The variance is equal to twice the number of degrees of
freedom.
2. The chi-square distribution curve approaches the normal
distribution as the number of degrees of freedom increases.
3. The mean of the distribution is equal to the number of degrees
of freedom.
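The mean and variance properties can be checked numerically. The following is a minimal sketch, assuming only the standard definition of a chi-square variable as a sum of squared standard normal draws; the function name `simulate_chi_square`, the choice of 6 degrees of freedom, and the number of draws are illustrative assumptions, not part of the original slides.

```python
import random
import statistics


def simulate_chi_square(df, n_draws, seed=0):
    """Draw chi-square variates as sums of `df` squared standard normals."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))
        for _ in range(n_draws)
    ]


df = 6  # illustrative choice of degrees of freedom
draws = simulate_chi_square(df, n_draws=20000)

sample_mean = statistics.fmean(draws)         # theory: df
sample_variance = statistics.variance(draws)  # theory: 2 * df

print(f"mean     ~ {sample_mean:.2f} (theory {df})")
print(f"variance ~ {sample_variance:.2f} (theory {2 * df})")
```

Because the values are simulated, the sample mean and variance only approximate the theoretical values; the agreement tightens as the number of draws grows.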
Limitations of Chi-Square Test
There are two limitations of the chi-square test that you should
be aware of.
• The chi-square test is extremely sensitive to sample size.
Even a very weak relationship can appear statistically
significant when a large enough sample is used. Keep in mind
that "statistically significant" does not always imply
"meaningful" when using the chi-square test.
• The chi-square test can only determine whether two variables
are related. It does not follow that one variable has a causal
relationship with the other. Establishing causality requires
a more detailed analysis.
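The sensitivity to sample size can be illustrated with a small sketch. The counts below are hypothetical (they do not come from the slides): the same 52% / 48% split is tested against a 50/50 hypothesis at two sample sizes, and 3.841 is the standard 5% critical value for 1 degree of freedom.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


CRITICAL_5PCT_DF1 = 3.841  # 5% critical value, 1 degree of freedom

# Hypothetical data: identical proportions, different sample sizes.
small_sample = [52, 48]      # n = 100
large_sample = [5200, 4800]  # n = 10,000

stat_small = chi_square_statistic(small_sample, [50, 50])
stat_large = chi_square_statistic(large_sample, [5000, 5000])

for label, stat in (("n = 100", stat_small), ("n = 10,000", stat_large)):
    verdict = "significant" if stat > CRITICAL_5PCT_DF1 else "not significant"
    print(f"{label}: chi-square = {stat:.2f} ({verdict} at 5%)")
```

Multiplying every count by 100 multiplies the statistic by 100, so the same 2-point deviation moves from clearly non-significant to clearly significant.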
Types of chi-square test
1. Chi-square test for goodness of fit
2. Chi-square test for independence
Chi-square test for goodness of fit:
• Chi-square test for goodness of fit is used to determine
whether the observed data follow a certain distribution. For
example, if we want to know whether the observed data
follow a normal distribution or not, we can use the chi-square
test for goodness of fit. The test compares the observed
frequencies with the expected frequencies based on the null
hypothesis.
Chi-Square Goodness of Fit Test: Formula
• A Chi-Square goodness of fit test uses the following null and
alternative hypotheses
• H0: (null hypothesis) A variable follows a hypothesized
distribution.
• H1: (alternative hypothesis) A variable does not follow a
hypothesized distribution.
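We use the following formula to calculate the Chi-Square test
statistic $\chi^2$:
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
where:
$\sum$: the sum over all categories
$O$: observed value (count) in a category
$E$: expected value (count) in that category under the null hypothesis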
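The formula translates directly into code. This is a minimal sketch, assuming the expected counts come from hypothesized category probabilities (E = n × p); the helper names `expected_counts` and `chi_square_statistic` and the three-category data are hypothetical and chosen only for illustration.

```python
def expected_counts(total, probabilities):
    """Expected count per category under H0: E_i = n * p_i."""
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("hypothesized probabilities must sum to 1")
    return [total * p for p in probabilities]


def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical example: 100 observations in three categories.
observed = [48, 35, 17]
hypothesized = [0.5, 0.3, 0.2]

expected = expected_counts(sum(observed), hypothesized)
statistic = chi_square_statistic(observed, expected)
print(f"chi-square statistic = {statistic:.4f}")
```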
Chi-Square test Goodness of Fit Test: Example
• A shop owner claims that an equal number of customers
come into his shop each weekday. To test this claim, an
independent researcher records the number of customers who
come into the shop during a given week and finds the following:
• Monday: 50 customers
• Tuesday: 60 customers
• Wednesday: 40 customers
• Thursday: 47 customers
• Friday: 53 customers
Solution:
We will use the following steps to perform a Chi-Square goodness
of fit test to determine whether the data are consistent with the
shop owner’s claim.
Step 1: Define the hypotheses.
We will perform the Chi-Square goodness of fit test using the
following hypotheses:
• H0: An equal number of customers come into the shop each day.
• H1: The number of customers coming into the shop is not equal
across days.
Step 2: Calculate $(O - E)^2 / E$ for each day.
A total of 250 customers came into the shop during the week.
Thus, if an equal number came in each day, the expected value
$E$ for each day would be 250 / 5 = 50.
• Monday: $(50 - 50)^2 / 50 = 0$
• Tuesday: $(60 - 50)^2 / 50 = 2$
• Wednesday: $(40 - 50)^2 / 50 = 2$
• Thursday: $(47 - 50)^2 / 50 = 0.18$
• Friday: $(53 - 50)^2 / 50 = 0.18$
Step 3: Calculate the test statistic $\chi^2$.
$\chi^2 = \sum (O - E)^2 / E = 0 + 2 + 2 + 0.18 + 0.18 = 4.36$
Step 4: Calculate the p-value of the test statistic $\chi^2$.
The p-value associated with $\chi^2 = 4.36$ and $k - 1 = 5 - 1 = 4$
degrees of freedom, where $k$ is the number of categories, is
0.359472.
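Steps 2 to 4 can be reproduced with a short script. This is a sketch using only the Python standard library; the p-value helper `chi_square_sf_even_df` is a hypothetical name and relies on a closed form of the chi-square upper tail that holds only for an even number of degrees of freedom (which covers the 4 degrees of freedom here). A statistics package would normally be used for general degrees of freedom.

```python
import math


def chi_square_sf_even_df(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable.

    Valid only for even df, where the tail has the closed form
    exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    terms = (half ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * sum(terms)


# Customer counts, Monday to Friday, from the example.
observed = [50, 60, 40, 47, 53]

expected_each = sum(observed) / len(observed)  # 250 / 5 = 50
statistic = sum((o - expected_each) ** 2 / expected_each for o in observed)
df = len(observed) - 1                         # 5 categories - 1 = 4
p_value = chi_square_sf_even_df(statistic, df)

print(f"chi-square = {statistic:.2f}, df = {df}, p-value = {p_value:.6f}")
print("reject H0" if p_value < 0.05 else "fail to reject H0")
```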
Conclusion
• Since this p-value is not less than 0.05, we fail to reject the
null hypothesis. This means we do not have sufficient
evidence to say that the true distribution of customers is
different from the distribution that the shop owner claimed.
Chi-square test for independence:
Chi-square test for independence is used to determine whether
there is a relationship between two variables. For example, if
we want to know whether there is a relationship between
smoking and lung cancer, we can use the chi-square test for
independence. The test compares the observed frequencies
with the expected frequencies based on the null hypothesis that
there is no relationship between the two variables.
Assumptions
• Both variables are CATEGORICAL
• Observations are INDEPENDENT
• The EXPECTED COUNT in each cell is AT LEAST 5
• Categories are MUTUALLY EXCLUSIVE (each observation falls
in exactly one cell)
• Data is chosen RANDOMLY
Chi-Square test for independence : Example
• We want to see if age has an impact on what political party
you vote for. We collect a random sample of 135 people and
display it in the following contingency table broken down by
age and political party.
Solution
Hypotheses
Let's start by stating our hypotheses:
• H_0: Age has no impact on the political party you vote for.
The two variables are independent.
• H_1: Age does have an impact on the political party you vote
for. The two variables are dependent.
Significance Level and Critical Value
For this example we will use a 5% significance level. The
degrees of freedom are $\nu = (r - 1)(c - 1)$, where $r$ and $c$ are
the numbers of rows and columns in the contingency table, so we
have 2 degrees of freedom:
$\nu = (3 - 1)(2 - 1) = 2$
Using the significance level, the degrees of freedom and a
Chi-Square probability table, we find our critical value to be
5.991. This means our Chi-Square statistic needs to be greater
than 5.991 for us to reject the null hypothesis and conclude
that the variables are not independent.
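The degrees of freedom and the critical value can be checked without a probability table. This sketch is a special case, not a general method: for exactly 2 degrees of freedom the chi-square upper tail is exp(-x/2), so the critical value is -2 ln(alpha). The function names are hypothetical.

```python
import math


def degrees_of_freedom(n_rows, n_cols):
    """Degrees of freedom of an r x c contingency table: (r - 1)(c - 1)."""
    return (n_rows - 1) * (n_cols - 1)


def critical_value_df2(alpha):
    """Critical value for 2 degrees of freedom only.

    Solves exp(-x / 2) = alpha, the upper-tail equation when df = 2.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return -2.0 * math.log(alpha)


df = degrees_of_freedom(3, 2)
critical = critical_value_df2(0.05)
print(f"df = {df}, critical value at 5% = {critical:.3f}")
```

For other degrees of freedom the tail has no such simple inverse, which is why tables or a statistics library are used in practice.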
Calculating Expected Counts
We now need to determine the expected count for each cell in our
contingency table. These are the counts we would expect if the null
hypothesis were true, and they are calculated using the following formula:
$$E_{r,c} = \frac{n_r \, n_c}{n_T}$$
where $n_r$ and $n_c$ are the totals of row $r$ and column $c$, and
$n_T$ is the total number of observations.
For example, the expected count for ages 18–30 who voted Liberal is:
$E_{1,1} = 35 \times 90 / 135 = 23.3$
We can then populate the contingency table with these expected values
(in brackets). The six cells, in the order used in the statistic that
follows, are 10 (23.3), 30 (30), 50 (36.7), 25 (11.7), 15 (15) and
5 (18.3).
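The expected counts for every cell can be computed from the row and column totals. In this sketch the observed counts are the six values that appear in the Chi-Square statistic calculation; their arrangement into three age-group rows and two party columns is an assumption, made so that the first row total is 35 and the first column total is 90 as in the worked expected count.

```python
# Observed counts (rows: three age groups, columns: two parties).
# The layout is assumed; only the counts themselves come from the slides.
observed = [
    [10, 25],
    [30, 15],
    [50, 5],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# E[r][c] = (row total * column total) / grand total
expected = [
    [r_total * c_total / grand_total for c_total in col_totals]
    for r_total in row_totals
]

for obs_row, exp_row in zip(observed, expected):
    print("  ".join(f"{o} ({e:.1f})" for o, e in zip(obs_row, exp_row)))

# Rule-of-thumb check on the smallest expected count.
smallest_expected = min(min(row) for row in expected)
print(f"smallest expected count = {smallest_expected:.1f}")
```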
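Chi-Square Statistic
It is now time to calculate the Chi-Square statistic,
$\chi^2 = \sum (O - E)^2 / E$, summing over all six cells:
$$\chi^2 = \frac{(10 - 23.3)^2}{23.3} + \frac{(30 - 30)^2}{30} + \frac{(50 - 36.7)^2}{36.7} + \frac{(25 - 11.7)^2}{11.7} + \frac{(15 - 15)^2}{15} + \frac{(5 - 18.3)^2}{18.3}$$
This equals approximately 37.2 with the expected counts rounded to
one decimal place (about 37.4 with unrounded expected counts).
Therefore, our statistic is much greater than the critical value of
5.991, and so we reject the null hypothesis at the 5% significance
level.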
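The statistic can be recomputed end to end as a check. This sketch again assumes a three-row by two-column layout for the six observed counts, and it evaluates the statistic twice: once with unrounded expected counts and once with expected counts rounded to one decimal place, as in the hand calculation.

```python
CRITICAL_VALUE = 5.991  # 5% significance level, 2 degrees of freedom

# Observed counts; the 3 x 2 layout is an assumption.
observed = [
    [10, 25],
    [30, 15],
    [50, 5],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)


def chi_square(observed_table, expected_table):
    """Sum of (O - E)^2 / E over every cell of the table."""
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed_table, expected_table)
        for o, e in zip(obs_row, exp_row)
    )


expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
expected_rounded = [[round(e, 1) for e in row] for row in expected]

stat_exact = chi_square(observed, expected)
stat_rounded = chi_square(observed, expected_rounded)

print(f"unrounded expected counts: chi-square = {stat_exact:.2f}")
print(f"rounded expected counts:   chi-square = {stat_rounded:.2f}")
print("reject H0" if stat_exact > CRITICAL_VALUE else "fail to reject H0")
```

Either way the statistic is far above the critical value, so the rounding does not affect the conclusion.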
Conclusion
In this presentation we have described and shown an example of
the Chi-Square test of independence. This test assesses whether
two categorical variables are dependent on each other. It is used
in Data Science for Feature Selection, where we only want
modelling features that are associated with the target.
References
1. Wikipedia
2. www.towarddatascience.com
3. www.statology.org
4. Agresti, A. (2018). An introduction to categorical data
analysis. Wiley.
5. Kothari, C. R. (2004). Research methodology: methods and
techniques. New Age International.
