-
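If you square a variable that follows a standard normal distribution, the result follows a chi-square distribution with 1 degree of freedom; the sum of k independent squared standard normal variables follows a chi-square distribution with k degrees of freedom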
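The relationship between the normal and chi-square distributions can be checked by simulation. The sketch below is only an illustration: the function name `simulate_squared_normals`, the sample size, and the seed are assumptions chosen for the example. It squares standard normal draws and compares two summary values with what a chi-square distribution with 1 degree of freedom predicts (a mean of 1, and about 5% of values above 3.841).

```python
import random


def simulate_squared_normals(n_samples, seed=0):
    """Draw standard normal values and return their squares."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [rng.gauss(0.0, 1.0) ** 2 for _ in range(n_samples)]


squares = simulate_squared_normals(100_000)

# A chi-square variable with 1 degree of freedom has mean 1.
sample_mean = sum(squares) / len(squares)

# 3.841 is the 95th percentile of chi-square with 1 degree of freedom,
# so roughly 5% of the squared values should exceed it.
tail_fraction = sum(s > 3.841 for s in squares) / len(squares)

print(f"mean of squares: {sample_mean:.3f}")
print(f"fraction above 3.841: {tail_fraction:.3f}")
```

Both printed values should land close to 1 and 0.05 respectively, with small differences due to sampling noise.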
-
Pearson Chi-Square Test
-
Example: You survey 80 students from a university
-
Event A: 30 students are men
-
Event B: 50 students are women
-
The events are mutually exclusive and exhaustive: every student falls into exactly one category, so the counts add up to the total
-
We can test the null hypothesis $H_0$ that men and women are equally likely in the group (a 50/50 split)
-
Expected value for men is 40
-
Expected value for women is 40
-
Compute the $\chi^2$ statistic
-
Notation
-
$O_i$ is the observed count for category i
-
$E_i$ is the expected count for category i
-
n is the number of categories (possible outcomes)
-
The degrees of freedom are df = n – 1 = 2 – 1 = 1
-
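Chi-square test statistic is
$$\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i} = \frac{(30 - 40)^2}{40} + \frac{(50 - 40)^2}{40} = 2.5 + 2.5 = 5.0$$
-
Reject $H_0$: the statistic of 5.0 exceeds the critical value of 3.841 for df = 1, assuming the usual significance level of 0.05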
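The survey example can be reproduced in a few lines. This is a minimal sketch using only the standard library; the function name `pearson_chi_square` is an assumption made for illustration. The p-value line relies on an identity that holds only for 1 degree of freedom, where the chi-square tail probability equals `erfc(sqrt(x / 2))`.

```python
import math


def pearson_chi_square(observed, expected):
    """Sum of (O_i - E_i)^2 / E_i over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


observed = [30, 50]                 # men, women
total = sum(observed)               # 80 students surveyed
expected = [total * 0.5, total * 0.5]   # 50/50 split under H0

chi_sq = pearson_chi_square(observed, expected)
df = len(observed) - 1

# Valid for df = 1 only: P(chi-square > x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi_sq / 2))

print(f"chi-square = {chi_sq}, df = {df}, p = {p_value:.4f}")
```

A p-value below 0.05 agrees with rejecting the 50/50 hypothesis at that level.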
-
The critical value that the test statistic is compared against is calculated in Excel using =CHIINV(α, df); for example, =CHIINV(0.05, 1) returns about 3.841
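Excel's CHIINV returns the value that a chi-square variable exceeds with the given right-tail probability. For the special case of 1 degree of freedom, the same number can be sketched with Python's standard library, because a chi-square variable with df = 1 is a squared standard normal. The function name `chi_square_critical_df1` is hypothetical, and the shortcut does not extend to other degrees of freedom.

```python
from statistics import NormalDist


def chi_square_critical_df1(alpha):
    """Right-tail critical value of chi-square with 1 degree of freedom.

    Uses the fact that chi-square(1) is the square of a standard normal,
    so the right-tail area alpha corresponds to a two-sided normal
    quantile at 1 - alpha / 2.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z ** 2


critical = chi_square_critical_df1(0.05)
print(f"critical value at alpha = 0.05, df = 1: {critical:.3f}")
```

A test statistic larger than this value leads to rejecting the null hypothesis at the chosen significance level.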
-
Note – Chi-square tests are one-tailed: every deviation is squared, so the statistic is never negative and only large values count as evidence against $H_0$
-
Contingency tables
-
The simplest is called a 2 × 2 table
-
Example: Students took an exam
Occurrences | Failed | Passed | Marginal Total |
Male | 20 | 30 | 50 |
Female | 10 | 20 | 30 |
Marginal Total | 30 | 50 | 80 |
-
You have to use the raw number of occurrences (counts), not percentages or proportions
-
The expected occurrences have to be calculated from the marginals: expected count = (row total × column total) / grand total
Expected | Failed | Passed | Marginal Total |
Male | 18.75 | 31.25 | 50 |
Female | 11.25 | 18.75 | 30 |
Marginal Total | 30 | 50 | 80 |
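Each expected count is the row total times the column total, divided by the grand total. The sketch below applies that rule to the exam data; the function name `expected_counts` is an assumption for illustration, and the table is stored as a list of rows (Male, Female) with columns (Failed, Passed).

```python
def expected_counts(observed):
    """Expected cell counts under independence, from the marginal totals."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [
        [r * c / grand_total for c in col_totals]
        for r in row_totals
    ]


observed = [
    [20, 30],  # Male: failed, passed
    [10, 20],  # Female: failed, passed
]
expected = expected_counts(observed)

for label, row in zip(["Male", "Female"], expected):
    print(label, row)
```

The expected table keeps the same marginal totals as the observed table, which is a quick way to check the arithmetic.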
-
The degrees of freedom are df = (columns – 1)(rows – 1) = (2 – 1)(2 – 1) = 1
-
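Chi-square test statistic is
$$\chi^2 = \frac{(20 - 18.75)^2}{18.75} + \frac{(30 - 31.25)^2}{31.25} + \frac{(10 - 11.25)^2}{11.25} + \frac{(20 - 18.75)^2}{18.75} \approx 0.356$$
-
Fail to reject $H_0$, since 0.356 is below the critical value of 3.841 (df = 1, significance level 0.05): the data give no evidence that passing or failing the exam depends on sex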
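The whole contingency-table test can be sketched as one function that works for a table of any size. The name `chi_square_independence` is hypothetical, and the critical value 3.841 is hard-coded for df = 1 at the 0.05 level, which is an assumption about the significance level used in this example.

```python
def chi_square_independence(observed):
    """Return (chi-square statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / grand_total  # expected count
            stat += (o - e) ** 2 / e

    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return stat, df


stat, df = chi_square_independence([[20, 30], [10, 20]])

CRITICAL_05_DF1 = 3.841  # assumed: alpha = 0.05, df = 1
reject = stat > CRITICAL_05_DF1

print(f"chi-square = {stat:.3f}, df = {df}, reject H0: {reject}")
```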
-
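Note – There is a fast way to calculate $\chi^2$ for a 2 × 2 contingency table
Occurrences | Failed | Passed | Marginal Total |
Male | a | b | a + b |
Female | c | d | c + d |
Marginal Total | a + c | b + d | a + b + c + d |
$$\chi^2 = \frac{(a + b + c + d)(ad - bc)^2}{(a + b)(c + d)(a + c)(b + d)}$$
-
Re-doing the example using the fast method
$$\chi^2 = \frac{80\,(20 \cdot 20 - 30 \cdot 10)^2}{50 \cdot 30 \cdot 30 \cdot 50} = \frac{800{,}000}{2{,}250{,}000} \approx 0.356$$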
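The shortcut for a 2 × 2 table multiplies the grand total by (ad − bc)² and divides by the product of the four marginal totals. A minimal sketch follows; the function name `chi_square_2x2` is an assumption for illustration.

```python
def chi_square_2x2(a, b, c, d):
    """Shortcut chi-square for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Exam example: Male (20 failed, 30 passed), Female (10 failed, 20 passed)
stat = chi_square_2x2(20, 30, 10, 20)
print(f"chi-square = {stat:.3f}")
```

The result matches the value obtained from the observed and expected counts, since the shortcut is an algebraic rearrangement of the same sum.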
-
Note – You can build contingency tables with any dimensions
-
The chi-square approximation may be poor if
-
Grand total is less than 100
-
Or a cell total is less than 10
-
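Then you should use Yates's correction, which subtracts 0.5 from each absolute difference between observed and expected counts before squaring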
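Yates's correction is usually stated for 2 × 2 tables: each term becomes (|O − E| − 0.5)² / E instead of (O − E)² / E. The sketch below applies it to the exam data purely as an illustration of the arithmetic; the function name `yates_chi_square_2x2` is hypothetical.

```python
def yates_chi_square_2x2(a, b, c, d):
    """Chi-square with Yates's continuity correction for [[a, b], [c, d]]."""
    observed = [[a, b], [c, d]]
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    n = a + b + c + d

    stat = 0.0
    for i in range(2):
        for j in range(2):
            e = row_totals[i] * col_totals[j] / n  # expected count
            # Shrink each absolute deviation by 0.5 before squaring.
            stat += (abs(observed[i][j] - e) - 0.5) ** 2 / e
    return stat


corrected = yates_chi_square_2x2(20, 30, 10, 20)
print(f"corrected chi-square = {corrected:.3f}")
```

The corrected statistic is smaller than the uncorrected one, which makes the test more conservative.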