If you take a variable with a standard normal distribution and square it, then you get a chi-square distribution with one degree of freedom
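A quick simulation can make this relationship concrete. The sketch below is only an illustration, assuming the variable is standard normal (mean 0, standard deviation 1); the seed, the sample size, and the variable names are arbitrary choices, not part of the notes. It checks the squared values against three known properties of a chi-square distribution with one degree of freedom: mean 1, variance 2, and roughly 95% of values below 3.841.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

# Draw standard normal values (mean 0, standard deviation 1) and square them.
n_draws = 100_000
squares = [random.gauss(0.0, 1.0) ** 2 for _ in range(n_draws)]

# A chi-square distribution with 1 degree of freedom has mean 1 and variance 2.
mean_sq = sum(squares) / n_draws
var_sq = sum((s - mean_sq) ** 2 for s in squares) / (n_draws - 1)

# About 95% of chi-square (df = 1) values fall below the critical value 3.841.
share_below = sum(s < 3.841 for s in squares) / n_draws

print(f"mean = {mean_sq:.3f}, variance = {var_sq:.3f}, share below 3.841 = {share_below:.3f}")
```

The printed numbers vary slightly with the seed, but they should land close to 1, 2, and 0.95.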
Pearson Chi-Square Test
Example: You survey 80 students from a university
Event A: 30 students are men
Event B: 50 students are women
The events are mutually exclusive and exhaustive – the counts have to add up to the total
We can test the hypothesis that men and women occur 50/50 in a group
Expected value for men is 40
Expected value for women is 40
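Compute the X2 statistic
$X^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i}$
Notation
Oi is the observed count for category i
Ei is the expected count for category i
n is the number of categories (possible outcomes)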

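The degrees of freedom are df = n – 1 = 2 – 1 = 1
Chi-square test statistic is
$X^2 = \frac{(30 - 40)^2}{40} + \frac{(50 - 40)^2}{40} = 2.5 + 2.5 = 5$
Reject the H0, because 5 exceeds the critical value of 3.841 for df = 1 at the usual α = 0.05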
The critical value is calculated in Excel using =CHIINV(α, df); for example, =CHIINV(0.05, 1) returns 3.841
Note – Chi-square tests are one-tailed, because negative deviations become positive when they are squared, so only large values of X2 count against H0
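The survey example can be reproduced in a few lines of Python. This is a minimal sketch using only the standard library; the variable names are illustrative, and the significance level α = 0.05 is an assumption, since the notes do not state one. The p-value uses a closed form that holds only for one degree of freedom.

```python
import math

observed = [30, 50]   # men, women counted in the survey
expected = [40, 40]   # 50/50 split of the 80 students under H0

# Pearson statistic: sum of (O - E)^2 / E over the categories.
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

# For df = 1 only, the upper-tail p-value has the closed form erfc(sqrt(chi2 / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2))

# 3.841 is the tabulated critical value for alpha = 0.05 and df = 1
# (alpha = 0.05 is an assumption; the notes do not state it).
critical = 3.841
reject_h0 = chi2 > critical

print(chi2, df, round(p_value, 4), reject_h0)
```

The statistic comes out to 5, above the critical value, so the 50/50 hypothesis is rejected under the assumed α.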
Contingency tables
The simplest is called a 2 X 2
Example: Students took an exam
Occurrences | Failed | Passed | Marginal Total |
Male | 20 | 30 | 50 |
Female | 10 | 20 | 30 |
Marginal Total | 30 | 50 | 80 |
You have to use the number of occurrences (counts), not percentages
You have to calculate the expected occurrences from the marginals: expected = (row total)(column total) / (grand total)
Expected | Failed | Passed | Marginal Total |
Male | 50(30)/80 = 18.75 | 50(50)/80 = 31.25 | 50 |
Female | 30(30)/80 = 11.25 | 30(50)/80 = 18.75 | 30 |
Marginal Total | 30 | 50 | 80 |
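The degrees of freedom are df = (columns – 1)(rows – 1) = 1 (1) = 1
Chi-square test statistic is
$X^2 = \frac{(20 - 18.75)^2}{18.75} + \frac{(30 - 31.25)^2}{31.25} + \frac{(10 - 11.25)^2}{11.25} + \frac{(20 - 18.75)^2}{18.75} \approx 0.356$
Fail to reject the H0 hypothesis, because 0.356 is below the critical value of 3.841 (α = 0.05, df = 1); there is no evidence that males and females perform differently on the exam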
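The exam example can be worked through in code as well. The sketch below is one possible way to do it in plain Python, not a reference implementation: it derives each expected count from the marginal totals as row total times column total divided by the grand total, then sums the Pearson terms. The variable names are illustrative.

```python
observed = [
    [20, 30],  # male: failed, passed
    [10, 20],  # female: failed, passed
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count for a cell = row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Pearson statistic summed over every cell of the table.
chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(col_totals) - 1)

print(expected)
print(round(chi2, 4), df)
```

Because the totals are computed from the table itself, the same code should work for a table with more rows or columns.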
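Note – There is a fast way to calculate X2 for a 2 X 2 Contingency Table
Occurrences | Failed | Passed | Marginal Total |
Male | a | b | a + b |
Female | c | d | c + d |
Marginal Total | a + c | b + d | a + b + c + d |
$X^2 = \frac{(a + b + c + d)(ad - bc)^2}{(a + b)(c + d)(a + c)(b + d)}$
Re-doing the example using the fast method
$X^2 = \frac{80 \, (20 \cdot 20 - 30 \cdot 10)^2}{(50)(30)(30)(50)} = \frac{800{,}000}{2{,}250{,}000} \approx 0.356$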
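The fast method fits in a small function. This sketch assumes the standard 2 X 2 shortcut, X2 = n(ad – bc)^2 / [(a + b)(c + d)(a + c)(b + d)] with n = a + b + c + d; the function name chi2_2x2 is a hypothetical helper introduced here for illustration.

```python
def chi2_2x2(a: int, b: int, c: int, d: int) -> float:
    """Shortcut chi-square statistic for a 2 x 2 table laid out as [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Exam example: male failed/passed = 20/30, female failed/passed = 10/20.
chi2_fast = chi2_2x2(20, 30, 10, 20)
print(round(chi2_fast, 4))
```

The result should agree with the cell-by-cell calculation on the same table, which makes the shortcut a convenient cross-check.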

Note – You can build contingency tables with any dimensions
The Chi-square test may be poor if
Grand total is less than 100
Or a cell total is less than 10
Then you should use Yates's Correction
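The notes do not spell out the correction, so the sketch below assumes its usual textbook form: subtract 0.5 from each absolute difference |O – E| before squaring. It is normally applied to 2 X 2 tables. The function name chi2_yates is a hypothetical helper, and the exam table is reused only to show the effect, even though none of its cells is below 10.

```python
def chi2_yates(observed: list[list[float]]) -> float:
    """Chi-square statistic with Yates's continuity correction (assumed textbook form)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    total = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / grand_total
            # Shrink each |O - E| by 0.5 before squaring (the correction).
            total += (abs(o - e) - 0.5) ** 2 / e
    return total


chi2_corrected = chi2_yates([[20, 30], [10, 20]])
print(round(chi2_corrected, 3))
```

The corrected statistic is smaller than the uncorrected one, so the correction makes the test more conservative.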