Chi-Squared Test

Spec Mapping — OCR H420 Module 6.1.2 — Patterns of inheritance, content statements covering the chi-squared (χ²) test for analysing genetic crosses, including calculating the statistic, identifying degrees of freedom, and interpreting the result using critical values and p-values (refer to the official OCR H420 specification document for exact wording). The chi-squared test is the formal statistical tool you have already used informally in the linkage, epistasis and dihybrid lessons — here it is treated as the OCR Mathematical Requirement M1.9 in its own right.

The chi-squared test was introduced by Karl Pearson in 1900, during the founding period of modern statistics. He showed that, for count data drawn from a known distribution, the sum of squared deviations divided by the expected counts approximately follows a known distribution (the chi-squared distribution) whose tail probabilities are tabulated. This made it possible for the first time to assess quantitatively whether observed data deviate significantly from a theoretical prediction, and the test remains a foundational tool in biology, medicine and the natural sciences.

You have carried out a genetic cross and counted the offspring. The numbers look roughly like a 9:3:3:1 ratio, but not exactly. Is the deviation just chance, or is it real and telling you that Mendel's law of independent assortment does not apply (because of linkage or epistasis, say)? The chi-squared (χ²) test is the statistical tool that lets you answer this question quantitatively. OCR A-Level Biology A module 6.1.2 requires you to use the chi-squared test to analyse results from genetic crosses, including calculating the value, using degrees of freedom and critical values, and interpreting the outcome.

Key Definitions:

Null hypothesis (H₀) — there is no significant difference between observed and expected values; any difference is due to chance.

Observed (O) — the actual number in each category.

Expected (E) — the number predicted by the null hypothesis.

Degrees of freedom (df) — the number of categories minus 1.

Critical value — the tabulated χ² value above which the null hypothesis is rejected at a chosen probability (usually p = 0.05).

p-value — the probability that the observed deviation (or a more extreme one) would occur by chance alone if H₀ were true.

When to Use the Chi-Squared Test

Chi-squared is used for categorical data (counts, not measurements) arranged in discrete classes. In genetics, typical applications are:

Monohybrid crosses (3:1 expected).
Dihybrid crosses (9:3:3:1 expected).
Sex-linked crosses.
Detecting deviations that suggest linkage or epistasis.

You should not use chi-squared on percentages or on data with very small expected counts (conventionally, each expected value should be at least 5).
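As a rough illustration of these conditions, the sketch below checks a data set before any χ² is calculated. The function name `check_chi_squared_inputs` and its list-of-messages return value are assumptions made for this example; the threshold of 5 is the convention stated above.

```python
def check_chi_squared_inputs(observed, expected, min_expected=5):
    """Return a list of problems that make a chi-squared test unsuitable.

    An empty list means the basic conditions described in the text are met.
    (Hypothetical helper written for illustration only.)
    """
    problems = []
    if len(observed) != len(expected):
        problems.append("observed and expected must have the same number of categories")
    # Observed values must be whole-number counts, not percentages or measurements.
    if any(o < 0 or o != int(o) for o in observed):
        problems.append("observed values must be non-negative whole-number counts")
    # Conventional rule of thumb: every expected value should be at least 5.
    if any(e < min_expected for e in expected):
        problems.append(f"every expected value should be at least {min_expected}")
    return problems


# Real counts from a 160-offspring dihybrid cross: no problems reported.
print(check_chi_squared_inputs([95, 28, 25, 12], [90, 30, 30, 10]))

# Percentages tested against a bare 9:3:3:1 ratio: two problems reported.
print(check_chi_squared_inputs([59.4, 17.5, 15.6, 7.5], [9, 3, 3, 1]))
```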

The Procedure

Step 1: State the null hypothesis

H₀: the observed offspring numbers fit the expected Mendelian ratio (e.g. 9:3:3:1). Any difference is due to chance.

Step 2: Calculate the expected values

Multiply the total number of offspring by the proportion expected in each class. For a 9:3:3:1 ratio with 160 offspring, expected values are 90, 30, 30, 10.
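The same arithmetic can be sketched in a few lines of Python. The function name `expected_counts` is an assumption for this example; it simply multiplies the total by each class's share of the ratio.

```python
def expected_counts(total, ratio):
    """Expected number in each class for a given total and ratio.

    `ratio` is a list of ratio parts, e.g. [9, 3, 3, 1] or [3, 1].
    """
    parts = sum(ratio)
    # Each class gets total x (its ratio part / sum of all parts).
    return [total * r / parts for r in ratio]


print(expected_counts(160, [9, 3, 3, 1]))  # [90.0, 30.0, 30.0, 10.0]
print(expected_counts(200, [3, 1]))        # [150.0, 50.0]
```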

Step 3: Apply the formula

$\chi^2 = \sum \dfrac{(O - E)^2}{E}$

For each category, compute (O − E)², divide by E, and sum across all categories.
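A minimal sketch of the formula in Python, using the monohybrid counts from Worked Example 1 as a check. The function name `chi_squared` is an assumption for this example.

```python
def chi_squared(observed, expected):
    """Sum of (O - E)^2 / E across all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    # Subtract first, then square, then divide by E, then add up every class.
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# 142 tall and 58 short observed; 150 and 50 expected for a 3:1 ratio.
print(round(chi_squared([142, 58], [150, 50]), 2))  # 1.71
```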

Step 4: Work out degrees of freedom

$\text{df} = n - 1$

where n is the number of categories. For a 9:3:3:1 cross with 4 categories, df = 3. For a 3:1 monohybrid, df = 1.

Step 5: Compare to the critical value

Look up the critical value for df and p = 0.05 (the usual significance threshold in biology).

If χ² calculated < critical value → difference is not significant → accept the null hypothesis.
If χ² calculated ≥ critical value → difference is significant → reject the null hypothesis.

Critical Values Table (p = 0.05)

df	Critical value (p = 0.05)
1	3.84
2	5.99
3	7.82
4	9.49
5	11.07

For a 9:3:3:1 dihybrid cross, the value you need is 7.82. For a 3:1 monohybrid, it is 3.84.
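Steps 4 and 5 can be sketched as a table lookup followed by a comparison. The dictionary below copies the p = 0.05 values from the table above; the names `CRITICAL_VALUES_P05`, `degrees_of_freedom` and `decide` are assumptions made for this example.

```python
# Critical values at p = 0.05, copied from the table in the text (df 1 to 5).
CRITICAL_VALUES_P05 = {1: 3.84, 2: 5.99, 3: 7.82, 4: 9.49, 5: 11.07}


def degrees_of_freedom(n_categories):
    """df = number of categories - 1 (not number of offspring - 1)."""
    return n_categories - 1


def decide(chi2, n_categories):
    """Compare a calculated chi-squared value with the p = 0.05 critical value."""
    critical = CRITICAL_VALUES_P05[degrees_of_freedom(n_categories)]
    if chi2 >= critical:
        return "reject H0"   # difference is significant
    return "accept H0"       # difference is not significant


print(decide(1.644, 4))  # accept H0 (1.644 < 7.82)
print(decide(169.5, 4))  # reject H0 (169.5 >= 7.82)
```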

Worked Example 1: Monohybrid Cross

A geneticist crosses two heterozygous pea plants (Tt × Tt) and counts 200 offspring: 142 tall and 58 short.

Expected (3:1 ratio)

Tall: 200 × 3/4 = 150
Short: 200 × 1/4 = 50

Calculate χ²

Class	O	E	O − E	(O − E)²	(O − E)²/E
Tall	142	150	−8	64	0.427
Short	58	50	8	64	1.280

χ² = 0.427 + 1.280 = 1.71

Interpret

df = 2 − 1 = 1. Critical value = 3.84. Since 1.71 < 3.84, the deviation is not significant. Accept H₀: the data are consistent with a 3:1 ratio.
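If you want to check a working table like the one above, a short script can rebuild it row by row. This is only an illustrative sketch; the function name `working_table` and the tuple layout are assumptions for this example.

```python
def working_table(classes, observed, expected):
    """Build one row per class: (class, O, E, O - E, (O - E)^2, (O - E)^2 / E)."""
    rows = []
    for name, o, e in zip(classes, observed, expected):
        diff = o - e
        rows.append((name, o, e, diff, diff ** 2, round(diff ** 2 / e, 3)))
    return rows


rows = working_table(["Tall", "Short"], [142, 58], [150, 50])
for row in rows:
    # Tab-separated, in the same column order as the table in the text.
    print("\t".join(str(value) for value in row))

# The final column summed gives the chi-squared value.
print("chi-squared =", round(sum(row[5] for row in rows), 2))
```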

Worked Example 2: Dihybrid Cross

Two heterozygous pea plants (RrYy × RrYy) are crossed. Out of 160 offspring, the observed numbers are:

Round yellow: 95
Round green: 28
Wrinkled yellow: 25
Wrinkled green: 12

Expected (9:3:3:1 ratio)

Round yellow: 160 × 9/16 = 90
Round green: 160 × 3/16 = 30
Wrinkled yellow: 160 × 3/16 = 30
Wrinkled green: 160 × 1/16 = 10

Calculate χ²

Class	O	E	O − E	(O − E)²	(O − E)²/E
Round yellow	95	90	5	25	0.278
Round green	28	30	−2	4	0.133
Wrinkled yellow	25	30	−5	25	0.833
Wrinkled green	12	10	2	4	0.400

χ² = 0.278 + 0.133 + 0.833 + 0.400 = 1.644

Interpret

df = 4 − 1 = 3. Critical value = 7.82. Since 1.644 < 7.82, the deviation is not significant. Accept H₀: the data fit a 9:3:3:1 ratio.

Worked Example 3: Detecting Linkage

A dihybrid test cross (AaBb × aabb) in fruit flies is expected to give a 1:1:1:1 ratio. Out of 400 offspring:

AB (grey long): 170
ab (black vestigial): 160
Ab (grey vestigial): 35
aB (black long): 35

Expected (1:1:1:1)

Each class: 100

Calculate χ²

Class	O	E	O − E	(O − E)²	(O − E)²/E
AB	170	100	70	4900	49.0
ab	160	100	60	3600	36.0
Ab	35	100	−65	4225	42.25
aB	35	100	−65	4225	42.25

χ² = 49.0 + 36.0 + 42.25 + 42.25 = 169.5

Interpret

df = 3. Critical value = 7.82. Since 169.5 ≫ 7.82, the deviation is very highly significant. Reject H₀: the data do not fit a 1:1:1:1 ratio — the genes are almost certainly linked. The excess of parentals (AB and ab) and deficit of recombinants (Ab and aB) is consistent with linkage.

Cross-over value = (35 + 35)/400 × 100 = 17.5% (recombinant offspring as a percentage of all offspring).
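The cross-over calculation is just recombinants divided by the total, expressed as a percentage. A small sketch, with the function name `cross_over_value` assumed for this example:

```python
def cross_over_value(parental_counts, recombinant_counts):
    """Recombinant offspring as a percentage of all offspring."""
    recombinants = sum(recombinant_counts)
    total = sum(parental_counts) + recombinants
    return 100 * recombinants / total


# Parentals: AB = 170, ab = 160. Recombinants: Ab = 35, aB = 35.
print(cross_over_value([170, 160], [35, 35]))  # 17.5
```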

Interpreting p Values

p	Interpretation
p > 0.05	Difference not significant — accept H₀
p = 0.05	Threshold: a 5% chance of rejecting H₀ when it is actually true
p < 0.05	Significant — reject H₀
p < 0.01	Highly significant
p < 0.001	Very highly significant

Biologists conventionally use p = 0.05 as the cut-off: H₀ is rejected when there is less than a 5% probability that a deviation at least as large as the one observed would arise by chance if H₀ were true.
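In an exam you read p from a table, but it can also be calculated directly. The sketch below is one way to do this using only Python's standard library: it starts from the known closed forms for df = 1 and df = 2 and steps up two degrees of freedom at a time. The function name `chi_squared_p_value` is an assumption for this example, and the method is offered as an illustration rather than as the way any particular statistics package works.

```python
import math


def chi_squared_p_value(chi2, df):
    """Probability of a chi-squared value at least this large if H0 is true."""
    if df < 1 or chi2 < 0:
        raise ValueError("df must be >= 1 and chi2 must be >= 0")
    half = chi2 / 2
    # Closed-form starting points: df = 1 (odd) or df = 2 (even).
    if df % 2 == 1:
        p, k = math.erfc(math.sqrt(half)), 1
    else:
        p, k = math.exp(-half), 2
    # Step from k to k + 2 degrees of freedom until df is reached.
    while k < df:
        p += half ** (k / 2) * math.exp(-half) / math.gamma(k / 2 + 1)
        k += 2
    return p


# The tabulated critical values should each give p close to 0.05.
for df, critical in [(1, 3.84), (2, 5.99), (3, 7.82), (4, 9.49)]:
    print(df, round(chi_squared_p_value(critical, df), 3))

# Worked Example 2 (chi-squared = 1.644, df = 3): p is well above 0.05.
print(chi_squared_p_value(1.644, 3) > 0.05)  # True
```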

Exam Tip

Always state the null hypothesis explicitly before calculating χ². Show your table clearly with columns for O, E, O − E, (O − E)² and (O − E)²/E. Do not forget to calculate degrees of freedom (number of classes minus 1), look up the critical value at p = 0.05, and explicitly compare your calculated χ² to it. End with a sentence interpreting the result in biological terms — "the data fit a 9:3:3:1 ratio, consistent with independent assortment" or "the data do not fit a 1:1:1:1 ratio, suggesting linkage". Do not say "accept the alternative hypothesis" — you only accept or reject the null.

Common Exam Mistakes

Using chi-squared on continuous data. It is only for counts in discrete categories.
Using percentages instead of actual counts. You must use the real numbers.
Getting df wrong. df = number of categories − 1 (not offspring − 1).
Looking up the wrong critical value. Use the row for your df and the column for p = 0.05 (or whichever p you are asked to use).
Forgetting to sum over all categories. The chi-squared value is the total of (O − E)²/E across every class.
Treating "accept the null hypothesis" as meaning "H₀ is true". It means "we have no evidence to reject H₀", a subtle but important difference.
Calculating (O − E)² as O² − E². That is wrong; it is (O − E) then squared.
Forgetting to square. The numerator is (O − E)², not |O − E|.

Quick Recap

The chi-squared test compares observed and expected counts to decide whether a deviation is significant.
Formula: $\chi^2 = \sum \dfrac{(O - E)^2}{E}$
Degrees of freedom = number of categories − 1.
Compare calculated χ² to the critical value at p = 0.05.
χ² < critical value → accept H₀ (not significant).
χ² ≥ critical value → reject H₀ (significant deviation, suggesting linkage, epistasis or another factor).
Always state H₀, show your working in a table, give df, compare to the critical value and interpret biologically.

Why Chi-Squared Works: A Bit of Background

The chi-squared test is built on the idea that if the observed numbers really do come from the expected distribution, the deviations (O − E) will be roughly normally distributed with a standard deviation proportional to √E. Squaring the deviations and dividing by E gives each term units of "standard deviations squared", and summing them gives a quantity (χ²) whose distribution is known and tabulated.

This is why the test:

Uses squared deviations (so positive and negative differences both count, and large deviations are weighted more).
Divides by E (because a deviation of 5 is big when you expected 10 but trivial when you expected 10,000).
Needs reasonably large expected counts (so the normal approximation is valid — conventionally E ≥ 5 in every category).
Uses degrees of freedom equal to the number of categories minus 1 (because once you know all categories but one, the last is fixed by the total).

You do not need to prove any of this, but knowing why the formula has the shape it has will help you remember it correctly under exam pressure.
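One way to see the idea in action is to simulate many crosses in which H₀ really is true and count how often χ² reaches the critical value anyway. The sketch below does this for a 9:3:3:1 cross of 160 offspring; the function name, the number of simulated crosses and the random seed are all arbitrary choices made for this example. If the reasoning above holds, the fraction of false rejections should come out at roughly 0.05.

```python
import random


def simulate_false_positive_rate(n_crosses=2000, n_offspring=160, seed=1):
    """Fraction of simulated true-9:3:3:1 crosses that exceed the critical value."""
    rng = random.Random(seed)      # fixed seed so the run is repeatable
    ratio = [9, 3, 3, 1]
    expected = [n_offspring * r / sum(ratio) for r in ratio]
    critical = 7.82                # df = 3, p = 0.05
    rejections = 0
    for _ in range(n_crosses):
        # Each offspring falls into one of the four classes with 9:3:3:1 odds.
        draws = rng.choices(range(4), weights=ratio, k=n_offspring)
        observed = [draws.count(c) for c in range(4)]
        chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
        if chi2 >= critical:
            rejections += 1
    return rejections / n_crosses


rate = simulate_false_positive_rate()
print(rate)  # expected to be in the region of 0.05
```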

Worked Example 4: Sex-Linked Test

A geneticist crosses white-eyed (w) female fruit flies (XʷXʷ) with red-eyed (W) males (XᵂY) — a classic sex-linked cross. All F1 females should be red-eyed heterozygotes (XᵂXʷ) and all F1 males should be white-eyed (XʷY). Crossing F1 × F1:

Expected in F2 for 800 flies:

The F1 cross is XᵂXʷ × XʷY, so the four F2 classes are XᵂXʷ (red-eyed female), XʷXʷ (white-eyed female), XᵂY (red-eyed male) and XʷY (white-eyed male) in a 1:1:1:1 ratio. Out of 800 flies, that is 200 in each class.

Suppose the observed numbers are 210 red F, 195 white F, 202 red M, 193 white M.

Class	O	E	O − E	(O − E)²	(O − E)²/E
Red F	210	200	10	100	0.50
White F	195	200	−5	25	0.125
Red M	202	200	2	4	0.02
White M	193	200	−7	49	0.245

χ² = 0.50 + 0.125 + 0.02 + 0.245 = 0.89. df = 4 − 1 = 3. Critical value at p = 0.05 is 7.82. Since 0.89 ≪ 7.82, the deviation is not significant. Accept H₀: the data are consistent with the expected 1:1:1:1 ratio.
