By the end of this lesson you should be able to explain and apply each part of this topic — When to Use the Chi-Squared Test, The Procedure, Critical Values Table (p = 0.05) and Interpreting p Values — and use these ideas accurately in exam-style questions.
Spec Mapping — OCR H420 Module 6.1.2 — Patterns of inheritance, content statements covering the chi-squared (χ²) test for analysing genetic crosses, including calculating the statistic, identifying degrees of freedom, and interpreting the result using critical values and p-values (refer to the official OCR H420 specification document for exact wording). The chi-squared test is the formal statistical tool you have already used informally in the linkage, epistasis and dihybrid lessons — here it is treated as the OCR Mathematical Requirement M1.9 in its own right.
The chi-squared test was developed by Karl Pearson in 1900 as part of the founding of modern statistics. He showed that for count data drawn from a known distribution, the sum of squared deviations divided by expected counts approximately follows a known distribution (the chi-squared distribution) whose tail probabilities are tabulated. Pearson's contribution made it possible for the first time to assess quantitatively whether observed data deviate significantly from a theoretical prediction, and the test remains a foundational tool in biology, medicine, and the natural sciences.
You have carried out a genetic cross and counted the offspring. The numbers look roughly like a 9:3:3:1 ratio — but not exactly. Is the deviation just chance, or is it real and telling you that Mendel's law does not apply (because of linkage or epistasis, say)? The chi-squared (χ²) test is the statistical tool that lets you answer this question quantitatively. OCR A-Level Biology A specification module 6.1.2(g) requires you to use the chi-squared test to analyse results from genetic crosses, including calculating the value, using degrees of freedom and critical values, and interpreting the outcome.
Key Definitions:
- Null hypothesis (H₀) — there is no significant difference between observed and expected values; any difference is due to chance.
- Observed (O) — the actual number in each category.
- Expected (E) — the number predicted by the null hypothesis.
- Degrees of freedom (df) — the number of categories minus 1.
- Critical value — the tabulated χ² value above which the null hypothesis is rejected at a chosen probability (usually p = 0.05).
- p-value — the probability that the observed deviation (or a more extreme one) would occur by chance alone if H₀ were true.
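Chi-squared is used for categorical data (counts, not measurements) arranged in discrete classes. In genetics, the typical application is testing whether offspring counts fit an expected ratio, such as a 3:1 monohybrid ratio, a 9:3:3:1 dihybrid ratio, or a 1:1:1:1 test-cross ratio.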
You should not use chi-squared on percentages or on data with very small expected counts (conventionally, each expected value should be at least 5).
H₀: the observed offspring numbers fit the expected Mendelian ratio (e.g. 9:3:3:1). Any difference is due to chance.
Multiply the total number of offspring by the proportion expected in each class. For a 9:3:3:1 ratio with 160 offspring, expected values are 90, 30, 30, 10.
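The expected-value step can be sketched in a few lines of Python. This is only an illustration: the function name `expected_counts` is an assumption made for this example, not part of any library or of the OCR specification.

```python
def expected_counts(total, ratio):
    """Split `total` offspring across classes in proportion to `ratio`."""
    parts = sum(ratio)  # e.g. 9 + 3 + 3 + 1 = 16
    return [total * r / parts for r in ratio]


# The 9:3:3:1 example from the text, with 160 offspring.
print(expected_counts(160, (9, 3, 3, 1)))  # [90.0, 30.0, 30.0, 10.0]
```

Expected values do not have to be whole numbers, so the sketch deliberately returns floats rather than rounding.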
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
For each category, compute (O − E)², divide by E, and sum across all categories.
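As a rough sketch, the statistic can be computed directly from paired lists of observed and expected counts. The function name `chi_squared` is hypothetical, and the counts are those of the Tt × Tt monohybrid cross worked through later in this lesson.

```python
def chi_squared(observed, expected):
    """Sum (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# 142 tall and 58 short observed; 150 and 50 expected for a 3:1 ratio.
stat = chi_squared([142, 58], [150, 50])
print(round(stat, 2))  # 1.71
```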
$$\mathrm{df} = n - 1$$
where n is the number of categories. For a 9:3:3:1 cross with 4 categories, df = 3. For a 3:1 monohybrid, df = 1.
Look up the critical value for df and p = 0.05 (the usual significance threshold in biology).
| df | Critical value (p = 0.05) |
|---|---|
| 1 | 3.84 |
| 2 | 5.99 |
| 3 | 7.82 |
| 4 | 9.49 |
| 5 | 11.07 |
For a 9:3:3:1 dihybrid cross, the value you need is 7.82. For a 3:1 monohybrid, it is 3.84.
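The lookup and comparison can be expressed as a small helper. This is a minimal sketch that hard-codes the p = 0.05 table above; the names `CRITICAL_VALUES_P05` and `is_significant` are assumptions made for illustration.

```python
# Critical values at p = 0.05, copied from the table above (df -> value).
CRITICAL_VALUES_P05 = {1: 3.84, 2: 5.99, 3: 7.82, 4: 9.49, 5: 11.07}


def is_significant(chi_sq, n_categories):
    """Return True if chi_sq exceeds the p = 0.05 critical value."""
    df = n_categories - 1
    return chi_sq > CRITICAL_VALUES_P05[df]


print(is_significant(1.71, 2))   # False: 1.71 < 3.84
print(is_significant(169.5, 4))  # True: 169.5 > 7.82
```

The sketch only covers df from 1 to 5, matching the table; any other df raises a `KeyError` rather than guessing a value.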
A geneticist crosses two heterozygous pea plants (Tt × Tt) and counts 200 offspring: 142 tall and 58 short.
| Class | O | E | O − E | (O − E)² | (O − E)²/E |
|---|---|---|---|---|---|
| Tall | 142 | 150 | −8 | 64 | 0.427 |
| Short | 58 | 50 | 8 | 64 | 1.280 |
χ² = 0.427 + 1.280 = 1.71
df = 2 − 1 = 1. Critical value = 3.84. Since 1.71 < 3.84, the deviation is not significant. Accept H₀: the data are consistent with a 3:1 ratio.
Cross two heterozygous pea plants (RrYy × RrYy). Out of 160 offspring observed:
| Class | O | E | O − E | (O − E)² | (O − E)²/E |
|---|---|---|---|---|---|
| Round yellow | 95 | 90 | 5 | 25 | 0.278 |
| Round green | 28 | 30 | −2 | 4 | 0.133 |
| Wrinkled yellow | 25 | 30 | −5 | 25 | 0.833 |
| Wrinkled green | 12 | 10 | 2 | 4 | 0.400 |
χ² = 0.278 + 0.133 + 0.833 + 0.400 = 1.644
df = 4 − 1 = 3. Critical value = 7.82. Since 1.644 < 7.82, the deviation is not significant. Accept H₀: the data fit a 9:3:3:1 ratio.
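The working table for this dihybrid cross can be reproduced programmatically, which is a useful way to check hand calculations. The sketch below assumes a hypothetical helper `working_table`; the class names and counts are taken from the table above.

```python
def working_table(classes, observed, expected):
    """Build rows of (class, O, E, O - E, (O - E)^2, (O - E)^2 / E)."""
    rows = []
    for name, o, e in zip(classes, observed, expected):
        diff = o - e
        rows.append((name, o, e, diff, diff ** 2, diff ** 2 / e))
    return rows


classes = ["Round yellow", "Round green", "Wrinkled yellow", "Wrinkled green"]
rows = working_table(classes, [95, 28, 25, 12], [90, 30, 30, 10])

for name, o, e, diff, sq, term in rows:
    print(f"{name:16} {o:4} {e:4} {diff:4} {sq:4} {term:6.3f}")

# The chi-squared statistic is the sum of the final column.
total = sum(row[-1] for row in rows)
print(round(total, 3))  # 1.644
```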
A dihybrid test cross (AaBb × aabb) in fruit flies is expected to give a 1:1:1:1 ratio. Out of 400 offspring:
| Class | O | E | (O − E)²/E |
|---|---|---|---|
| AB | 170 | 100 | 49.0 |
| ab | 160 | 100 | 36.0 |
| Ab | 35 | 100 | 42.25 |
| aB | 35 | 100 | 42.25 |
χ² = 49.0 + 36.0 + 42.25 + 42.25 = 169.5
df = 3. Critical value = 7.82. Since 169.5 ≫ 7.82, the deviation is very highly significant. Reject H₀: the data do not fit a 1:1:1:1 ratio — the genes are almost certainly linked. The excess of parentals (AB and ab) and deficit of recombinants (Ab and aB) is consistent with linkage.
Cross-over value = 70/400 × 100 = 17.5%.
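The cross-over value is the percentage of offspring that are recombinant. A minimal sketch, assuming you have already identified which classes are the recombinants (here Ab and aB); the function name `cross_over_value` is hypothetical.

```python
def cross_over_value(counts, recombinant_classes):
    """Percentage of offspring falling in the recombinant classes."""
    total = sum(counts.values())
    recombinants = sum(counts[c] for c in recombinant_classes)
    return 100 * recombinants / total


# Test-cross counts from the table above.
counts = {"AB": 170, "ab": 160, "Ab": 35, "aB": 35}
cov = cross_over_value(counts, ["Ab", "aB"])
print(cov)  # 17.5
```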
| p | Interpretation |
|---|---|
| p > 0.05 | Difference not significant — accept H₀ |
| p = 0.05 | Threshold — 5% chance of rejecting H₀ incorrectly |
| p < 0.05 | Significant — reject H₀ |
| p < 0.01 | Highly significant |
| p < 0.001 | Very highly significant |
Biologists conventionally use p = 0.05 as the cut-off: a result is called significant when there is less than a 5% probability that a deviation at least as large as the one observed would arise by chance if H₀ were true.
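In an exam you compare χ² with a tabulated critical value, but the underlying p-value can also be computed. The sketch below is not required by the specification; it uses a standard closed form for the upper tail of the chi-squared distribution with whole-number df, built only on Python's `math` module. The function name `chi2_p_value` is an assumption for this example.

```python
import math


def chi2_p_value(chi_sq, df):
    """Upper-tail probability P(X >= chi_sq) for integer df >= 1."""
    if df < 1 or chi_sq < 0:
        raise ValueError("need df >= 1 and chi_sq >= 0")
    half = chi_sq / 2
    # Base cases: df = 1 uses the complementary error function,
    # df = 2 is a simple exponential.
    if df % 2 == 1:
        p, k = math.erfc(math.sqrt(half)), 1
    else:
        p, k = math.exp(-half), 2
    # Step up two degrees of freedom at a time:
    # Q(x; k + 2) = Q(x; k) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1)
    while k < df:
        p += half ** (k / 2) * math.exp(-half) / math.gamma(k / 2 + 1)
        k += 2
    return p


# The tabulated critical values should give p very close to 0.05.
for df, critical in [(1, 3.84), (2, 5.99), (3, 7.82)]:
    print(df, round(chi2_p_value(critical, df), 3))
```

A χ² below the critical value gives p > 0.05, and a χ² above it gives p < 0.05, which is exactly the comparison made with the table.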
Always state the null hypothesis explicitly before calculating χ². Show your table clearly with columns for O, E, O − E, (O − E)² and (O − E)²/E. Do not forget to calculate degrees of freedom (number of classes minus 1), look up the critical value at p = 0.05, and explicitly compare your calculated χ² to it. End with a sentence interpreting the result in biological terms — "the data fit a 9:3:3:1 ratio, consistent with independent assortment" or "the data do not fit a 1:1:1:1 ratio, suggesting linkage". Do not say "accept the alternative hypothesis" — you only accept or reject the null.
The chi-squared test is built on the idea that if the observed numbers really do come from the expected distribution, the deviations (O − E) will be roughly normally distributed with a standard deviation proportional to √E. Squaring the deviations and dividing by E gives each term units of "standard deviations squared", and summing them gives a quantity (χ²) whose distribution is known and tabulated.
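This is why the test needs raw counts rather than percentages, and why each expected value should be at least 5: the approximation behind the tabulated distribution is poor when expected counts are small.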
You do not need to prove any of this, but knowing why the formula has the shape it has will help you remember it correctly under exam pressure.
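One way to see what the critical value means is to simulate many crosses in which H₀ really is true and count how often χ² exceeds 7.82 by chance. The sketch below is an illustration only, not something OCR expects you to do; the function names, the number of trials and the random seed are all arbitrary assumptions.

```python
import random


def chi_squared(observed, expected):
    """Sum (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def simulate_rejection_rate(ratio, n_offspring, critical_value,
                            n_trials=2000, seed=1):
    """Fraction of simulated crosses, generated under H0, that exceed
    the critical value purely by chance."""
    rng = random.Random(seed)
    classes = range(len(ratio))
    expected = [n_offspring * r / sum(ratio) for r in ratio]
    rejections = 0
    for _ in range(n_trials):
        # Each offspring lands in a class with probability set by the ratio.
        draws = rng.choices(classes, weights=ratio, k=n_offspring)
        observed = [draws.count(c) for c in classes]
        if chi_squared(observed, expected) > critical_value:
            rejections += 1
    return rejections / n_trials


rate = simulate_rejection_rate((9, 3, 3, 1), 160, 7.82)
print(rate)  # should come out close to 0.05
```

Because the offspring are generated from a true 9:3:3:1 ratio, every "significant" result in the simulation is a false alarm, and these should occur in roughly 5% of trials.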
A geneticist crosses white-eyed (w) female fruit flies (XʷXʷ) with red-eyed (W) males (XᵂY) — a classic sex-linked cross. All F1 females should be red-eyed heterozygotes (XᵂXʷ) and all F1 males should be white-eyed (XʷY). Crossing F1 × F1:
Expected in F2 for 800 flies:
The four classes are XᵂXʷ (red F), XʷXʷ (white F), XᵂY (red M), XʷY (white M) in a 1:1:1:1 ratio, so out of 800: 200 each.
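The 1:1:1:1 expectation can be checked by enumerating the Punnett square for the F1 × F1 cross. This is a minimal sketch; the plain-text allele labels (`"XW"`, `"Xw"`, `"Y"`) and the function name `f2_phenotype_counts` are assumptions chosen for readability.

```python
from collections import Counter
from itertools import product


def f2_phenotype_counts(female_gametes, male_gametes):
    """Count phenotype classes over every egg and sperm combination."""
    counts = Counter()
    for egg, sperm in product(female_gametes, male_gametes):
        sex = "M" if sperm == "Y" else "F"
        # Red (W) is dominant, so one XW allele is enough for red eyes.
        colour = "red" if "XW" in (egg, sperm) else "white"
        counts[f"{colour} {sex}"] += 1
    return counts


# F1 female (XW Xw) makes XW and Xw eggs; F1 male (Xw Y) makes Xw and Y sperm.
ratio = f2_phenotype_counts(["XW", "Xw"], ["Xw", "Y"])
print(dict(ratio))

# Scale the ratio up to 800 flies to get the expected numbers.
expected = {cls: 800 * n / sum(ratio.values()) for cls, n in ratio.items()}
print(expected)
```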
Suppose the observed numbers are 210 red F, 195 white F, 202 red M, 193 white M.
| Class | O | E | (O−E)²/E |
|---|---|---|---|
| Red F | 210 | 200 | 0.50 |
| White F | 195 | 200 | 0.125 |
| Red M | 202 | 200 | 0.02 |
| White M | 193 | 200 | 0.245 |
χ² = 0.89. df = 3. Critical value at p = 0.05 is 7.82. Since 0.89 ≪ 7.82, accept H₀: the data are consistent with the expected 1:1:1:1 ratio.
The chi-squared distribution depends only on the degrees of freedom (df). Critical values at p = 0.05 increase with df because more degrees of freedom permit larger total deviations under the null hypothesis. The OCR exam table provides values at p = 0.05; you should know that p = 0.05 is the conventional threshold and what it means: "if H₀ is true, there is less than a 5% probability that we would see a deviation as large or larger by chance".
The chi-squared statistic is:
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
with degrees of freedom:
$$\mathrm{df} = n - 1$$
where n is the number of phenotypic categories.
Degrees of freedom and critical values for the common ratios:

| Cross | Ratio | Categories | df | Critical value (p = 0.05) |
|---|---|---|---|---|
| Mendelian monohybrid | 3:1 | 2 | 1 | 3.84 |
| Mendelian dihybrid | 9:3:3:1 | 4 | 3 | 7.82 |
| Recessive epistasis | 9:3:4 | 3 | 2 | 5.99 |
| Dominant epistasis | 12:3:1 | 3 | 2 | 5.99 |
| Complementary | 9:7 | 2 | 1 | 3.84 |
Synoptic Links — Connects to:
- ocr-alevel-biology-genetics-inheritance / phenotypic-variation-monogenic-inheritance (monohybrid 3:1 ratio, the simplest Mendelian application of chi-squared).
- ocr-alevel-biology-genetics-inheritance / dihybrid-crosses (9:3:3:1, the most common dihybrid application).
- ocr-alevel-biology-genetics-inheritance / autosomal-linkage-crossing-over (chi-squared on test-cross data reveals linkage when the observed deviation from 1:1:1:1 is significant).
- ocr-alevel-biology-genetics-inheritance / epistasis (modified ratios: recessive epistasis 9:3:4 with df = 2, dominant 12:3:1 with df = 2, complementary 9:7 with df = 1).
- ocr-alevel-biology-biodiversity-evolution / biodiversity (chi-squared has wider biological application: testing whether observed species distribution matches expected, whether allele frequencies remain in Hardy-Weinberg equilibrium, etc.).