The chi-squared (χ²) test is a statistical test used in genetics to determine whether observed results differ significantly from expected results. It allows you to assess whether deviations from predicted Mendelian ratios are due to chance alone or whether some other factor (such as linkage or epistasis) is operating. The Edexcel specification requires you to carry out chi-squared calculations and interpret the results correctly.
When you perform a genetic cross and observe the offspring, the results rarely match the expected ratio exactly. For example, a cross expected to produce a 3:1 ratio might give 82 dominant : 18 recessive instead of a perfect 75:25 out of 100.
The question is: is this deviation from the expected ratio due to chance (random variation in small sample sizes), or does it indicate that the expected ratio is wrong (suggesting linkage, epistasis, or another factor)?
The chi-squared test provides a formal, objective method to answer this question.
Every chi-squared test begins with a null hypothesis (H₀), which states that there is no significant difference between the observed and expected results — any deviation is due to chance.
For a monohybrid cross: "There is no significant difference between the observed phenotypic ratio and the expected 3:1 ratio."
If the chi-squared test shows the deviation is too large to be due to chance, we reject the null hypothesis and conclude that the expected ratio does not fit the data.
Exam tip: Always write out the null hypothesis explicitly. Mark schemes award a mark for stating H₀ clearly.
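$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
Where O is the observed count in a category, E is the expected count in that category, and Σ means "sum over all categories".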
flowchart TD
A["State the null hypothesis"] --> B["Calculate expected values from the predicted ratio"]
B --> C["For each category: calculate (O − E)² ÷ E"]
C --> D["Sum all values to get χ²"]
D --> E["Calculate degrees of freedom: df = n − 1"]
E --> F["Look up critical value at p = 0.05"]
F --> G{"Is χ² > critical value?"}
G -->|"Yes"| H["Reject null hypothesis: significant difference"]
G -->|"No"| I["Do not reject null hypothesis: no significant difference"]
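The steps in the flowchart can be sketched in Python. This is a minimal illustration rather than anything the specification requires: the names `chi_squared_test` and `CRITICAL_P05` are assumptions made for this example, and the critical values are the standard p = 0.05 figures for 1 to 5 degrees of freedom. The data are the 82 : 18 result mentioned earlier, tested against a 3:1 ratio.

```python
# Critical values at p = 0.05 for df = 1..5 (standard chi-squared table).
CRITICAL_P05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}


def chi_squared_test(observed, ratio):
    """Goodness-of-fit test of observed counts against an expected ratio."""
    total = sum(observed)
    ratio_sum = sum(ratio)
    # Expected count = total x (ratio part / sum of ratio parts)
    expected = [total * part / ratio_sum for part in ratio]
    # One (O - E)^2 / E term per phenotype category, then sum them
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1
    critical = CRITICAL_P05[df]
    return {
        "expected": expected,
        "chi2": chi2,
        "df": df,
        "critical": critical,
        "reject_h0": chi2 > critical,
    }


result = chi_squared_test([82, 18], [3, 1])
print(round(result["chi2"], 3), result["df"], result["reject_h0"])
```

This prints `2.613 1 False`: the calculated value is below 3.841, so the 82 : 18 result is consistent with a 3:1 ratio.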
A cross between two heterozygous tall pea plants (Tt × Tt) is expected to produce a 3:1 ratio of tall : dwarf. From 120 offspring, the observed results are:
| Phenotype | Observed (O) | Expected (E) | O − E | (O − E)² | (O − E)² ÷ E |
|---|---|---|---|---|---|
| Tall | 98 | 90 | 8 | 64 | 0.711 |
| Dwarf | 22 | 30 | −8 | 64 | 2.133 |
| Total | 120 | 120 | | | χ² = 2.844 |
Expected values: 3/4 × 120 = 90 tall; 1/4 × 120 = 30 dwarf
Degrees of freedom: df = number of categories − 1 = 2 − 1 = 1
Critical value at p = 0.05 with 1 df: 3.841
Conclusion: χ² (2.844) < critical value (3.841), so we do not reject the null hypothesis. The difference between observed and expected is not statistically significant. The results are consistent with a 3:1 ratio.
A dihybrid cross (expected 9:3:3:1 ratio) produces the following offspring from 160 individuals:
| Phenotype | Observed (O) | Expected (E) | O − E | (O − E)² | (O − E)² ÷ E |
|---|---|---|---|---|---|
| A_B_ | 100 | 90 | 10 | 100 | 1.111 |
| A_bb | 28 | 30 | −2 | 4 | 0.133 |
| aaB_ | 22 | 30 | −8 | 64 | 2.133 |
| aabb | 10 | 10 | 0 | 0 | 0.000 |
| Total | 160 | 160 | | | χ² = 3.377 |
Expected values: 9/16 × 160 = 90, 3/16 × 160 = 30, 3/16 × 160 = 30, 1/16 × 160 = 10
Degrees of freedom: 4 − 1 = 3
Critical value at p = 0.05 with 3 df: 7.815
Conclusion: χ² (3.377) < 7.815, so we do not reject H₀. The 9:3:3:1 ratio is a good fit for the data.
A test cross produces the following offspring (expected 1:1:1:1 if genes are unlinked):
| Phenotype | Observed (O) | Expected (E) | O − E | (O − E)² | (O − E)² ÷ E |
|---|---|---|---|---|---|
| AB | 180 | 100 | 80 | 6400 | 64.0 |
| ab | 170 | 100 | 70 | 4900 | 49.0 |
| Ab | 30 | 100 | −70 | 4900 | 49.0 |
| aB | 20 | 100 | −80 | 6400 | 64.0 |
| Total | 400 | 400 | | | χ² = 226.0 |
Degrees of freedom: 4 − 1 = 3
Critical value at p = 0.05 with 3 df: 7.815
Conclusion: χ² (226.0) >> 7.815, so we reject the null hypothesis. The data do not fit a 1:1:1:1 ratio. The excess of parental types (AB and ab) and deficit of recombinant types (Ab and aB) strongly suggest the genes are linked.
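To see which classes drive the large χ² value, the per-category terms in the table can be recomputed with a short Python sketch. The variable names are chosen for this illustration only; the counts are the ones in the table above.

```python
observed = {"AB": 180, "ab": 170, "Ab": 30, "aB": 20}
total = sum(observed.values())
expected = total / len(observed)  # 1:1:1:1 gives equal expected counts

contributions = {}
for phenotype, o in observed.items():
    contributions[phenotype] = (o - expected) ** 2 / expected
    direction = "excess" if o > expected else "deficit"
    print(
        f"{phenotype}: O = {o}, E = {expected:.0f}, {direction}, "
        f"(O - E)^2 / E = {contributions[phenotype]:.1f}"
    )

chi2 = sum(contributions.values())
print(f"chi-squared = {chi2:.1f}")
```

Every category contributes heavily, with the two parental classes in excess and the two recombinant classes in deficit, which is the pattern the conclusion describes.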
A cross expected to give 9:3:4 (recessive epistasis) yields 195 offspring:
| Phenotype | Observed (O) | Expected ratio | Expected (E) | (O − E)² ÷ E |
|---|---|---|---|---|
| Black | 108 | 9/16 | 109.7 | 0.026 |
| Chocolate | 36 | 3/16 | 36.6 | 0.010 |
| Golden | 51 | 4/16 | 48.8 | 0.099 |
| Total | 195 | 16/16 | 195 | χ² = 0.135 |
Degrees of freedom: 3 − 1 = 2. Critical value at p = 0.05 with 2 df: 5.991.
Conclusion: χ² (0.135) << 5.991. We do not reject H₀. The data are an excellent fit for the 9:3:4 epistasis ratio, supporting recessive epistasis.
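The expected counts in this table are rounded to one decimal place before the χ² terms are calculated. The sketch below, with names chosen for this illustration, compares that approach with unrounded expected counts to show how little the rounding matters here.

```python
observed = [108, 36, 51]  # black, chocolate, golden
ratio = [9, 3, 4]
total = sum(observed)

# Unrounded expected counts: 195 x 9/16, 195 x 3/16, 195 x 4/16
exact_expected = [total * part / sum(ratio) for part in ratio]
# Expected counts as printed in the table (rounded to 1 d.p.)
rounded_expected = [109.7, 36.6, 48.8]


def chi_squared(obs, exp):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(obs, exp))


chi2_exact = chi_squared(observed, exact_expected)
chi2_rounded = chi_squared(observed, rounded_expected)
print(round(chi2_exact, 3), round(chi2_rounded, 3))
```

This prints `0.138 0.135`. Both values are far below 5.991, so the conclusion is the same either way, but carrying unrounded expected values through the calculation avoids small discrepancies of this kind.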
You must be able to use a critical values table. Here is the portion you are most likely to need:
| Degrees of freedom (df) | p = 0.10 | p = 0.05 | p = 0.02 | p = 0.01 | p = 0.001 |
|---|---|---|---|---|---|
| 1 | 2.706 | 3.841 | 5.412 | 6.635 | 10.828 |
| 2 | 4.605 | 5.991 | 7.824 | 9.210 | 13.816 |
| 3 | 6.251 | 7.815 | 9.837 | 11.345 | 16.266 |
| 4 | 7.779 | 9.488 | 11.668 | 13.277 | 18.467 |
| 5 | 9.236 | 11.070 | 13.388 | 15.086 | 20.515 |
Exam tip: The significance level used in biology is almost always p = 0.05 (5%). This means that, if the null hypothesis is true, a χ² value at least as large as the critical value would arise by chance in only 5% of cases. If χ² exceeds the critical value at p = 0.05, we reject H₀.
| p-value | Meaning |
|---|---|
| p > 0.05 | The difference is NOT significant. We do not reject H₀. Deviations can be explained by chance. |
| p < 0.05 | The difference IS significant. We reject H₀. Something other than chance is likely to be causing the deviation. |
| p < 0.01 | Highly significant — very strong evidence against H₀. |
| p < 0.001 | Extremely significant. |
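
Reading along one row of the critical values table brackets the p-value for a calculated χ². The sketch below is one possible way to do this in Python; `P_LEVELS`, `CRITICAL` and `bracket_p` are names invented for this example, and the numbers are copied from the table above.

```python
# Columns of the critical values table, left to right.
P_LEVELS = [0.10, 0.05, 0.02, 0.01, 0.001]
# One row per degrees-of-freedom value.
CRITICAL = {
    1: [2.706, 3.841, 5.412, 6.635, 10.828],
    2: [4.605, 5.991, 7.824, 9.210, 13.816],
    3: [6.251, 7.815, 9.837, 11.345, 16.266],
    4: [7.779, 9.488, 11.668, 13.277, 18.467],
    5: [9.236, 11.070, 13.388, 15.086, 20.515],
}


def bracket_p(chi2, df):
    """Bracket the p-value by finding which critical values chi2 exceeds."""
    row = CRITICAL[df]
    exceeded = [p for p, c in zip(P_LEVELS, row) if chi2 > c]
    not_exceeded = [p for p, c in zip(P_LEVELS, row) if chi2 <= c]
    if not exceeded:
        return f"p > {P_LEVELS[0]}"
    if not not_exceeded:
        return f"p < {P_LEVELS[-1]}"
    return f"{max(not_exceeded)} < p < {min(exceeded)}"


print(bracket_p(2.844, 1))  # Tt x Tt cross
print(bracket_p(226.0, 3))  # linkage test cross
```

The first call prints `0.05 < p < 0.1` (not significant at the 5% level) and the second prints `p < 0.001` (extremely significant), matching the interpretations in the table.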
flowchart TD
A["Calculate χ² value"] --> B["Compare with critical value at p = 0.05"]
B --> C{"χ² < critical value?"}
C -->|"Yes"| D["p > 0.05: Do not reject H₀<br/>No significant difference<br/>Data fit expected ratio"]
C -->|"No"| E["p < 0.05: Reject H₀<br/>Significant difference<br/>Expected ratio does not fit"]
E --> F["Investigate: linkage? epistasis? selection?"]
Degrees of freedom (df) = number of phenotypic categories − 1
| Cross type | Categories | df |
|---|---|---|
| Monohybrid (3:1) | 2 | 1 |
| Dihybrid (9:3:3:1) | 4 | 3 |
| Test cross (1:1) | 2 | 1 |
| Dihybrid test cross (1:1:1:1) | 4 | 3 |
| Epistasis (9:3:4) | 3 | 2 |
| Epistasis (9:7) | 2 | 1 |
| Codominance (1:2:1) | 3 | 2 |
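The chi-squared test is valid only when the data are actual counts (not percentages or ratios), every expected count is at least 5, and the observations are independent of one another.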
Exam mistake to avoid: Never use percentages in a chi-squared calculation. Always convert to actual counts. If the question gives percentages, multiply by the sample size first.
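A short Python sketch shows why this matters. The sample size of 400 is a hypothetical figure chosen for illustration, as are the variable names; the percentages are tested against a 3:1 ratio.

```python
def chi_squared(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


percentages = [82, 18]  # % dominant, % recessive
sample_size = 400       # hypothetical total number of offspring

# Wrong: treating the percentages as if they were counts out of 100
wrong = chi_squared(percentages, [75, 25])

# Right: convert to counts first, then derive expected counts from 3:1
counts = [p * sample_size / 100 for p in percentages]
expected = [sample_size * 3 / 4, sample_size * 1 / 4]
right = chi_squared(counts, expected)

print(round(wrong, 3), round(right, 3))
```

This prints `2.613 10.453`. With 1 degree of freedom the critical value is 3.841, so using percentages would wrongly suggest no significant difference, while the actual counts give a significant result.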
| Situation | Expected ratio to test |
|---|---|
| Monohybrid cross (heterozygous × heterozygous) | 3:1 |
| Monohybrid test cross | 1:1 |
| Dihybrid cross (both heterozygous) | 9:3:3:1 |
| Dihybrid test cross | 1:1:1:1 |
| Codominance (heterozygous × heterozygous) | 1:2:1 |
| Suspected recessive epistasis | 9:3:4 |
| Suspected complementary genes | 9:7 |
| Suspected dominant epistasis | 12:3:1 |
The chi-squared test compares observed and expected frequencies to determine whether deviations are due to chance. The test requires a null hypothesis, the χ² formula, degrees of freedom, and a critical values table. If χ² exceeds the critical value at p = 0.05, the null hypothesis is rejected. This test is essential for confirming Mendelian ratios and detecting phenomena such as linkage and epistasis in genetic crosses.
The chi-squared (χ²) goodness-of-fit test is the central inferential tool of A-Level genetics — the formal bridge between observed progeny counts and the Mendelian ratios of Topics 4–7. Genetic crosses never produce the expected ratio exactly; sampling variation guarantees a deviation. The candidate's task is to ask, quantitatively, whether the deviation is plausibly explained by chance alone or is large enough to reject the null hypothesis of agreement with the predicted ratio. The test takes one numerical input — χ² = Σ(O − E)²/E — and one decision parameter — degrees of freedom (df) — and returns a binary verdict against a tabulated critical value at p = 0.05. The maths is light; the mark-scheme literacy is heavy. Examiners reward the disciplined six-line protocol — state H₀ → tabulate O and E → compute (O − E)²/E per category → sum → look up critical value at correct df → conclude in words against p = 0.05 — and penalise omissions ruthlessly. Every Mendelian ratio in lessons 4–7 is ultimately a null hypothesis that χ² either fails to reject (consistent with simple Mendelian inheritance) or rejects (flagging linkage, epistasis, lethality, or selection — diagnosed further in lessons 9 and 10).
This material sits in Edexcel 9BI0 Topic 8 (Origins of Genetic Variation) under the inheritance and statistics strand. Required content covers: the null hypothesis (H₀) stated explicitly; the formula χ² = Σ(O − E)²/E; expected values as N × ratio fraction (for 9 : 3 : 3 : 1, multiply total by 9/16, 3/16, 3/16, 1/16); df = categories − 1 for a single-sample goodness-of-fit; the critical values at p = 0.05 (3.84 for df = 1, 5.99 for df = 2, 7.81 for df = 3, 9.49 for df = 4); the decision rule (reject H₀ at p = 0.05 when calculated χ² ≥ critical value, otherwise fail to reject); explicit conclusion phrasing in words tying H₀, χ², critical value, df, p threshold and biological interpretation together; the assumptions that data are counts (not percentages), that every expected count ≥ 5 and that observations are independent; and the role of the test in separating chance deviation from structural deviation flagging linkage, epistasis, codominance proportions or departures from Hardy–Weinberg. Synoptic links run back to lesson 4 (3 : 1 and 1 : 1), lesson 5 (9 : 3 : 3 : 1 and 1 : 1 : 1 : 1), lesson 6 (1 : 2 : 1) and lesson 7 (1 : 1 : 1 : 1 linkage diagnostic), and forwards to lesson 10 (Hardy–Weinberg goodness-of-fit). Refer to the official Pearson Edexcel 9BI0 specification document for exact wording.
Question (8 marks):
(a) A self-cross between two doubly heterozygous Pisum sativum plants (RrYy × RrYy) is performed. Mendel's second law predicts a 9 : 3 : 3 : 1 phenotypic ratio for Round-Yellow : Round-green : wrinkled-Yellow : wrinkled-green. From a total of 640 F₂ progeny, the observed counts are: Round-Yellow = 380, Round-green = 110, wrinkled-Yellow = 105, wrinkled-green = 45. State a suitable null hypothesis, derive the expected counts, perform a chi-squared goodness-of-fit test and reach an explicit conclusion at p = 0.05. (6)
(b) Explain, with reference to df, why the critical value at p = 0.05 for this cross is 7.81 and not 3.84. (1)
(c) State one assumption of the chi-squared test that the candidate must verify before reporting the verdict, and confirm whether the assumption holds for the data above. (1)
Solution with mark scheme:
(a) M1 (AO1.2) — null hypothesis. H₀: there is no significant difference between the observed phenotypic frequencies and those predicted by a 9 : 3 : 3 : 1 ratio; any deviation is due to chance.
A1 (AO2.2) — derive expected counts. With N = 640 and ratio fractions 9/16, 3/16, 3/16, 1/16:
| Category | Ratio fraction | Expected (E) = N × fraction |
|---|---|---|
| Round-Yellow | 9/16 | 640 × 9/16 = 360 |
| Round-green | 3/16 | 640 × 3/16 = 120 |
| wrinkled-Yellow | 3/16 | 640 × 3/16 = 120 |
| wrinkled-green | 1/16 | 640 × 1/16 = 40 |
| Total | 16/16 | 640 ✓ |
M1 (AO2.2) — full χ² table. Compute (O − E)²/E per category and sum:
| Category | O | E | O − E | (O − E)² | (O − E)²/E |
|---|---|---|---|---|---|
| Round-Yellow | 380 | 360 | +20 | 400 | 400/360 = 1.111 |
| Round-green | 110 | 120 | −10 | 100 | 100/120 = 0.833 |
| wrinkled-Yellow | 105 | 120 | −15 | 225 | 225/120 = 1.875 |
| wrinkled-green | 45 | 40 | +5 | 25 | 25/40 = 0.625 |
| Total | 640 | 640 | | | χ² = 4.444 |
A1 (AO2.2) — df and critical value. df = categories − 1 = 4 − 1 = 3. The critical value at p = 0.05 with df = 3 is 7.81.
M1 (AO3.1) — decision rule. Calculated χ² = 4.444 < 7.81 (critical), so the calculated value does not reach the critical threshold.
A1 (AO3.2) — conclusion in words. We fail to reject the null hypothesis at p = 0.05. The observed counts are not significantly different from a 9 : 3 : 3 : 1 ratio. The data are consistent with two unlinked loci showing complete dominance and independent assortment — no need to invoke linkage, epistasis or selection.
(b) M1 (AO1.2) — df determines the critical value. Four categories give df = k − 1 = 3, not 1. The p = 0.05 critical for df = 3 is 7.81; 3.84 is the df = 1 value (2-category tests such as monohybrid 3 : 1).
(c) M1 (AO1.2) — assumption check. A required assumption is every E ≥ 5. Smallest E = 40 (wrinkled-green) ≥ 5, so the assumption holds and the χ² verdict is valid. (Also acceptable: counts not percentages; independence of observations.)
Total: 8 marks (M5 A3).
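The arithmetic in this solution can be checked with a few lines of Python. This is a sketch for self-checking, not part of the mark scheme; the variable names are chosen for this example, and the critical value is the 7.81 quoted above.

```python
observed = {
    "Round-Yellow": 380,
    "Round-green": 110,
    "wrinkled-Yellow": 105,
    "wrinkled-green": 45,
}
ratio = {
    "Round-Yellow": 9,
    "Round-green": 3,
    "wrinkled-Yellow": 3,
    "wrinkled-green": 1,
}

total = sum(observed.values())
ratio_sum = sum(ratio.values())
expected = {k: total * part / ratio_sum for k, part in ratio.items()}

# Assumption from part (c): every expected count must be at least 5
assumption_ok = all(e >= 5 for e in expected.values())

chi2 = sum((observed[k] - expected[k]) ** 2 / expected[k] for k in observed)
df = len(observed) - 1
critical = 7.81  # p = 0.05, df = 3, as quoted in the solution

print(expected)
print(round(chi2, 3), df, assumption_ok, chi2 >= critical)
```

The second line of output is `4.444 3 True False`: the expected-count assumption holds and the calculated value does not reach the critical value, in agreement with the written conclusion.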
Question (6 marks): In a Drosophila dihybrid test cross of body colour and wing length, an unlinked-genes hypothesis predicts a 1 : 1 : 1 : 1 phenotypic ratio. From a total of 400 progeny the observed counts are AB = 165, ab = 155, Ab = 45, aB = 35. (i) State a suitable null hypothesis; (ii) carry out a chi-squared goodness-of-fit test on the data, showing the full O / E / (O − E)² / (O − E)²/E table; (iii) state df, look up the critical value at p = 0.05 and conclude whether the loci are linked.
Mark scheme decomposition by AO: