Hypothesis Testing for the Normal Distribution

This lesson covers hypothesis testing for a population mean using the normal distribution as required by the Edexcel A-Level Mathematics specification (9MA0), Paper 3 Section A -- Statistics. You need to test whether a sample provides evidence of a change in the population mean, using a test statistic and critical values.

When to Use a Normal Distribution Test

Use this test when:

The population is normally distributed (or the sample is large enough for the Central Limit Theorem to make the sample mean approximately normal).
The population standard deviation (sigma) is known.
You are testing a claim about the population mean (mu).

The test statistic

Z = (X-bar - mu0) / (sigma / sqrt(n))

where X-bar is the sample mean, mu0 is the hypothesised mean, sigma is the known population SD, and n is the sample size.
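The formula translates directly into a few lines of Python. The sketch below is illustrative only: the function name z_statistic and the numbers passed to it are assumptions made for this example, not values from the lesson.

```python
from math import sqrt

def z_statistic(sample_mean, mu0, sigma, n):
    """Standardise a sample mean under H0: mu = mu0, with sigma known."""
    standard_error = sigma / sqrt(n)
    return (sample_mean - mu0) / standard_error

# Hypothetical data: sample mean 52, hypothesised mean 50, sigma = 8, n = 16
z = z_statistic(52, 50, 8, 16)
print(z)  # standard error is 8/4 = 2, so z = (52 - 50)/2 = 1.0
```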

Setting Up the Hypotheses

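H0: mu = mu0 (the null hypothesis), tested against one of the following alternatives, chosen from the wording of the question:
H1: mu < mu0 (one-tailed, lower)
H1: mu > mu0 (one-tailed, upper)
H1: mu ≠ mu0 (two-tailed)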

The Standard Error

Standard error = sigma / sqrt(n)

As n increases, the standard error decreases -- larger samples give more precise estimates.
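To see how quickly the standard error shrinks, the short sketch below tabulates sigma / sqrt(n) for a few sample sizes. The value sigma = 100 and the sample sizes are chosen purely for illustration.

```python
from math import sqrt

sigma = 100  # illustrative population standard deviation

# Standard error for several sample sizes
standard_errors = {n: sigma / sqrt(n) for n in (1, 4, 25, 100)}

for n, se in standard_errors.items():
    print(f"n = {n:>3}: standard error = {se}")
```

Quadrupling the sample size only halves the standard error, because n sits under a square root.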

One-Tailed Test (Lower) -- Worked Example

A manufacturer claims mean lifetime is 1200 hours (sigma = 100). A sample of 25 gives mean 1160 hours. Test at 5%.

H0: mu = 1200, H1: mu < 1200.

SE = 100/sqrt(25) = 20. Z = (1160 - 1200)/20 = -2.0.

Critical value (5%, one-tailed): -1.6449.

-2.0 < -1.6449 --> reject H0. Sufficient evidence the mean is less than 1200 hours.

p-value: P(Z < -2.0) = 0.0228 < 0.05. Same conclusion.
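The figures in this example can be checked with Python's standard library. This is a minimal sketch, assuming Python 3.8 or later for statistics.NormalDist; the variable names are chosen for this example.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n, sample_mean, alpha = 1200, 100, 25, 1160, 0.05

z = (sample_mean - mu0) / (sigma / sqrt(n))   # -2.0
critical_value = NormalDist().inv_cdf(alpha)  # lower 5% point, about -1.6449
p_value = NormalDist().cdf(z)                 # P(Z < -2.0), about 0.0228

# Lower-tailed test: reject H0 when z falls below the critical value
reject_h0 = z < critical_value
print(z, round(critical_value, 4), round(p_value, 4), reject_h0)
```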

One-Tailed Test (Upper) -- Worked Example

Old mean score was 65 (sigma = 10). New method with n = 40 gives mean 68. Test at 5%.

H0: mu = 65, H1: mu > 65.

SE = 10/sqrt(40) = 1.581. Z = (68 - 65)/1.581 = 1.897.

Critical value: 1.6449. 1.897 > 1.6449 --> reject H0. Sufficient evidence the mean has increased.
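The same decision can be reached without standardising, by working with the distribution of the sample mean under H0 directly. The sketch below assumes Python 3.8 or later; the p-value it computes is not quoted in the lesson, so treat it as a cross-check only.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n, sample_mean, alpha = 65, 10, 40, 68, 0.05

standard_error = sigma / sqrt(n)                 # about 1.581
sampling_dist = NormalDist(mu0, standard_error)  # distribution of X-bar under H0

# Upper-tailed p-value: probability of a sample mean of 68 or more under H0
p_value = 1 - sampling_dist.cdf(sample_mean)     # roughly 0.029
reject_h0 = p_value < alpha
print(round(p_value, 3), reject_h0)
```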

Two-Tailed Test -- Worked Example

A company claims mean volume is 330 ml (sigma = 5). Inspector tests 50 bottles, gets mean 328.5. Test at 1%.

H0: mu = 330, H1: mu ≠ 330. alpha/2 = 0.005.

SE = 5/sqrt(50) = 0.707. Z = (328.5 - 330)/0.707 = -2.121.

Critical values: +/-2.5758. -2.121 is between -2.5758 and 2.5758.

Do not reject H0. Insufficient evidence at 1% level that the mean differs from 330 ml.
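For a two-tailed test the p-value counts both tails, so the one-tail probability is doubled before comparing with alpha. A minimal sketch of that calculation for the bottle example, assuming Python 3.8 or later:

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n, sample_mean, alpha = 330, 5, 50, 328.5, 0.01

z = (sample_mean - mu0) / (sigma / sqrt(n))           # about -2.121
critical_value = NormalDist().inv_cdf(1 - alpha / 2)  # about 2.5758

# Two-tailed p-value: double the probability in the tail beyond |z|
p_value = 2 * NormalDist().cdf(-abs(z))               # roughly 0.034

reject_h0 = abs(z) > critical_value
print(round(z, 3), round(critical_value, 4), round(p_value, 3), reject_h0)
```

Both routes agree: |z| is below 2.5758 and the p-value is above 0.01, so H0 is not rejected.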

Critical Values Summary

Significance level	One-tailed	Two-tailed
10%	+/-1.2816	+/-1.6449
5%	+/-1.6449	+/-1.9600
2.5%	+/-1.9600	+/-2.2414
1%	+/-2.3263	+/-2.5758

Exam Tip: Memorise z = 1.6449 (5% one-tailed), z = 1.9600 (5% two-tailed), z = 2.5758 (1% two-tailed).
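The table values are percentage points of the standard normal distribution, so they can be regenerated from the inverse CDF. The sketch below assumes Python 3.8 or later and prints the positive critical value in each case; the dictionary name critical_values is just for this example.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

critical_values = {}
for level in (0.10, 0.05, 0.025, 0.01):
    one_tailed = std_normal.inv_cdf(1 - level)      # whole level in one tail
    two_tailed = std_normal.inv_cdf(1 - level / 2)  # level split across two tails
    critical_values[level] = (one_tailed, two_tailed)
    print(f"{level:.1%}  one-tailed {one_tailed:.4f}  two-tailed {two_tailed:.4f}")
```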

Testing with a Single Observation

When n = 1: Z = (x - mu0) / sigma (standard error = sigma).

Example

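Fish weight X ~ N(2.5, 0.09), so the variance is 0.09 and sigma = sqrt(0.09) = 0.3. A fish weighs 3.2 kg. Test at 5% whether the mean is higher than 2.5 kg.

H0: mu = 2.5, H1: mu > 2.5.

Z = (3.2 - 2.5)/0.3 = 2.333. Critical value 1.6449. 2.333 > 1.6449 --> reject H0. Sufficient evidence at the 5% level that the mean weight is higher than 2.5 kg.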
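A common trap here is that N(2.5, 0.09) gives the variance, while the test statistic needs the standard deviation. The sketch below makes the square root explicit; it assumes Python 3.8 or later, and the p-value is an extra check that the lesson does not quote.

```python
from math import sqrt
from statistics import NormalDist

mu0, variance, observation, alpha = 2.5, 0.09, 3.2, 0.05

sigma = sqrt(variance)             # 0.3: take the square root of the variance
z = (observation - mu0) / sigma    # n = 1, so the standard error is sigma itself
p_value = 1 - NormalDist().cdf(z)  # upper tail, roughly 0.01

reject_h0 = z > NormalDist().inv_cdf(1 - alpha)
print(round(z, 3), round(p_value, 4), reject_h0)
```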

Interpreting Results in Context

Always relate the conclusion to the original context. Include:

Whether you reject or do not reject H0.
The significance level.
A contextual statement about the population parameter.

Good examples:

"There is sufficient evidence at the 5% significance level to conclude that the new fertiliser increases the mean yield of wheat."

"There is insufficient evidence at the 1% significance level to conclude that the mean delivery time has changed from 3 days."

Connecting to the Binomial Hypothesis Test

Both tests follow the same five-step framework:

State H0 and H1.
State the significance level.
Calculate the test statistic or p-value.
Compare with critical value or alpha.
Conclude in context.

The difference:

Binomial: tests a proportion p. Uses X ~ B(n, p) under H0.
Normal: tests a mean mu. Uses Z = (X-bar - mu0) / (sigma/sqrt(n)).
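The parallel between the two tests can be seen by computing an upper-tailed p-value for each. This is a sketch only: both function names and both data sets are hypothetical, invented for this comparison, and it assumes Python 3.8 or later.

```python
from math import comb, sqrt
from statistics import NormalDist

def binomial_upper_p_value(k, n, p):
    """P(X >= k) for X ~ B(n, p): upper-tailed binomial test."""
    return sum(comb(n, r) * p**r * (1 - p)**(n - r) for r in range(k, n + 1))

def normal_upper_p_value(sample_mean, mu0, sigma, n):
    """P(Z >= z) for the standardised sample mean: upper-tailed normal test."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    return 1 - NormalDist().cdf(z)

alpha = 0.05

# Hypothetical binomial data: 15 successes in 20 trials, H0: p = 0.5, H1: p > 0.5
p_binomial = binomial_upper_p_value(15, 20, 0.5)

# Hypothetical normal data: mean 52 from n = 16, sigma = 8, H0: mu = 50, H1: mu > 50
p_normal = normal_upper_p_value(52, 50, 8, 16)

for name, p in (("binomial", p_binomial), ("normal", p_normal)):
    decision = "reject H0" if p < alpha else "do not reject H0"
    print(f"{name}: p-value = {p:.4f}, {decision}")
```

Steps 1, 2, 4 and 5 are identical in both cases; only the calculation in step 3 changes with the distribution.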

Summary

Test statistic: Z = (X-bar - mu0) / (sigma/sqrt(n)).
Standard error = sigma/sqrt(n).
Compare Z with critical value, or p-value with alpha.
One-tailed 5%: +/-1.6449. Two-tailed 5%: +/-1.9600.
Always conclude in context. Never say "accept H0".

A-Level Deep Dive: Hypothesis Testing for the Normal Distribution

Spec mapping

Edexcel 9MA0-03 specification section 8 (Statistical hypothesis testing), sub-strands 8.1, 8.2 and 8.3, covers three requirements: conduct a statistical hypothesis test for the mean of a normal distribution with known, given or assumed variance and interpret the results in context; extend ideas of hypothesis testing to test for zero correlation; and understand and apply the language of statistical hypothesis testing (refer to the official specification document for exact wording). This sub-strand sits in Paper 3 (Statistics and Mechanics) but builds directly on Year 1 Section 6 (Statistical distributions, Normal distribution) and Year 1 Section 7 (Hypothesis testing for the binomial proportion). The Edexcel formula booklet provides the standard normal CDF tables and the critical-value tables for the product-moment correlation coefficient (PMCC); the Z-test statistic itself must be constructed by the candidate.

Worked example with full mark scheme

Question (8 marks):

A factory fills jars labelled 500 g. The fill weight $X$ grams is modelled as $X \sim N(\mu, 4^2)$ with $\sigma = 4$ taken as known from long-run process data. A random sample of $n = 25$ jars has mean $\bar{x} = 502.1$ g. Test, at the 5% significance level, whether the mean fill weight differs from 500 g. (8)

Solution with mark scheme:

Step 1 — state hypotheses.

$H_0: \mu = 500$ , $H_1: \mu \neq 500$ (two-tailed, since we are testing whether the mean differs from 500).

B1 — both hypotheses correctly stated, in terms of the population mean $\mu$ (not $\bar{x}$ ). A common slip is writing $H_0: \bar{x} = 500$ — the hypothesis is always about the parameter, never about the sample statistic.

Step 2 — state the distribution of the sample mean under $H_0$ .

Under $H_0$ , $\bar{X} \sim N\left(500, \dfrac{4^2}{25}\right) = N(500, 0.64)$ , so the standard error is $\sigma/\sqrt{n} = 4/5 = 0.8$ .

M1 — correct sampling distribution of the mean with variance $\sigma^2/n$ , not $\sigma^2$ .

Step 3 — compute the test statistic.

$z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \dfrac{502.1 - 500}{0.8} = \dfrac{2.1}{0.8} = 2.625$

M1 — substitution into the Z-formula. A1 — value $z = 2.625$ to at least 3 s.f.

Step 4 — identify the critical region.

Two-tailed test at 5%: critical values are $\pm z_{0.025} = \pm 1.96$ . Reject $H_0$ if $|z| > 1.96$ .

B1 — correct critical value(s) with the two-tailed split ( $2.5\%$ in each tail, not $5\%$ ).

Step 5 — compare and conclude.

Since $2.625 > 1.96$ , the test statistic lies in the critical region. Reject $H_0$ .

M1 — valid comparison of $z$ with the critical value (or equivalently $p$ with $\alpha$ ).

A1 — conclusion in context: "There is sufficient evidence at the 5% level to suggest that the mean fill weight differs from 500 g."

A1 — explicit reference to context (jars / fill weight), not a bare "reject $H_0$ ".

Total: 8 marks (B2 M3 A3, split as shown).
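The comparison in Step 5 can equally be made with a p-value, as the M1 note allows. A sketch of that route, assuming a table value of $\Phi(2.625) \approx 0.9957$ read by interpolation (so the final digit is approximate):

$$p = 2\,P(Z > 2.625) = 2\bigl(1 - \Phi(2.625)\bigr) \approx 2(1 - 0.9957) = 0.0086 < 0.05$$

Since $p < 0.05$, $H_0$ is rejected, which matches the critical-value conclusion.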

Specimen question modelled on the Edexcel 9MA0 Paper 3 format

Question (6 marks): The reaction time $T$ seconds of trained athletes to a starting signal is modelled as $T \sim N(\mu, 0.05^2)$ . A coach claims a new training programme reduces the mean reaction time below the historical value of $0.18$ s. A sample of $n = 16$ athletes after the programme gives $\bar{t} = 0.165$ s. Test the coach's claim at the 1% significance level.

Mark scheme decomposition by AO:

B1 (AO2.5) — $H_0: \mu = 0.18$ , $H_1: \mu < 0.18$ (one-tailed lower).
M1 (AO1.1b) — sampling distribution $\bar{T} \sim N(0.18, 0.05^2/16)$ under $H_0$ , standard error $0.0125$ .
M1 (AO1.1b) — test statistic $z = (0.165 - 0.18)/0.0125 = -1.2$ .
B1 (AO1.1a) — critical value for one-tailed 1% lower test is $z = -2.3263$ .
M1 (AO2.2a) — comparison: $-1.2 > -2.3263$ , so $z$ is not in the critical region.
A1 (AO3.2a) — conclusion in context: "Insufficient evidence at the 1% level that the new programme reduces mean reaction time below 0.18 s."

Total: 6 marks split AO1 = 3, AO2 = 2, AO3 = 1. Notice the AO3 mark for a contextual conclusion — examiners reserve AO3 marks for the candidate who connects the statistical decision back to the original claim under test.
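The critical region can also be expressed in terms of the sample mean itself, which shows how far $\bar{t}$ would have had to fall for the coach's claim to be supported. A minimal sketch, assuming Python 3.8 or later; the name critical_mean is chosen for this example.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n, alpha = 0.18, 0.05, 16, 0.01
sample_mean = 0.165

standard_error = sigma / sqrt(n)  # 0.0125

# Lower 1% point of the sampling distribution of the mean under H0
critical_mean = NormalDist(mu0, standard_error).inv_cdf(alpha)  # about 0.1509

# Lower-tailed test: reject H0 only if the sample mean is below this value
reject_h0 = sample_mean < critical_mean
print(round(critical_mean, 4), reject_h0)
```

The observed mean of 0.165 s is above the critical mean, so it is not in the critical region, in agreement with the mark scheme.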

Synoptic links

Connects to:

Year 1 Section 6 — Normal distribution: the test relies on the linearity property $\bar{X} = \tfrac{1}{n}\sum X_i \sim N(\mu, \sigma^2/n)$ , which is exact when each $X_i$ is normal. Computing $P(\bar{X} > 502.1)$ uses the standard normal table from Year 1.
Year 1 Section 7 — Binomial hypothesis test: the logic — null/alternative, significance level, critical region, conclusion — is identical. Only the test statistic and its distribution change. A common Paper 3 stem mixes both: a binomial test in (a), a normal-mean test in (b).
Year 2 Section 4 — Correlation: the PMCC test $H_0: \rho = 0$ vs $H_1: \rho \neq 0$ uses critical $r$ -values from the formula booklet. Same hypothesis-testing framework, different statistic.
Central limit theorem (background, not examined directly): even when $X_i$ is not normal, $\bar{X}$ is approximately normal for large $n$ . The 9MA0 spec assumes normality is given — but understanding why the test still works for non-normal $X$ is the conceptual bridge to A-level Further Maths and undergraduate statistics.
Statistical inference (whole topic): the Z-test is the simplest example of a likelihood-based decision rule. Confidence intervals $\bar{x} \pm z^* \cdot \sigma/\sqrt{n}$ are the dual of the two-tailed test: not rejecting $H_0$ at level $\alpha$ is equivalent to $\mu_0$ lying inside the $(1-\alpha)$ confidence interval.
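The link between the confidence interval and the two-tailed test can be checked numerically with the jar data from the worked example ($\bar{x} = 502.1$, $\sigma = 4$, $n = 25$, $\mu_0 = 500$). This is a sketch assuming Python 3.8 or later; confidence intervals are background here rather than part of the 9MA0 content.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n, sample_mean, alpha = 500, 4, 25, 502.1, 0.05

standard_error = sigma / sqrt(n)              # 0.8
z_star = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96

# 95% confidence interval for mu
lower = sample_mean - z_star * standard_error
upper = sample_mean + z_star * standard_error
mu0_inside = lower <= mu0 <= upper

# Two-tailed test decision at the 5% level
z = (sample_mean - mu0) / standard_error
reject_h0 = abs(z) > z_star

print(round(lower, 3), round(upper, 3), mu0_inside, reject_h0)
```

Here 500 lies outside the interval and the test rejects $H_0$; the two statements always go together.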

Mark-scheme literacy

Hypothesis-testing questions on 9MA0-03 split AO marks roughly as:

AO	Typical share	Earned by
AO1 (knowledge / procedure)	50–60%	Stating hypotheses correctly, computing the standard error, evaluating the Z statistic, identifying critical values from tables
AO2 (reasoning / interpretation)	25–35%	Choosing one- vs two-tailed; comparing test statistic to critical region; recognising the sampling distribution
AO3 (problem-solving / modelling)	10–20%	Conclusion in context; assessing whether the modelling assumption (known $\sigma$ , normal $X$ ) is reasonable

Examiner-rewarded phrasing: "Under $H_0$ , $\bar{X} \sim N(\mu_0, \sigma^2/n)$ "; "There is sufficient/insufficient evidence at the $\alpha\%$ level to support the claim that …"; "The test statistic lies in / does not lie in the critical region." Phrases that lose marks: "accept $H_0$ " (the correct phrase is do not reject); " $H_0$ is true" (we never prove $H_0$ ); a bare "reject $H_0$ " with no reference to the original context (loses the contextual A1).

A specific Edexcel pattern: when the question gives the significance level as a percentage and the test is two-tailed, candidates must split the level. For example, $5\%$ two-tailed means $2.5\%$ in each tail, with critical values $\pm 1.96$. Forgetting to halve the level is a common and heavily penalised slip.

Grade-band model answers

3-mark question

Question: The IQ scores of pupils at a school are modelled as $N(\mu, 15^2)$ . A sample of $n = 36$ pupils has mean $\bar{x} = 105$ . State the distribution of $\bar{X}$ under $H_0: \mu = 100$ and compute the test statistic $z$ .

Grade C response (~180 words):

Under $H_0$ , the sample mean is normally distributed with mean 100 and variance $15^2/36 = 6.25$ . So $\bar{X} \sim N(100, 6.25)$ .

The test statistic is $z = (105 - 100)/\sqrt{6.25} = 5/2.5 = 2$ .
