This lesson extends the hypothesis testing framework to include tests for the mean of a normal distribution and the correlation coefficient. These are key topics at A-Level and bring together the concepts of the normal distribution, sampling, and statistical inference.
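If $X \sim N(\mu, \sigma^2)$ and $\sigma^2$ is known, we can test hypotheses about $\mu$ using the sample mean $\bar{X}$.
If $X \sim N(\mu, \sigma^2)$, then the sample mean of a random sample of size $n$ follows:
$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$
The standard error of the mean is:
$SE = \frac{\sigma}{\sqrt{n}}$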
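State hypotheses: $H_0: \mu = \mu_0$ against $H_1: \mu \neq \mu_0$ (two-tailed), or $H_1: \mu < \mu_0$ or $H_1: \mu > \mu_0$ (one-tailed).
Assume $H_0$ is true: $\bar{X} \sim N\left(\mu_0, \frac{\sigma^2}{n}\right)$
Calculate the test statistic: $z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$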
Compare with the critical value or find the p-value.
Make a decision and state the conclusion in context.
Example: A machine fills bottles with a mean of 500 ml (σ=5 ml). A sample of 25 bottles has a mean of 498.2 ml. Test at 5% whether the mean has decreased.
H0:μ=500, H1:μ<500 (one-tailed)
$z = \frac{498.2 - 500}{5/\sqrt{25}} = \frac{-1.8}{1} = -1.8$
Critical value at 5% (one-tailed): z=−1.6449
Since −1.8<−1.6449, reject H0. There is sufficient evidence at the 5% level to suggest the mean volume has decreased.
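The bottle-filling example above can be checked numerically. The following is a minimal sketch using only Python's standard library; the function name `z_test_lower_tail` and its parameter names are illustrative choices, not part of the lesson.

```python
from math import sqrt
from statistics import NormalDist


def z_test_lower_tail(sample_mean, mu0, sigma, n, alpha):
    """One-tailed (lower) z-test for a normal mean with known sigma."""
    standard_error = sigma / sqrt(n)
    z = (sample_mean - mu0) / standard_error
    # For a lower-tail test the critical value is the alpha quantile of N(0, 1).
    critical_value = NormalDist().inv_cdf(alpha)
    reject_h0 = z < critical_value
    return z, critical_value, reject_h0


# Values from the worked example: mu0 = 500, sigma = 5, n = 25, sample mean 498.2.
z, critical_value, reject_h0 = z_test_lower_tail(498.2, 500, 5, 25, 0.05)
print(f"z = {z:.4f}, critical value = {critical_value:.4f}, reject H0: {reject_h0}")
```

The code only reproduces the arithmetic and the formal comparison. The contextual conclusion still has to be written in words.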
Exam Tip: Always state the distribution of $\bar{X}$ under $H_0$ before calculating the test statistic. This shows the examiner you understand the sampling distribution, which is worth method marks.
The Central Limit Theorem states that for a large sample (typically $n \geq 30$), the distribution of $\bar{X}$ is approximately normal regardless of the underlying distribution:
$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$ (approximately)
This allows us to apply normal-based hypothesis tests even when the population is not normally distributed, provided the sample is large enough.
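A short simulation can make the Central Limit Theorem more concrete. The sketch below draws repeated samples from a skewed (exponential) population and checks that the sample means behave roughly like $N(\mu, \sigma^2/n)$. The population, sample size, seed and number of trials are all assumptions chosen for illustration.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(1)  # fixed seed so the sketch is repeatable

MU = 2.0        # mean of the exponential population (assumed for illustration)
SIGMA = 2.0     # an exponential distribution has standard deviation equal to its mean
N = 40          # sample size, above the n >= 30 rule of thumb
TRIALS = 5000   # number of repeated samples

# Each entry is the mean of one random sample of size N.
sample_means = [
    sum(random.expovariate(1 / MU) for _ in range(N)) / N
    for _ in range(TRIALS)
]

se = SIGMA / sqrt(N)
# Under a normal model about 95% of sample means lie within 1.96 standard errors of MU.
within = sum(abs(m - MU) < 1.96 * se for m in sample_means) / TRIALS

print(f"mean of sample means: {mean(sample_means):.3f} (theory: {MU})")
print(f"sd of sample means:   {stdev(sample_means):.3f} (theory: {se:.3f})")
print(f"proportion within 1.96 SE: {within:.3f} (normal model: 0.950)")
```

The agreement is approximate and improves as the sample size grows; a heavily skewed population may need a larger $n$ than 30.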
To test whether a population correlation coefficient ρ is significantly different from zero:
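State hypotheses: $H_0: \rho = 0$ against $H_1: \rho \neq 0$ (two-tailed), or $H_1: \rho > 0$ or $H_1: \rho < 0$ (one-tailed).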
Calculate the sample PMCC r.
Compare ∣r∣ with the critical value from the PMCC table for the appropriate n and α.
State conclusion in context.
Example: A sample of 12 pairs of data gives r=0.65. Test at 5% whether there is positive correlation.
H0:ρ=0, H1:ρ>0 (one-tailed)
Critical value for n=12 at 5% (one-tailed): 0.4973
Since 0.65>0.4973, reject H0. There is sufficient evidence at the 5% level of positive correlation.
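The comparison step in the PMCC test can be written as a small decision function. This is a sketch only: the name `pmcc_test` and the `alternative` labels are hypothetical, and the critical value must still be read from the table for the given $n$ and significance level. For a one-tailed test the sketch also checks that the sign of $r$ matches the direction of $H_1$.

```python
def pmcc_test(r, critical_value, alternative):
    """Compare a sample PMCC with a tabulated critical value.

    alternative: "greater" (H1: rho > 0), "less" (H1: rho < 0)
                 or "two-sided" (H1: rho != 0).
    critical_value: the positive table entry for the given n and alpha
                    (use the two-tailed column for a two-sided test).
    Returns True when H0 is rejected.
    """
    if not -1 <= r <= 1:
        raise ValueError("r must lie between -1 and 1")
    if alternative == "greater":
        return r > critical_value
    if alternative == "less":
        return r < -critical_value
    if alternative == "two-sided":
        return abs(r) > critical_value
    raise ValueError(f"unknown alternative: {alternative}")


# Values from the example: n = 12, r = 0.65, one-tailed 5% critical value 0.4973.
reject_h0 = pmcc_test(0.65, 0.4973, "greater")
print("Reject H0" if reject_h0 else "Do not reject H0")
```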
| Significance Level | Confidence Level | Meaning |
|---|---|---|
| 10% | 90% | Weak evidence required to reject H0 |
| 5% | 95% | Moderate evidence required (most common) |
| 1% | 99% | Strong evidence required |
Lower significance levels require stronger evidence to reject H0, reducing the risk of a Type I error but increasing the risk of a Type II error.
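The trade-off between the two error types can be illustrated with the earlier bottle-filling setup ($\mu_0 = 500$, $\sigma = 5$, $n = 25$, lower-tail test). The sketch below assumes a hypothetical true mean of 498 ml, which is not given in the lesson, and computes the probability of a Type II error at each significance level.

```python
from math import sqrt
from statistics import NormalDist

MU0, SIGMA, N = 500, 5, 25   # values from the bottle-filling example
TRUE_MEAN = 498              # hypothetical true mean, assumed for illustration

se = SIGMA / sqrt(N)
type_ii = {}
for alpha in (0.10, 0.05, 0.01):
    # H0 is rejected when the sample mean falls below this boundary.
    boundary = NormalDist(MU0, se).inv_cdf(alpha)
    # Type II error: the sample mean lands above the boundary although mu = TRUE_MEAN.
    beta = 1 - NormalDist(TRUE_MEAN, se).cdf(boundary)
    type_ii[alpha] = beta
    print(f"alpha = {alpha:.2f}: P(Type I) = {alpha:.2f}, P(Type II) = {beta:.3f}")
```

Under these assumptions the Type II probability rises as the significance level falls, which is the pattern described above.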
Exam Tip: The most common reason students lose marks in hypothesis testing questions is failing to give a contextual conclusion. After making your statistical decision, translate it into plain English: "There is sufficient evidence at the 5% significance level to suggest that the average weight of apples has decreased."
AQA 7357 specification, Paper 3 (Statistics), Section S: Statistical Hypothesis Testing (Year 2 content) covers conducting a statistical hypothesis test for the mean of a normal distribution with known, given or assumed variance and interpreting the results in context (refer to the official specification document for exact wording). It also covers: "Understand and apply the language of statistical hypothesis testing developed through a binomial model: null hypothesis, alternative hypothesis, significance level, test statistic, 1-tail test, 2-tail test, critical value, critical region, acceptance region, p-value; extend to correlation coefficients as measures of how close data points lie to a straight line and be able to interpret a given correlation coefficient using a given p-value or critical value (calculation of correlation coefficients is excluded)." Section S sits alongside Section O (probability), Section P (statistical distributions, including the normal distribution), and Section R (sampling), and is examined principally on Paper 3. Critical values for the standard normal distribution are provided in the AQA formulae and statistical tables booklet; critical values of the product-moment correlation coefficient are also tabulated.
Question (8 marks):
A machine fills bottles with mineral water. The volume $X$ ml dispensed per bottle is modelled as $X \sim N(\mu, 4^2)$, where the standard deviation $\sigma = 4$ ml is known from the manufacturer's calibration data. The machine is set so that $\mu = 500$ ml. After a maintenance visit, a quality engineer suspects the mean volume has changed. A random sample of $n = 25$ bottles gives a sample mean of $\bar{x} = 498.4$ ml.
Test, at the 5% significance level, whether the mean volume has changed.
Solution with mark scheme:
Step 1 — state hypotheses.
Let μ denote the population mean volume after maintenance. Then:
$H_0: \mu = 500, \quad H_1: \mu \neq 500$
B1 — both hypotheses correct, with μ defined as a population parameter (not sample mean) and H1 two-tailed (the engineer suspects a change in either direction).
Step 2 — state distribution of the sample mean under H0.
Under $H_0$, since $X \sim N(500, 16)$ with $\sigma$ known:
$\bar{X} \sim N\left(500, \frac{16}{25}\right) = N(500, 0.64)$
so the standard error is $\sigma/\sqrt{n} = 4/5 = 0.8$.
M1: quoting the sampling distribution of $\bar{X}$ with the correct variance $\sigma^2/n$. The single most common error is using $\sigma^2$ instead of $\sigma^2/n$; that costs the M1 and propagates a wrong z-value through the rest of the question.
Step 3 — compute the test statistic.
$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{498.4 - 500}{0.8} = \frac{-1.6}{0.8} = -2.00$
M1 — correct standardisation formula applied.
A1 — z=−2.00 to at least 2 d.p. Sign matters: an answer of +2.00 (forgetting which way round subtraction goes) loses this A1 and may invalidate the conclusion.
Step 4 — compare with critical value.
For a two-tailed test at the 5% level, the critical values are z=±1.96. The critical region is ∣z∣>1.96.
B1 — correct critical value ±1.96 for a two-tailed 5% test. Candidates who quote ±1.645 (the one-tailed 5% value) lose this mark.
Step 5 — make a decision.
Since ∣z∣=2.00>1.96, the test statistic falls inside the critical region. Reject H0.
M1 — explicit comparison of test statistic with critical value, stating decision in symbolic form.
Step 6 — conclude in context.
There is sufficient evidence at the 5% significance level to suggest that the population mean volume has changed from 500 ml after the maintenance visit. The sample evidence (mean 498.4 ml) is consistent with a slight under-fill.
A1 — conclusion stated in context of the original problem (mineral-water filling), referring to the population mean, with appropriate tentative language ("evidence to suggest", not "proves"). This is the most frequently lost mark on hypothesis-test questions.
Total: 8 marks (B1 M1 M1 A1 B1 M1 A1, with the second A1 carrying the contextual conclusion).
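The same decision can be reached through a p-value instead of a critical value. The sketch below is an alternative route for checking the answer, not part of the mark scheme above; variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Values from the question: sample mean 498.4, mu0 = 500, sigma = 4, n = 25.
sample_mean, mu0, sigma, n, alpha = 498.4, 500, 4, 25, 0.05

z = (sample_mean - mu0) / (sigma / sqrt(n))
# Two-tailed p-value: probability of a result at least this extreme in either tail.
p_value = 2 * NormalDist().cdf(-abs(z))

print(f"z = {z:.2f}, p-value = {p_value:.4f}")
print("Reject H0" if p_value < alpha else "Do not reject H0")
```

A p-value below the significance level corresponds to the test statistic lying in the critical region, so both methods give the same decision.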
Question (6 marks): A psychologist measures the product-moment correlation coefficient between hours of sleep and reaction-time score for a random sample of n=20 adults. The sample value is r=−0.524. The critical value of the PMCC at the 1% one-tailed significance level for n=20 is given in the table as 0.5155.
Test, at the 1% significance level, whether there is evidence of negative correlation between hours of sleep and reaction-time score in the underlying population. (6)
Mark scheme decomposition by AO:
Total: 6 marks split AO1 = 2, AO2 = 3, AO3 = 1. PMCC tests are unusual in that calculation of r is excluded from AQA — the entire question rests on interpretation, comparison, and contextual reasoning, which pushes AO2/AO3 weight unusually high.
Connects to:
Section P (the normal distribution): the Z-test for a mean is an application of the normal distribution to the sampling distribution of $\bar{X}$. Confidence in standardising $Z = (X - \mu)/\sigma$ extends directly to $Z = (\bar{X} - \mu)/(\sigma/\sqrt{n})$; the formula has the same form, with a different denominator.
Section R (statistical sampling and the sampling distribution of $\bar{X}$): the result $\bar{X} \sim N(\mu, \sigma^2/n)$ when $X$ is normal (or by the Central Limit Theorem when $n$ is large) is the engine of the Z-test. Without the CLT, the Z-test would only apply to genuinely normal populations.
Section S — Hypothesis test for a binomial proportion (Year 1): the Year 1 binomial test introduces the language of H0, H1, significance level, critical region, p-value. The Z-test re-uses every word, swapping the discrete binomial test statistic for the continuous Z.
Section O — Probability: the significance level α is P(reject H0∣H0 true) — a conditional probability. Misinterpreting α as P(H0 true) is the most common conceptual error and lies in elementary probability, not statistics.
PMCC / regression (Section T): the PMCC test connects hypothesis testing to bivariate data. Although calculation of r is excluded at A-level, its interpretation synthesises Section P (normality assumptions on each variable), Section S (hypothesis-testing language), and the broader framing of inference about populations.
Mechanics (Paper 2): quality-control problems in a manufacturing context (filling machines, packaging weights, component lengths) are ubiquitous in Paper 3 statistics questions but routinely reference physical contexts students meet in mechanics — units, tolerances, and engineering-meaningful conclusions matter.
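The Section O connection above, that the significance level is P(reject H0 given H0 true), can be illustrated by simulation. The sketch below repeatedly samples from a population for which $H_0$ really is true and counts how often a two-tailed 5% Z-test rejects it. The parameter values are borrowed from the mineral-water question; the seed and number of trials are arbitrary assumptions.

```python
import random
from math import sqrt
from statistics import NormalDist

random.seed(7)  # fixed seed so the sketch is repeatable

MU0, SIGMA, N, ALPHA = 500, 4, 25, 0.05
TRIALS = 10000

critical = NormalDist().inv_cdf(1 - ALPHA / 2)   # two-tailed critical value
se = SIGMA / sqrt(N)

rejections = 0
for _ in range(TRIALS):
    # Draw a sample from a population where H0 (mu = MU0) is true.
    sample = [random.gauss(MU0, SIGMA) for _ in range(N)]
    z = (sum(sample) / N - MU0) / se
    if abs(z) > critical:
        rejections += 1

rejection_rate = rejections / TRIALS
print(f"Proportion of true H0 rejected: {rejection_rate:.3f}")
```

The proportion should sit close to 0.05. It describes how often the procedure rejects a true $H_0$ and says nothing about the probability that $H_0$ itself is true.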
Hypothesis-test questions on AQA 7357 Paper 3 split AO marks more evenly than typical pure-mathematics questions:
| AO | Typical share | Earned by |
|---|---|---|
| AO1 (knowledge / procedure) | 35–45% | Stating hypotheses, computing test statistic, looking up critical value, making the formal comparison |
| AO2 (reasoning / interpretation) | 35–45% | Choosing one- vs two-tailed correctly, interpreting significance level, writing the conclusion in context, defending the decision |
| AO3 (problem-solving / modelling) | 10–25% | Critiquing modelling assumptions (was σ really known? is the population normal? was sampling random?), commenting on sample size adequacy |
Examiner-rewarded phrasing: "Let $\mu$ denote the population mean ..."; "Under $H_0$, $\bar{X} \sim N(\mu_0, \sigma^2/n)$"; "Since $|z| = 2.00 > 1.96$, the test statistic lies in the critical region"; "There is sufficient evidence at the 5% significance level to suggest ...". Phrases that lose marks: "the mean is 498.4" (confuses sample with population); "we accept $H_0$" (statisticians do not accept $H_0$; they fail to reject it); "the test proves ..." (statistical tests provide evidence, not proof); writing the conclusion only in symbolic terms with no contextual statement.
A specific AQA pattern to watch: questions phrased "test whether there is evidence that ..." with a direction in the stem (e.g. "evidence that the mean has increased") demand a one-tailed test. Questions phrased "test whether the mean has changed" demand a two-tailed test. Reading this distinction wrong inverts the critical value and usually inverts the decision — losing every single mark from Step 4 onwards.
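The effect of choosing the wrong number of tails can be seen by comparing the two critical values directly. In the sketch below the test statistic $z = -1.8$ is a hypothetical value chosen to fall between them, so the two readings of the question lead to opposite decisions.

```python
from statistics import NormalDist

alpha = 0.05
standard_normal = NormalDist()

one_tailed_cv = standard_normal.inv_cdf(1 - alpha)       # all of alpha in one tail
two_tailed_cv = standard_normal.inv_cdf(1 - alpha / 2)   # alpha split between both tails

z = -1.8  # hypothetical test statistic, chosen to fall between the two critical values

reject_one_tailed = z < -one_tailed_cv       # H1: mu < mu0
reject_two_tailed = abs(z) > two_tailed_cv   # H1: mu != mu0

print(f"one-tailed critical value: {one_tailed_cv:.4f}")
print(f"two-tailed critical value: {two_tailed_cv:.4f}")
print(f"one-tailed decision: {'reject H0' if reject_one_tailed else 'do not reject H0'}")
print(f"two-tailed decision: {'reject H0' if reject_two_tailed else 'do not reject H0'}")
```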
Question: State, with reasons, whether each of the following hypothesis-test setups uses a one-tailed or a two-tailed test:
(i) A teacher tests whether the mean exam score has changed since last year.
(ii) An engineer tests whether a new alloy has greater tensile strength than the standard alloy.
(iii) A pharmacist tests whether a drug's mean dissolution time differs from the manufacturer's stated value.
Grade C response (~150 words):
(i) Two-tailed because "changed" goes either way.
(ii) One-tailed because we want greater than.
(iii) Two-tailed because "differs" means up or down.
Examiner commentary: Full marks (3/3) for correct identifications. The reasoning is brief but the key trigger words ("changed", "greater", "differs") are correctly mapped to the right tails. This is what a 3-mark question of this kind looks like at Grade C — efficient, accurate, no decoration.
Grade A response (~190 words):
(i) Two-tailed. "Changed" admits both increase and decrease, so $H_1: \mu \neq \mu_0$. The critical region is split: $|z| > z_{\alpha/2}$.
(ii) One-tailed (upper). "Greater" specifies a single direction, so $H_1: \mu > \mu_0$. The full $\alpha$ is in the upper tail; the critical value is $z_{\alpha}$, not $z_{\alpha/2}$.
(iii) Two-tailed. "Differs" carries the same logic as "changed": the manufacturer's value could be low or high. $H_1: \mu \neq \mu_0$.