Problem solving in statistics at A-Level goes beyond applying formulas mechanically. It requires you to interpret results in context, choose appropriate methods, evaluate models, and communicate conclusions clearly. The AQA A-Level Mathematics specification (7357) assesses statistics across Paper 3 (Section B), and many questions test AO3 problem-solving skills through real-world scenarios.
One of the most important skills in A-Level statistics is interpreting your mathematical results in the context of the problem. Every hypothesis test conclusion, every correlation interpretation, and every probability calculation should be related back to the real-world scenario described in the question.
Scenario: A factory claims that the mean mass of its bags of flour is 1.5 kg. A random sample of 30 bags has a mean mass of 1.48 kg. The known population standard deviation is 0.06 kg. Test at the 5% significance level whether the mean mass is less than 1.5 kg.
Mathematical working:
H₀: μ = 1.5 H₁: μ < 1.5 (one-tailed test)
Under H₀: X̄ ~ N(1.5, 0.06²/30) = N(1.5, 0.00012)
z = (1.48 − 1.5) / √(0.00012) = −0.02 / 0.01095 = −1.826
Critical value at 5% one-tailed: z = −1.6449.
Since −1.826 < −1.6449, the test statistic is in the critical region.
Contextual conclusion: "There is sufficient evidence at the 5% significance level to reject H₀. The data suggests that the mean mass of bags of flour produced by this factory is less than 1.5 kg."
Exam Tip: AQA mark schemes always require conclusions to be stated in context. A generic statement like "reject H₀" will not earn full marks. You must refer to the specific scenario — in this case, the bags of flour.
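The working for the flour example can be checked numerically. The sketch below is a minimal illustration using Python's standard-library `statistics.NormalDist`; the function name `one_sample_z_lower_tail` is hypothetical and is not part of any AQA material.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_lower_tail(sample_mean, mu0, sigma, n, alpha):
    """Lower-tailed z-test for a mean with known population sigma.

    Returns (z, critical_value, reject_h0).
    """
    standard_error = sigma / sqrt(n)              # sd of the sample mean under H0
    z = (sample_mean - mu0) / standard_error      # test statistic
    critical_value = NormalDist().inv_cdf(alpha)  # lower-tail critical value
    return z, critical_value, z < critical_value


# Figures from the flour scenario above.
z, crit, reject = one_sample_z_lower_tail(1.48, 1.5, 0.06, 30, 0.05)
print(f"z = {z:.3f}, critical value = {crit:.4f}, reject H0: {reject}")
```

The code only reproduces the arithmetic; the contextual conclusion about the bags of flour still has to be written in words.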
At A-Level, you need to be able to select the correct statistical test or distribution. The following guide helps:
| Situation | Distribution/Test |
|---|---|
| Number of successes in n independent trials with constant probability p | Binomial: X ~ B(n, p) |
| Number of events in a fixed interval (rare, random, independent) | Poisson (not in base A-Level) |
| Continuous data with known parameters, or a large sample | Normal distribution |
| Testing a population proportion | Binomial test |
| Testing a population mean (large sample, known σ) | z-test using Normal distribution |
| Correlation between two variables | PMCC and hypothesis test for ρ |
Scenario: A company claims that 15% of its products are defective. A quality control inspector tests 20 products.
The appropriate model is X ~ B(20, 0.15), where X is the number of defective products.
Why? There are a fixed number of trials (n = 20), each trial has two outcomes (defective or not), the probability of being defective is constant (p = 0.15, assuming the claim is true), and trials are independent.
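Once the model X ~ B(20, 0.15) has been chosen, probabilities follow from the binomial probability function. The sketch below is one way to evaluate them with the standard library; the helper names `binomial_pmf` and `binomial_cdf` and the particular probabilities chosen are illustrative assumptions, not part of the scenario.

```python
from math import comb


def binomial_pmf(r, n, p):
    """P(X = r) for X ~ B(n, p)."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)


def binomial_cdf(r, n, p):
    """P(X <= r) for X ~ B(n, p)."""
    return sum(binomial_pmf(k, n, p) for k in range(r + 1))


n, p = 20, 0.15  # model from the scenario, assuming the company's claim holds

prob_none = binomial_pmf(0, n, p)        # P(X = 0): no defective products
prob_at_most_3 = binomial_cdf(3, n, p)   # P(X <= 3)
expected = n * p                         # mean number of defectives, np

print(f"P(X = 0)  = {prob_none:.4f}")
print(f"P(X <= 3) = {prob_at_most_3:.4f}")
print(f"E(X)      = {expected:.1f}")
```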
Scenario: The heights of adult women in a certain population are normally distributed with mean 162 cm and standard deviation 7 cm.
(a) Find the probability that a randomly selected woman is taller than 170 cm.
P(X > 170) = P(Z > (170 − 162)/7) = P(Z > 1.143) = 1 − Φ(1.143) = 1 − 0.8735 = 0.1265
(b) Contextual interpretation: "Approximately 12.65% of adult women in this population are taller than 170 cm, which is roughly 1 in 8."
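The probability in part (a) can be checked without tables. This is a minimal sketch using `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

heights = NormalDist(mu=162, sigma=7)   # model from the scenario, in cm

z = (170 - 162) / 7                     # standardised value, about 1.143
prob_taller = 1 - heights.cdf(170)      # P(X > 170)

print(f"z = {z:.3f}, P(X > 170) = {prob_taller:.4f}")
```

A software value may differ from the table-based answer in the fourth decimal place, because tables round z before the lookup.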
(c) Critique of the model: The normal distribution is a reasonable model for heights because:
However, the model assumes:
Scenario: A researcher collects data on the hours of revision and exam scores for 12 students. The product moment correlation coefficient (PMCC) is r = 0.73.
Test at the 5% significance level whether there is positive correlation between hours of revision and exam score.
H₀: ρ = 0 (no correlation) H₁: ρ > 0 (positive correlation)
From the table of critical values for n = 12 at 5% one-tailed: r_crit = 0.4973.
Since 0.73 > 0.4973, reject H₀.
Contextual conclusion: "There is sufficient evidence at the 5% significance level of a positive correlation between hours of revision and exam score for these students. As revision hours increase, exam scores tend to increase."
Important: Correlation does not imply causation. Even though the data shows a positive correlation, we cannot conclude that more revision causes higher exam scores. There may be confounding variables (e.g., student motivation, prior knowledge).
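The two steps of the correlation test, computing r and comparing it with the tabulated critical value, can be sketched as below. The scenario does not give the raw data, so the `hours` and `scores` lists are hypothetical values made up only to exercise the `pmcc` function; the helper names are also assumptions.

```python
from math import sqrt


def pmcc(xs, ys):
    """Product moment correlation coefficient r = Sxy / sqrt(Sxx * Syy)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)


def positive_correlation_test(r, r_crit):
    """One-tailed test of H0: rho = 0 against H1: rho > 0."""
    return r > r_crit   # True means reject H0


# Hypothetical data, made up purely to exercise pmcc(); NOT the researcher's data.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 60, 68]
print(f"r for the made-up data = {pmcc(hours, scores):.3f}")

# Values quoted in the scenario: r = 0.73, critical value 0.4973 for n = 12.
reject_h0 = positive_correlation_test(0.73, 0.4973)
print(f"reject H0: {reject_h0}")
```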
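The following structure applies to all hypothesis tests: state H₀ and H₁, calculate the test statistic, find the critical region or p-value, make the decision, and write a conclusion in context.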
Scenario: A coin is suspected of being biased towards heads. In 20 flips, 14 heads are observed. Test at the 5% significance level.
H₀: p = 0.5 (coin is fair) H₁: p > 0.5 (coin is biased towards heads)
Under H₀: X ~ B(20, 0.5).
P(X ≥ 14) = 1 − P(X ≤ 13) = 1 − 0.9423 = 0.0577.
Since 0.0577 > 0.05, do not reject H₀.
Contextual conclusion: "There is insufficient evidence at the 5% significance level to conclude that the coin is biased towards heads. The observed 14 heads in 20 flips could reasonably have occurred by chance with a fair coin."
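The tail probability for the coin example can be computed directly from the binomial probability function instead of cumulative tables. A minimal sketch, with the hypothetical helper name `binomial_upper_tail`:

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ B(n, p), summed directly from the probability function."""
    return sum(comb(n, r) * p**r * (1 - p) ** (n - r) for r in range(k, n + 1))


alpha = 0.05
p_value = binomial_upper_tail(14, 20, 0.5)   # P(X >= 14) under H0
reject_h0 = p_value < alpha

print(f"P(X >= 14) = {p_value:.4f}, reject H0: {reject_h0}")
```

Summing the upper tail directly avoids the common slip of using 1 − P(X ≤ 14) in place of 1 − P(X ≤ 13).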
AQA requires familiarity with the Large Data Set (LDS), which contains weather data from various UK and worldwide stations. In exams, you may be asked to:
Exam Tip: Familiarise yourself with the structure of the AQA Large Data Set before the exam. Know which variables are recorded, what units they use, and what common anomalies exist (e.g., "tr" for trace rainfall, "n/a" for missing data).
When assessing whether a normal distribution is a suitable model:
Exam Tip: AQA examiners frequently penalise students who give correct mathematical answers but fail to interpret them in context. After every calculation, ask yourself: "What does this mean for the bags of flour / the coin / the students' revision?" Write your answer in plain English, referring to the real-world scenario. This is the difference between a good answer and a full-marks answer.
AQA 7357 Paper 3, Section B (Statistics): problem-solving in statistics is the synoptic layer that sits over content sub-sections N (Statistical sampling), O (Data presentation and interpretation), P (Probability), Q (Statistical distributions: binomial and normal) and R (Statistical hypothesis testing). The exam-board phrasing emphasises four assessable behaviours: (i) identify an appropriate distribution from context, (ii) check whether modelling conditions are met, (iii) choose a suitable test or calculation, and (iv) evaluate the model in light of the data and any criticisms. These map onto AOs roughly as AO1 ≈ 30%, AO2 ≈ 30%, AO3 ≈ 40% for Section B modelling questions; the AO3 share is the highest of any topic in the entire 7357 specification.
The AQA formula booklet lists the binomial probability $P(X=r)=\binom{n}{r}p^{r}(1-p)^{n-r}$ and the standardising relation $Z=(X-\mu)/\sigma$, but does not list the conditions for binomial validity, the normal-approximation rule of thumb, or the structure of a hypothesis-test conclusion. Those must be memorised in the candidate's own words.
Question (8 marks): A regional charity claims that 35% of households in its catchment area donate to it at least once a year. A new fundraising manager surveys a random sample of n=200 households and finds that 84 have donated.
(a) State the conditions under which X, the number of donating households in a sample of size n, can be modelled as X∼B(n,p). (2)
(b) Using a suitable normal approximation, test at the 5% significance level whether the proportion of donating households is greater than the charity's claim. (6)
Solution with mark scheme:
(a) B1 (AO1) — independence: each household's decision to donate is independent of every other household's decision. B1 (AO2) — fixed parameter p: the probability of donating is the same for each household sampled. (Implicit additional conditions — fixed n and two outcomes per trial — are part of the binomial set-up and not always required to be stated; AQA accepts any two of the four standard conditions.)
(b) Step 1 — set up hypotheses. Let p be the true population proportion of donating households.
$H_0: p = 0.35 \qquad H_1: p > 0.35$
B1 (AO1) — correct hypotheses with p defined in context. A common loss is writing $H_0: \bar{x} = 0.35$; the parameter is the population proportion, not a sample mean.
Step 2 — model under $H_0$. If $H_0$ is true, then $X \sim B(200, 0.35)$ with $\mu = np = 70$ and $\sigma^2 = np(1-p) = 200 \times 0.35 \times 0.65 = 45.5$, so $\sigma = \sqrt{45.5} \approx 6.745$.
Check the approximation conditions: np=70≥5 and n(1−p)=130≥5, so the normal approximation X≈N(70,45.5) is appropriate.
M1 (AO3) — explicitly checking the approximation conditions. Skipping this check is the single biggest mark-loser on Section B modelling questions.
Step 3 — apply continuity correction and standardise.
$P(X \ge 84) \approx P\left(Z \ge \dfrac{83.5 - 70}{\sqrt{45.5}}\right) = P(Z \ge 2.001)$
M1 (AO1) — correct continuity correction (83.5, not 84 or 84.5, because we want X≥84). M1 (AO1) — correct standardisation.
Step 4 — read tail probability. From standard normal tables, P(Z≥2.00)≈1−0.9772=0.0228.
Step 5 — compare with significance level and conclude in context.
Since 0.0228<0.05, reject H0. There is sufficient evidence at the 5% level to support the manager's belief that the proportion of donating households is greater than the charity's claim of 35%.
A1 (AO2) — correct comparison and rejection decision. A1 (AO3) — conclusion phrased in context of donating households, not in pure statistical language.
Total: 8 marks (B3 M3 A2).
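The steps of part (b) can be sketched in a few lines. This is an illustrative check of the arithmetic under the normal approximation, not a substitute for the written solution; the variable names are assumptions.

```python
from math import sqrt
from statistics import NormalDist

n, p0, observed = 200, 0.35, 84          # figures from the question

mean = n * p0                            # np = 70
variance = n * p0 * (1 - p0)             # np(1 - p) = 45.5

# Rule-of-thumb check used in the solution above.
conditions_ok = n * p0 >= 5 and n * (1 - p0) >= 5

# Continuity correction: P(X >= 84) becomes P(Y >= 83.5) for the normal Y.
z = (observed - 0.5 - mean) / sqrt(variance)
p_value = 1 - NormalDist().cdf(z)

print(f"conditions met: {conditions_ok}")
print(f"z = {z:.3f}, p-value = {p_value:.4f}, reject H0: {p_value < 0.05}")
```

Writing the correction as `observed - 0.5` makes the direction explicit: for an upper tail that includes 84, the boundary moves down to 83.5.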
Question (6 marks, AO3-heavy): A weather modeller claims that the daily maximum April temperature in a particular town can be modelled as $T \sim N(14, 9)$ (in °C, with variance 9). Over 30 randomly chosen April days, the sample mean was $\bar{t} = 15.4$ °C.
(a) Discuss whether the normal distribution is a reasonable model for T, citing two features the modeller should check. (2)
(b) Treating the population variance as known, test at the 5% level whether the mean April temperature differs from 14 °C. State your conclusion in context. (4)
Mark scheme decomposition by AO:
(a)
(b)
Total: 6 marks split AO1 = 2, AO2 = 1, AO3 = 3. This question is unusual for AQA in how heavily it leans on AO3: the modelling critique in (a) and the contextual conclusion in (b) are both reasoning-heavy. Candidates who treat this as a "plug-and-chug" calculation routinely score 3/6.
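The arithmetic for part (b) of the temperature question can be sketched as follows, assuming a two-tailed z-test of H₀: μ = 14 against H₁: μ ≠ 14 (the question asks whether the mean "differs") with the variance of 9 treated as known. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0, variance = 14, 9        # claimed model T ~ N(14, 9)
n, sample_mean = 30, 15.4    # sample figures from the question
alpha = 0.05

standard_error = sqrt(variance / n)               # sd of the sample mean under H0
z = (sample_mean - mu0) / standard_error          # test statistic

# Two-tailed test: alpha is split equally between the two tails.
critical_value = NormalDist().inv_cdf(1 - alpha / 2)
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
reject_h0 = abs(z) > critical_value

print(f"z = {z:.3f}, critical values = +/-{critical_value:.3f}")
print(f"two-tailed p-value = {p_value:.4f}, reject H0: {reject_h0}")
```

On these figures the test statistic lies beyond the upper critical value, so the calculation points towards rejecting H₀. The marks for the conclusion still depend on stating, in context, that the mean April maximum temperature in the town appears to differ from 14 °C.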
Connects to:
Sub-section N — Statistical sampling. Every Section B problem assumes a "random sample". Critiquing the sampling method (cluster vs simple random vs stratified) is fair AO3 territory: a survey of households visited only on weekday mornings is biased toward households with at-home members, undermining independence even before any binomial conditions are checked.
Sub-section Q — Binomial and normal distributions. The "identify the distribution" step is the synoptic spine of Section B. Counting successes in fixed-size samples → binomial; continuous measurements with symmetric variation → normal; large-n binomial → normal approximation N(np, np(1−p)). The rule of thumb for switching from the binomial to its normal approximation is np ≥ 5 and n(1−p) ≥ 5.
Sub-section R — Hypothesis testing. The structure H0, H1, test statistic, critical region/p-value, decision, contextual conclusion is invariant across binomial, normal-mean, and (Year 2) correlation tests. Internalising the structure once pays dividends three times.
Sub-section R — Correlation hypothesis test. When data come as (x,y) pairs, the modelling question shifts from "is the mean different?" to "is there a linear association?". The AQA test uses Pearson's product-moment correlation r with $H_0: \rho = 0$ versus $H_1: \rho \ne 0$ (or a one-tailed alternative). Same scaffold, different distribution under $H_0$.
Sub-section O — Data presentation. Box plots, histograms and scatter graphs give visual evidence for distributional assumptions: heavy skew in a histogram is direct evidence against a normal model; a clear outlier in a scatter graph warns against fitting a Pearson correlation without a robustness comment.
Section B problem-solving questions split AO marks distinctively:
| AO | Typical share | Earned by |
|---|---|---|
| AO1 (knowledge / procedure) | 25–35% | Correct distribution formula, correct standardisation, correct table lookup, correct hypotheses |
| AO2 (reasoning / interpretation) | 25–35% | Choosing the appropriate distribution, justifying the approximation, comparing with critical value |
| AO3 (problem-solving / modelling) | 35–45% | Checking conditions, evaluating model fit, criticising sampling, framing conclusion in context |