This lesson covers the key concepts of statistical sampling as required by the A-Level Mathematics specification. Sampling is the process of selecting a subset of individuals from a population in order to make inferences about the whole. Understanding different sampling methods and their implications is essential for statistical analysis.
Spec Mapping — AQA 7357 Section K Statistical sampling. This lesson covers the populations-and-samples, sampling-methods and bias-criticism content of Section K. Refer to the official AQA specification document for exact wording.
A population is the entire set of items or individuals that are of interest in a statistical investigation. A census collects data from every member of the population, while a sample collects data from a subset.
| Term | Definition |
|---|---|
| Population | The whole set of items that are of interest |
| Sample | A subset of the population selected for study |
| Census | A survey that collects data from every member of the population |
| Sampling frame | A list of all members of the population from which a sample can be drawn |
| Sampling unit | Each individual member of the population that can be sampled |
| Method | Advantages | Disadvantages |
|---|---|---|
| Census | Should give a completely accurate result, with no sampling error | Time-consuming, expensive, impractical for large populations |
| Sample | Cheaper, quicker, feasible for large populations | May not be representative, subject to sampling error |
Exam Tip: When asked to compare a census and a sample, always give at least one advantage and one disadvantage of each. Explain why a sample is more practical for large populations.
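In a simple random sample of size n, every possible sample of size n has an equal chance of being selected, so every member of the population is equally likely to be included. This is achieved by assigning each member of the sampling frame a unique number and using a random number generator or lottery method to pick the numbers.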
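Steps: (1) number every member of the sampling frame from 1 to N; (2) use a random number generator (or draw lots) to produce n distinct numbers, ignoring repeats; (3) select the members whose numbers come up.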
Advantages: Free from bias, easy to implement with a suitable sampling frame. Disadvantages: Requires a complete sampling frame, may not be practical for very large populations.
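The numbering-and-random-draw procedure can be sketched in a few lines of Python. This is a minimal illustration only: the frame of 500 numbered members, the function name `simple_random_sample` and the seed are assumptions made for the example, not part of the specification.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct members from the sampling frame without replacement."""
    rng = random.Random(seed)      # seeded only so the sketch is repeatable
    return rng.sample(frame, n)    # every subset of size n is equally likely

# Hypothetical sampling frame: members numbered 1 to 500.
frame = list(range(1, 501))
sample = simple_random_sample(frame, 50, seed=1)
print(sorted(sample))
```

In an exam the same idea is expressed in words: number the frame, generate random numbers, ignore repeats, and select the matching members.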
Select every k-th item from the sampling frame, where $k = \dfrac{\text{population size}}{\text{sample size}}$.
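Steps: (1) calculate the interval k; (2) choose a random starting point between 1 and k; (3) select that member and every k-th member after it until the sample is complete.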
Example: For a population of 500 and sample size of 50, $k = \frac{500}{50} = 10$. If the random start is 7, select the 7th, 17th, 27th, ... members.
Advantages: Simple to use, evenly spread across the population. Disadvantages: Can introduce bias if there is a periodic pattern in the data.
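The systematic procedure can be sketched as follows, reproducing the example of a population of 500, a sample of 50 and a random start of 7. The function name `systematic_sample` and the numbered frame are hypothetical, and the sketch assumes the population size is an exact multiple of the sample size.

```python
import random

def systematic_sample(frame, n, start=None, seed=None):
    """Select every k-th member of an ordered frame from a start in 1..k."""
    k = len(frame) // n                      # sampling interval
    if start is None:
        start = random.Random(seed).randint(1, k)   # random start, inclusive
    positions = range(start, start + k * n, k)      # start, start+k, ...
    return [frame[p - 1] for p in positions]        # positions are 1-based

# Hypothetical ordered frame of 500 members, as in the example above.
frame = list(range(1, 501))
picked = systematic_sample(frame, 50, start=7)
print(picked[:3], picked[-1], len(picked))
```

With `start=7` the first selections are the 7th, 17th and 27th members, matching the worked example; leaving `start` unset draws the starting point at random.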
The population is divided into distinct strata (groups), and a proportional random sample is taken from each stratum.
Formula for the number from each stratum:
$\text{Number from stratum} = \dfrac{\text{stratum size}}{\text{population size}} \times \text{total sample size}$
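Example: A school has 600 students in Year 12 and 400 in Year 13, so the population size is 1000. For a sample of 50, take $\frac{600}{1000} \times 50 = 30$ students from Year 12 and $\frac{400}{1000} \times 50 = 20$ from Year 13, giving $30 + 20 = 50$ in total.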
Advantages: Guarantees proportional representation of each group. Disadvantages: Requires knowledge of the population structure, the strata must be clearly defined.
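A stratified sample combines the proportional-allocation formula with a simple random sample inside each stratum. The sketch below uses the school example (600 in Year 12, 400 in Year 13, sample of 50); the student labels and the function name `stratified_sample` are invented for illustration, and the plain `round` call is only safe here because the shares are whole numbers.

```python
import random

def stratified_sample(strata, total_n, seed=None):
    """Proportional allocation, then simple random sampling in each stratum."""
    rng = random.Random(seed)
    population_size = sum(len(members) for members in strata.values())
    chosen = {}
    for name, members in strata.items():
        share = len(members) * total_n / population_size   # formula above
        chosen[name] = rng.sample(members, round(share))   # SRS within stratum
    return chosen

# Hypothetical frames for each year group.
strata = {
    "Year 12": [f"Y12-{i}" for i in range(1, 601)],
    "Year 13": [f"Y13-{i}" for i in range(1, 401)],
}
chosen = stratified_sample(strata, 50, seed=0)
print({name: len(members) for name, members in chosen.items()})
```

The printed counts should be 30 for Year 12 and 20 for Year 13, in line with the proportional formula.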
The interviewer selects a specified number of individuals from each group, but the choice of individuals is not random.
Advantages: Quick and cheap, no sampling frame required. Disadvantages: Prone to bias as the interviewer chooses who to include.
The sample is taken from those who are available at the time of the study.
Advantages: Easy to carry out. Disadvantages: Unlikely to be representative, highly prone to bias.
This topic connects to:
aqa-alevel-maths-large-data-set / data-cleaning-and-preparation: every sample drawn from the AQA Large Data Set must be cleaned and screened for missing values before any inference is attempted.
aqa-alevel-maths-statistics / hypothesis-testing: the validity of every p-value rests on the sample being simple-random; sampling criticism is the AO3 layer of every hypothesis-test conclusion.
aqa-alevel-maths-problem-solving / mathematical-modelling: sampling-design critique is a transferable modelling skill assessed across Papers 1, 2 and 3.
Exam Tip: Be prepared to calculate the number of items to sample from each stratum in stratified sampling. Show your working clearly; you will often need to round your answer and explain any adjustments.
AQA 7357 specification, Paper 3 - Statistics, Section K - Statistical sampling: "Understand and use the terms 'population' and 'sample'. Use samples to make informal inferences about the population. Understand and use sampling techniques, including simple random sampling and opportunity sampling. Select or critique sampling techniques in the context of solving a statistical problem, including understanding that different samples can lead to different conclusions about the population." Section K is short on bullet points but heavy on examined applications: every Paper 3 carries at least one sampling-criticism question, often worth 4–6 marks, and the AQA Large Data Set is the standard vehicle. Sampling links forward into Section P (Data presentation and interpretation), Section R (Statistical distributions) and Section S (Statistical hypothesis testing): a poor sample invalidates every later inference, so AQA examines sampling explicitly to see whether candidates can criticise designs, not merely execute them. The AQA formula booklet does not list sampling formulae, so proportional-allocation calculations and definitions must be carried in the head.
Note: this question is constructed to model AQA Paper 1/2/3 style; it is not a reproduction of any published past paper.
Question (8 marks):
A college has 1,200 students enrolled across four faculties: Arts (300 students), Sciences (450 students), Humanities (240 students) and Vocational (210 students). The principal wants a sample of 80 students to take part in a survey on study habits.
(a) Identify which sampling method would be most appropriate, justifying your choice with reference to the structure of the population. (3)
(b) Using your chosen method, calculate the number of students sampled from each faculty, showing that the total is 80. (3)
(c) Explain one limitation of your chosen method in this context. (2)
Solution with mark scheme:
(a) Step 1 — identify the population structure.
The population is divided into four faculties of unequal size (300, 450, 240, 210). Faculties are likely to differ systematically in study habits — Sciences students may report different patterns from Arts students. Each faculty therefore forms a meaningful stratum.
M1 (AO3.1a) — recognising that the population has natural strata of unequal size and that the strata are likely to differ on the variable of interest.
Step 2 — name the method and justify.
Stratified sampling is most appropriate. By taking a simple random sample from each faculty in proportion to its size, the sample mirrors the population structure, so no faculty can be over- or under-represented by chance as it could be under simple random sampling from the whole list.
A1 (AO2.4) — naming "stratified sampling" explicitly.
A1 (AO3.1b) — justifying with reference to both (i) unequal stratum sizes and (ii) likely between-stratum variation in the response variable. Saying "stratified is best because it's representative" without naming the strata loses this mark.
(b) Step 1 — compute the sampling fraction.
The overall fraction is $\frac{80}{1200} = \frac{1}{15}$.
M1 (AO1.1a) — correct sampling fraction.
Step 2 — apply proportional allocation to each stratum.
| Faculty | Size | Sample size |
|---|---|---|
| Arts | 300 | $300 \times \frac{1}{15} = 20$ |
| Sciences | 450 | $450 \times \frac{1}{15} = 30$ |
| Humanities | 240 | $240 \times \frac{1}{15} = 16$ |
| Vocational | 210 | $210 \times \frac{1}{15} = 14$ |
A1 (AO1.1b) — all four sample sizes correct as integers.
Step 3 — verify the total.
20+30+16+14=80, as required.
A1 (AO2.5) — verification stated explicitly. The "showing that" command requires this final additive check; omitting it is a presentation slip that costs the mark on stricter mark schemes.
(c) Step 1 — identify a context-specific limitation.
Within each faculty, students must still be selected — typically by simple random sampling from a faculty list. If the only available list is incomplete (e.g. excludes part-time students), the sample within each stratum will not be truly random, reintroducing bias even though the proportional allocation is correct.
B1 (AO3.5a) — naming a concrete limitation tied to the context (incomplete sampling frames within strata, or non-response varying by faculty, or strata being too coarse — for instance Sciences combining Biology and Physics with potentially different study habits).
B1 (AO3.5b) — explaining how the limitation produces bias, not merely asserting that it does.
Total: 8 marks (M2 A4 B2). AO split: AO1 = 2, AO2 = 2, AO3 = 4.
Question (6 marks, AO3-heavy modelling): A market-research company is contracted to estimate the proportion of UK adults aged 18–65 who use a particular streaming service. The contract specifies a sample of 1,000 adults. The researcher proposes standing outside a major railway station in central London on a Tuesday morning and approaching the first 1,000 adults who appear willing to talk.
(a) Identify the sampling method proposed. (1)
(b) Critique this design, identifying three distinct sources of bias and explaining the direction in which each would distort the estimated proportion. (5)
Mark scheme decomposition by AO:
(a)
(b) Three of the following, each worth up to 2 marks (1 for identification, 1 for direction of bias). Cap at 5 marks total — typically B2 + B2 + B1, or B2 + B1 + B1 + the structural mark.
AO split: AO1 = 1, AO2 = 1, AO3 = 4. This is an unusually AO3-heavy question — AQA reserves these for Paper 3 sampling-criticism items because the modelling skill (linking design choices to inferential consequences) cannot be tested any other way.
Connects to:
Section P — Data presentation and interpretation: the AQA Large Data Set (weather data from UK and overseas Met Office stations) is the standard vehicle for examining sampling. Paper 3 questions routinely ask candidates to take a stratified sample by location-and-month, calculate summary statistics, and compare them with population values calculated from the full data set. Familiarity with the structure of the Large Data Set — which stations, which months, which variables — is therefore tested through sampling.
Section Q — Probability: simple random sampling is the operational definition that makes the binomial and Poisson distributions valid models. X∼B(n,p) assumes n independent trials with constant probability p — exactly the structure of sampling with replacement (or sampling without replacement from a population large enough that p is approximately constant). Critiquing a sampling method is, equivalently, critiquing whether the binomial assumptions hold.
Section R - Statistical distributions: the normal distribution arises in sampling theory through the Central Limit Theorem: regardless of the population distribution, the sample mean $\bar{X}$ of a simple random sample of size $n$ is approximately distributed as $N(\mu, \sigma^2/n)$ for large $n$, where $\mu$ and $\sigma^2$ are the population mean and variance. This justification is not on the A-Level specification, but it underpins inference about sample means, including the hypothesis tests on a mean met on Paper 3.
Section S — Statistical hypothesis testing: every hypothesis test on Paper 3 begins "a random sample of size n is taken …". The validity of the resulting p-value rests entirely on the sample being random. AQA examiners reward candidates who, when concluding a test, add a sentence like "this conclusion assumes the sample was simple random; if the sample was opportunistic, the test is invalid". This synoptic awareness is exactly the AO3 mark.
Section T — Statistical inference (correlation): Pearson's correlation coefficient r and any associated significance test require a bivariate random sample. A stratified sample by region is not bivariate-random for the purposes of a correlation test between, say, mean temperature and mean rainfall — strata can induce spurious correlation through Simpson's paradox. This connection is rarely tested directly but appears in the harder Paper 3 modelling questions.
Sampling questions on AQA 7357 split AO marks heavily toward AO2 and AO3:
| AO | Typical share | Earned by |
|---|---|---|
| AO1 (knowledge) | 20–30% | Naming the method correctly; computing proportional allocation arithmetic; defining "population" and "sample" precisely |
| AO2 (reasoning) | 25–35% | Justifying choice of method against the population structure; explaining why a method is or is not representative; presenting allocation tables clearly |
| AO3 (problem-solving / modelling) | 40–55% | Critiquing real-world sampling designs; identifying multiple distinct bias sources; explaining the direction of bias on the parameter estimate; recommending design improvements |
Examiner-rewarded phrasing: "stratified by [variable] because [variable] is correlated with [response]"; "this would over-estimate p because the excluded group has lower usage"; "the sampling frame excludes X, so the population about which we can infer is restricted to Y"; "non-response is likely to be informative because [reason]". Phrases that lose marks: "the sample is biased" (without naming the source or direction); "this is a fair sample because it's random" (random does not imply representative without further justification); "stratified sampling is always best" (the question is which method is best in this context).
A specific AQA pattern to watch: questions phrased "critique" or "comment on" expect multiple distinct points, not one elaborated point. A 5-mark critique question typically requires three separately identified issues; expanding a single bias source over five lines earns only the first mark.
Question: Define what is meant by a "simple random sample" of size n from a population of size N.
Mid-band response (~150 words):
A simple random sample is a sample where everyone has the same chance of being picked. You pick n people from N randomly, for example by giving each person a number and using a random number generator to choose which numbers to include. This means the sample should be fair and not biased.
Examiner commentary: 2 out of 3. The candidate captures the equal-probability idea informally and gives a workable mechanism. What is missing is the second defining property: every sample of size n is equally likely to be selected, not merely every individual. A sample where the first person is chosen randomly and the next n−1 are their friends gives every individual a positive (perhaps even equal) chance, but is patently not simple random. The B mark for the second property is the differentiator between bands.
Top-band response (~190 words):
A simple random sample of size n from a population of size N is a sample selected so that every possible subset of n distinct individuals from the population has an equal probability of being chosen, namely $1/\binom{N}{n}$. Equivalently, and more usefully in practice, every individual in the population has the same probability $n/N$ of being included in the sample, and selections are made independently (or, for sampling without replacement, conditionally so that the equal-subset property holds). A standard implementation is to assign each member of the sampling frame a unique integer 1,2,…,N and use a random number generator to draw n distinct integers without replacement.
Examiner commentary: Full marks (3/3). Both defining properties are stated (equal probability of individuals and of subsets) and an operational mechanism is given. The use of $\binom{N}{n}$ signals technical fluency, though it is not required for full marks. The phrase "sampling frame" is precisely the kind of vocabulary AQA mark schemes credit.
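The difference between "every individual equally likely" and "every subset equally likely" can be checked by brute force on a tiny population. The sketch below assumes a hypothetical population of N = 6 and sample size n = 3, and compares simple random sampling with a made-up two-block design in which one of two fixed groups is chosen with probability 1/2.

```python
from itertools import combinations
from math import comb

N, n = 6, 3                          # hypothetical small population and sample
population = range(1, N + 1)

# Simple random sampling: all C(N, n) subsets are equally likely.
subsets = list(combinations(population, n))
srs_inclusion = {
    member: sum(member in s for s in subsets) / len(subsets)
    for member in population
}

# A design that is NOT simple random: pick one of two fixed blocks,
# each with probability 1/2. Only 2 of the possible subsets can occur.
blocks = [(1, 2, 3), (4, 5, 6)]
block_inclusion = {
    member: sum(member in b for b in blocks) / len(blocks)
    for member in population
}

print(len(subsets), comb(N, n))
print(srs_inclusion[1], block_inclusion[1], n / N)
```

Both designs give every individual the inclusion probability n/N, yet only the first makes every subset equally likely, which is the property the mid-band response misses.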
Question: A factory produces 10,000 components per day on three production lines (Line A: 5,000; Line B: 3,000; Line C: 2,000). A quality-control inspector wishes to test 100 components per day for defects.
(a) Describe how a stratified sample by production line could be obtained. (3)
(b) Explain one advantage and one disadvantage of stratified sampling compared with simple random sampling in this context. (3)
Stronger response (~260 words):
(a) Take $\frac{100}{10000} = \frac{1}{100}$ of each line. So 50 from Line A, 30 from Line B, 20 from Line C. Then pick the components randomly from each line.
(b) Advantage: you definitely get components from every line, so you can compare them. Disadvantage: it takes longer to organise because you have to sample each line separately.
Examiner commentary: (a) gets 2/3 — the proportional allocation is correct but the candidate doesn't specify how the random selection within each stratum is performed. The mark scheme requires "use simple random sampling within each stratum, e.g. by random number generator on a numbered list". (b) gets 2/3 — the advantage is correct but underdeveloped (better: "reduces variance of the estimate when between-line defect rates differ"); the disadvantage is operational rather than statistical (a stronger answer: "if the strata are not actually different in defect rate, stratification gives no statistical gain over SRS"). Total: 4/6.
Top-band response (~310 words):
(a) The sampling fraction is $f = 100/10000 = 0.01$. Apply proportional allocation:
Line A: $5000 \times 0.01 = 50$
Line B: $3000 \times 0.01 = 30$
Line C: $2000 \times 0.01 = 20$
Total: $50 + 30 + 20 = 100$ as required. Within each line, list the day's output sequentially $1, 2, \ldots, N_i$ and use a random-number generator to select the required number of distinct labels without replacement. This realises simple random sampling within each stratum.
(b) Advantage: if defect rates differ systematically by production line (e.g. Line C uses an older machine and produces more defects), stratified sampling guarantees representation from every line in the correct proportion, reducing the variance of the overall estimated defect rate compared with simple random sampling. With SRS there is a small but non-zero probability that, by chance, no Line C components are selected, which would entirely miss any line-specific defect.
Disadvantage: stratified sampling requires an accurate sampling frame for each stratum — the inspector must know in advance which components came from which line. If components are pooled before inspection, the strata cannot be reconstructed and stratified sampling is operationally impossible. SRS, requiring only a single overall list, is more robust to messy data.
Examiner commentary: Full marks (6/6). The allocation is shown with explicit verification; the within-stratum mechanism is named; both advantage and disadvantage are statistical (variance, sampling-frame requirement) rather than vague. The "small but non-zero probability that no Line C components are selected" sentence is the AO3 mark — converting an abstract claim into a probabilistic consequence.
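The "small but non-zero probability" in the top-band answer can be quantified. Under simple random sampling without replacement, the chance that none of the 100 sampled components comes from a given line is a ratio of binomial coefficients. The sketch below is an illustration under that model only; the function name `prob_line_missed` is hypothetical.

```python
from math import comb

LINE_SIZES = {"A": 5_000, "B": 3_000, "C": 2_000}
SAMPLE_SIZE = 100
total = sum(LINE_SIZES.values())

def prob_line_missed(line):
    """P(an SRS of SAMPLE_SIZE from all lines contains nothing from `line`)."""
    outside = total - LINE_SIZES[line]          # components not on this line
    return comb(outside, SAMPLE_SIZE) / comb(total, SAMPLE_SIZE)

p_miss_c = prob_line_missed("C")
print(f"P(no Line C components) = {p_miss_c:.2e}")
```

For these figures the probability is of the order of $10^{-10}$, so the risk is genuinely tiny here; the argument carries more weight when the sample or the stratum is small.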
Question: A researcher wants to estimate the mean daily maximum temperature for August at five UK weather stations from the AQA Large Data Set: Camborne, Heathrow, Hurn, Leeming and Leuchars. The full data set contains daily readings for August at each station.
(a) The researcher proposes simple random sampling: select 50 daily-station observations at random from the combined pool of 5×31=155 daily observations. Identify two weaknesses of this design relative to the research question. (3)
(b) Propose an alternative sampling design that addresses these weaknesses, giving the sample sizes per station and per week of August. (4)
(c) State one assumption your proposed design relies on, and how you would check it. (2)
Top-band response (~380 words):
(a) Weakness 1: Simple random sampling from the pooled 155 observations gives variable per-station representation. By chance, one station could be sampled 5 times and another 15 times, producing an unequal-precision estimate per station that the research question (which compares across stations) cannot tolerate.
Weakness 2: SRS ignores the within-month structure. August temperatures rise and fall over the month — sampling 50 observations at random risks clustering them in one part of August, which would bias the monthly-mean estimate if the random draw happened to favour heatwave dates.
(b) Use two-stage stratification: first by station, then by week.
Allocate equally across stations: 50/5=10 observations per station. Within each station, stratify by week of August. August has 31 days, which split into four full weeks (days 1–7, 8–14, 15–21 and 22–28) plus a short block of three spare days (29–31). Sample 2 days from each of the four weeks (2×4=8) plus 2 from the spare days, giving 10 per station.
Total: 10×5=50 observations. Each station is equally represented; within each station, each week of August is equally represented.
(c) Assumption: the within-week variation in daily maximum temperature is small relative to the between-week variation. If this holds, sampling 2 days per week captures the week's typical temperature accurately, and the across-week sample captures the monthly profile.
Check: compute the variance of daily maximum temperature within each week of August across all five stations, and compare to the variance between weekly means. If the within-week variance is much smaller than the between-week variance (say, by a factor of 4 or more), the assumption is supported. If they are comparable, the design wastes precision and a finer sub-stratification is warranted.
Examiner commentary: Full marks (9/9). The candidate identifies two genuinely distinct weaknesses (per-station imbalance, within-month structure), proposes a coherent two-stage design with arithmetic that totals to 50, and states a checkable assumption with an actual check procedure. The phrase "much smaller … by a factor of 4 or more" converts an abstract criterion into an operational one — the AO3 marker.
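The two-stage design in part (b) can be sketched as a sampling plan generator. This is one possible reading of the answer, treating days 29 to 31 as a fifth short block; the function name `two_stage_plan` and the seed are assumptions, and no Large Data Set values are used.

```python
import random

STATIONS = ["Camborne", "Heathrow", "Hurn", "Leeming", "Leuchars"]
# Day blocks within August: four full weeks plus the three spare days.
BLOCKS = [range(1, 8), range(8, 15), range(15, 22), range(22, 29), range(29, 32)]

def two_stage_plan(days_per_block=2, seed=None):
    """Stage 1: every station is included. Stage 2: random days per block."""
    rng = random.Random(seed)
    plan = {}
    for station in STATIONS:
        days = []
        for block in BLOCKS:
            days.extend(rng.sample(block, days_per_block))  # SRS within block
        plan[station] = sorted(days)
    return plan

plan = two_stage_plan(seed=3)
for station, days in plan.items():
    print(station, days)
print("total observations:", sum(len(days) for days in plan.values()))
```

Each station contributes 10 days and the plan totals 50 observations, matching the arithmetic in the response.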
The errors that distinguish A from A* on sampling questions:
Confusing stratified with systematic. Stratified sampling divides the population into groups and samples from each; systematic sampling picks every kth element from an ordered list. A candidate who writes "I'll stratify by taking every 10th student" has conflated the two methods and forfeits the AO1 mark immediately.
Treating non-response as random. When 30% of contacted respondents decline, the remaining 70% are not a random sample of the original sample — they are a self-selected sub-sample whose response patterns may differ systematically. Saying "the sample size is now 700, so I'll proceed" without flagging non-response bias misses the AO3 critique mark every time.
Sample size vs sample method. A larger sample does not fix a biased method. A sample of 10,000 from a self-selecting online panel is more biased, not less, than a random sample of 1,000 — the bias is systematic, not random, so increasing n tightens the confidence interval around the wrong value. AQA examiners deliberately probe this with questions like "Will doubling the sample size remove the bias? Justify." The answer is no.
Quota sampling treated as stratified. Quota sampling fixes the number of respondents in each demographic category (e.g. 50 men, 50 women) but selects non-randomly within each quota. Stratified sampling fixes the proportion and selects randomly within each stratum. They are not the same — quota sampling lacks the probabilistic foundation that justifies inference.
Cluster vs stratified confusion. Stratified sampling samples within every group (every faculty contributes); cluster sampling samples whole groups (some faculties contribute, others contribute nothing). The two are almost opposites in design philosophy. A candidate who calls a cluster sample "stratified" forfeits both the identification mark and any subsequent justification.
"Random" used loosely. "I picked them randomly" without a mechanism (random-number generator, random-digit table, lottery draw) is opportunity sampling, not random sampling. Examiners require a named randomisation mechanism for the design to count as random.
Ignoring the sampling frame. Every random sampling method (simple random, systematic, stratified) requires a list of the population: the sampling frame. If the frame is incomplete (excludes certain types), even a perfectly executed simple random sample is biased, because the population being inferred about is the frame, not the true population. The frame–population gap is the most under-discussed bias source on student answers.
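The "sample size vs sample method" point above can be illustrated with a small simulation. Everything in this sketch is made up for illustration: a population of 100,000 in which 30% use a service overall, and a self-selecting panel (20% of the population) whose members use it at 60%. It is not real survey data.

```python
import random

rng = random.Random(42)   # fixed seed so the sketch is repeatable

# Hypothetical population: 1 = uses the service, 0 = does not.
panel = [1] * 12_000 + [0] * 8_000        # self-selecting panel, 60% users
others = [1] * 18_000 + [0] * 62_000      # everyone else, 22.5% users
population = panel + others
true_p = sum(population) / len(population)     # 0.30 by construction

# Small random sample from the whole population.
est_random = sum(rng.sample(population, 1_000)) / 1_000
# Large sample drawn only from the panel.
est_panel = sum(rng.sample(panel, 10_000)) / 10_000

print(f"true proportion      {true_p:.3f}")
print(f"random, n = 1,000    {est_random:.3f}")
print(f"panel,  n = 10,000   {est_panel:.3f}")
```

The panel estimate settles close to the panel's own rate rather than the population's, so the larger sample is more precise about the wrong value.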
Three patterns repeatedly cost candidates marks on Paper 3 sampling-design questions. They are all about connecting design to consequence, not about technique.
This pattern is endemic to Paper 3 sampling questions: candidates know the methods, lose marks on the inferential consequences.
Sampling theory is one of the rare A-Level topics that scales directly into research practice across multiple disciplines:
Oxbridge interview prompt: "A pollster reports that 52% of voters support Party X, with a margin of error of ±3%. Their sample is 1,000 people contacted by landline telephone. What does the margin of error actually measure, and what does it not measure? Now suggest two ways the headline figure could be wrong by more than 3% even if the arithmetic is impeccable."
A common A* trap on AQA 7357 Paper 3 is to set proportional-allocation arithmetic that does not divide cleanly. The technique is to compute the exact proportional sample size, then round in a defensible way and verify the total.
Worked example: A school has 740 students in three year groups: Year 12 (310), Year 13 (270) and an extended-programme cohort (160). A sample of 100 students is required, stratified by year group.
Compute the sampling fraction: f=100/740=5/37≈0.1351.
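Apply to each stratum:
Year 12: $310 \times \frac{5}{37} \approx 41.89$
Year 13: $270 \times \frac{5}{37} \approx 36.49$
Extended programme: $160 \times \frac{5}{37} \approx 21.62$
Rounding each to the nearest integer gives 42, 36 and 22, and $42 + 36 + 22 = 100$, so here the rounded values happen to total correctly.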
Why A* candidates verify and adjust: rounding to the nearest integer does not always reproduce the required total (a set of rounded values can sum to 99 or 101 instead of 100). A defensible procedure is the largest-remainder method: round every stratum down, then add 1 to the strata with the largest fractional parts until the total hits the target. The phrase to use in an exam: "rounded to integers giving 42+36+22=100, which matches the required sample size; if the totals had not matched, I would have adjusted using the strata with the largest fractional remainders."
A subtlety: when the proportional shares are not all whole numbers, exact proportional allocation is impossible; the best one can do is approximate. AQA mark schemes accept any rounding scheme that (i) is justified, (ii) totals correctly, and (iii) respects the relative ordering of stratum sizes. Candidates who round arbitrarily and produce a total of 99 or 101 lose the verification A1 every time.
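One rounding scheme that always totals correctly is to round every share down and then hand the leftover places to the strata with the largest remainders. The sketch below applies it to the 740-student example; the function name `largest_remainder_allocation` is hypothetical, and this is one defensible scheme rather than a method prescribed by AQA.

```python
def largest_remainder_allocation(stratum_sizes, total_n):
    """Round shares down, then give spare places to the largest remainders."""
    population = sum(stratum_sizes.values())
    allocation, remainders = {}, {}
    for name, size in stratum_sizes.items():
        # Exact integer arithmetic: share = quotient + remainder / population.
        allocation[name], remainders[name] = divmod(size * total_n, population)
    shortfall = total_n - sum(allocation.values())
    by_remainder = sorted(remainders, key=remainders.get, reverse=True)
    for name in by_remainder[:shortfall]:
        allocation[name] += 1
    return allocation

school = {"Year 12": 310, "Year 13": 270, "Extended": 160}
result = largest_remainder_allocation(school, 100)
print(result, sum(result.values()))
```

For the school example this returns 42, 36 and 22, the same values as nearest-integer rounding, and the total is 100 by construction.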
This content is aligned with the AQA A-Level Mathematics (7357) specification, Paper 3 - Statistics, Section K: Statistical Sampling. For the most accurate and up-to-date information, please refer to the official AQA specification document.
graph TD
A["Population defined by<br/>research question"] --> B{"Sampling frame<br/>available?"}
B -->|"Yes, complete"| C{"Population<br/>structure?"}
B -->|"No / incomplete"| Z["Coverage bias<br/>flag explicitly"]
C -->|"Homogeneous"| D["Simple random sampling<br/>from full frame"]
C -->|"Heterogeneous strata,<br/>known sizes"| E["Stratified sampling<br/>proportional allocation"]
C -->|"Ordered list,<br/>no periodic pattern"| F["Systematic sampling<br/>every k-th element"]
C -->|"Geographic / group clusters,<br/>frame only at cluster level"| G["Cluster sampling<br/>whole groups selected"]
C -->|"Frame incomplete<br/>or unavailable"| H["Opportunity / quota<br/>flag bias direction"]
D --> I["Compute estimate<br/>and standard error"]
E --> I
F --> I
G --> I
H --> J["Critique:<br/>direction of bias?<br/>magnitude?"]
I --> K{"Non-response<br/>significant?"}
K -->|"Yes"| L["Investigate<br/>response mechanism"]
K -->|"No"| M["Inference valid<br/>for sampled population"]
J --> M
L --> M
Z --> J
style E fill:#27ae60,color:#fff
style M fill:#3498db,color:#fff