Understanding data types and sampling methods is the foundation of all statistics work in GCSE Mathematics. Before you can analyse data, you need to know what kind of data you are dealing with and how to collect it reliably.
| Type | Definition | Examples |
|---|---|---|
| Qualitative (categorical) | Non-numerical; describes qualities or categories | Favourite colour, type of car, eye colour |
| Quantitative | Numerical; can be measured or counted | Height, number of siblings, temperature |
Quantitative data is further divided:
| Type | Definition | Examples |
|---|---|---|
| Discrete | Can only take specific values (often whole numbers); you count it | Number of pets (0, 1, 2, 3, …), shoe size (5, 5.5, 6, …), dice score |
| Continuous | Can take any value within a range; you measure it | Height (172.4 cm), time (13.562 s), mass (68.3 kg) |
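Key tip: Ask yourself whether the value could fall anywhere between two possible values (for example 3.7, 3.71 or 3.712). If yes, the data is continuous; if only certain separate values are possible, it is discrete.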
| Type | Definition | Advantages | Disadvantages |
|---|---|---|---|
| Primary | Data you collect yourself (surveys, experiments, observations) | Tailored to your purpose; you control accuracy | Time-consuming and expensive |
| Secondary | Data collected by someone else (internet, newspapers, databases) | Quick to obtain; large samples available | May not be exactly what you need; may be out of date or biased |
Every member of the population has an equal chance of being selected.
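How to do it:
1. Obtain a complete list of the population and give each member a unique number.
2. Use a random number generator (or draw numbers from a hat) to pick as many numbers as the sample size, ignoring repeats.
3. Select the members whose numbers were picked.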
A school has 1,200 students. The head teacher wants a sample of 60. She assigns each student a number from 0001 to 1200 and uses a random number generator to pick 60 numbers.
Solution: Each of the 1,200 students has probability 60 ÷ 1200 = 1/20 of being selected. Because every student has an equal chance, the sample is unbiased. The head teacher must obtain all 60 selected students' responses — if some refuse, the sample becomes biased through non-response.
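The head teacher's method can be sketched in a few lines of Python. This is only an illustration, not part of the GCSE specification; the function name `simple_random_sample` and the optional `seed` parameter are made up for this example.

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Pick sample_size distinct student numbers from 1..population_size."""
    rng = random.Random(seed)  # seed is optional; it only makes a demo repeatable
    # sample() draws without replacement, so no student is chosen twice
    return sorted(rng.sample(range(1, population_size + 1), sample_size))

chosen = simple_random_sample(1200, 60, seed=1)
print(len(chosen))                        # 60
print([f"{n:04d}" for n in chosen[:5]])   # first few numbers, zero-padded like 0001
```

Because the generator treats every number from 1 to 1200 alike, each student has the same 1/20 chance of being picked.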
| Advantages | Disadvantages |
|---|---|
| Free from bias | Need a complete list of the population (sampling frame) |
| Easy to understand | May not be representative if sample is small |
Select every kth member from a list after a random start.
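How to do it:
1. List the population in order and number each member.
2. Work out the interval k = population size ÷ sample size.
3. Pick a random starting point between 1 and k.
4. Select every kth member from that starting point.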
Population = 500, sample size = 25. Choose a systematic sample.
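Solution: The interval is k = 500 ÷ 25 = 20. Choose a random starting number between 1 and 20 (say 7), then select every 20th member: 7, 27, 47, …, 487. This gives 25 members in total.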
| Advantages | Disadvantages |
|---|---|
| Simple and quick | Need a list; if there is a pattern in the list, results may be biased |
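Systematic sampling, as defined above (every kth member after a random start), can be sketched in Python. This is a minimal illustration that assumes the population size is an exact multiple of the sample size; the name `systematic_sample` and the `seed` parameter are invented for this example.

```python
import random

def systematic_sample(population_size, sample_size, seed=None):
    """Return the list positions chosen by a systematic sample."""
    k = population_size // sample_size   # sampling interval, e.g. 500 // 25 = 20
    rng = random.Random(seed)
    start = rng.randint(1, k)            # random start between 1 and k inclusive
    # take the start, then every kth position after it
    return [start + i * k for i in range(sample_size)]

picked = systematic_sample(500, 25, seed=3)
print(picked[:3], "...", picked[-1])
```

Only the starting point is random. Once it is fixed, every other choice follows from it, which is why a repeating pattern in the list can bias the result.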
The population is divided into strata (groups) based on a characteristic (e.g. year group, gender). A proportional sample is taken from each stratum.
Formula:
Number from stratum = (number in stratum ÷ total population) × sample size
A school surveys 50 students. The year groups are:
| Year group | Number of students |
|---|---|
| Year 7 | 180 |
| Year 8 | 200 |
| Year 9 | 190 |
| Year 10 | 160 |
| Year 11 | 170 |
| Total | 900 |
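Solution: Apply the formula to each year group and round to the nearest whole number.
| Year group | Calculation | Sample |
|---|---|---|
| Year 7 | (180 ÷ 900) × 50 = 10 | 10 |
| Year 8 | (200 ÷ 900) × 50 ≈ 11.1 | 11 |
| Year 9 | (190 ÷ 900) × 50 ≈ 10.6 | 11 |
| Year 10 | (160 ÷ 900) × 50 ≈ 8.9 | 9 |
| Year 11 | (170 ÷ 900) × 50 ≈ 9.4 | 9 |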
Check: 10 + 11 + 11 + 9 + 9 = 50 ✓
Within each stratum, students are selected using a random method.
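The stratified formula can be checked with a short Python sketch that applies it to the five year groups above. It is an illustration only; `stratified_allocation` is a made-up helper name, and it simply rounds each result to the nearest whole number.

```python
def stratified_allocation(strata, sample_size):
    """Number from stratum = (number in stratum / total population) * sample size."""
    total = sum(strata.values())
    # Multiply before dividing to limit floating-point error.
    # Note: Python's round() sends exact halves to the nearest even number;
    # no exact halves occur in this example.
    return {name: round(count * sample_size / total)
            for name, count in strata.items()}

year_groups = {"Year 7": 180, "Year 8": 200, "Year 9": 190,
               "Year 10": 160, "Year 11": 170}
allocation = stratified_allocation(year_groups, 50)
print(allocation)
print(sum(allocation.values()))  # 50
```

Printing the total at the end mirrors the check in the worked example: the stratum sizes should add up to the required sample size.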
A factory employs 80 line workers, 15 supervisors and 5 managers. A stratified sample of 20 is required.
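Solution: Total workforce = 80 + 15 + 5 = 100. Line workers: (80 ÷ 100) × 20 = 16. Supervisors: (15 ÷ 100) × 20 = 3. Managers: (5 ÷ 100) × 20 = 1. Check: 16 + 3 + 1 = 20 ✓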
A survey of 400 people gave a stratified sample of 60. One stratum (18–25 year olds) contained 120 people in the original population. How many 18–25 year olds are in the sample?
Solution: Number in sample = (120 ÷ 400) × 60 = 0.3 × 60 = 18 people.
| Advantages | Disadvantages |
|---|---|
| Guarantees proportional representation | Need to know the strata sizes |
| More representative than simple random | More complex to organise |
Bias means the sample is not representative of the population — certain groups are over- or under-represented.
| Source | Example |
|---|---|
| Non-random selection | Only surveying your friends |
| Time of collection | Surveying a high street at 10 a.m. on Monday misses workers |
| Leading questions | "Don't you agree that …?" pushes people towards an answer |
| Non-response | People who don't reply may have different views |
| Too small a sample | Fewer people = less likely to be representative |
| Location bias | Only surveying at one location |
Priya wants to find out how much time students at her school spend on homework. She decides to ask all 30 students in her maths class. Give two reasons why this may not give a representative sample, and suggest an improvement.
Solution:
Improvement: Use stratified random sampling across all year groups to get a proportional sample of ~30 students, selected randomly within each year.
A researcher surveys shoppers outside a supermarket at 11 a.m. on a Tuesday to estimate the average household's weekly shop. State two sources of bias.
Solution:
Edexcel exam tip: A common question asks you to criticise a sampling method. Look for: too small, not random, only one location, leading questions, or a specific time that excludes groups.
A secondary school has students across three year groups: Year 9 has 240 students, Year 10 has 180 students and Year 11 has 180 students. The head of year wants a stratified sample of 50 students to take part in a lesson-time survey. How many students should be chosen from each year group?
Step 1: Find the total population: 240+180+180=600.
Step 2: Find the sampling fraction: 50 ÷ 600 = 1/12.
Step 3: Apply the fraction to each stratum.
| Year group | Population | Calculation | Sample |
|---|---|---|---|
| Year 9 | 240 | 240 × 1/12 | 20 |
| Year 10 | 180 | 180 × 1/12 | 15 |
| Year 11 | 180 | 180 × 1/12 | 15 |
| Total | 600 | | 50 |
Check: 20+15+15=50. Always add the strata back together as a final sanity check — many students lose a mark through arithmetic slips at this stage.
Rounding rule: If a calculation gives a non-integer (e.g. 14.6), round to the nearest whole number, then check that the total still equals the required sample size. If it does not, adjust by one the stratum whose unrounded value was closest to the halfway point, since that rounding decision was the most marginal.
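When plain rounding leaves the total one short or one over, one systematic way to adjust is the largest-remainder method: round every stratum down, then give the leftover places to the strata with the biggest fractional parts. This is a common alternative to the rule above, not necessarily the method a mark scheme expects. The sketch below is illustrative only; the function name and the club data are hypothetical.

```python
def allocate_with_adjustment(strata, sample_size):
    """Largest-remainder allocation: totals always match sample_size."""
    total = sum(strata.values())
    base, remainders = {}, {}
    for name, count in strata.items():
        # whole-number share and what is left over, using integer arithmetic
        base[name], remainders[name] = divmod(count * sample_size, total)
    leftover = sample_size - sum(base.values())
    # hand the leftover places to the strata with the largest remainders
    for name in sorted(remainders, key=remainders.get, reverse=True)[:leftover]:
        base[name] += 1
    return base

clubs = {"A": 34, "B": 33, "C": 33}   # hypothetical strata sizes
result = allocate_with_adjustment(clubs, 10)
print(result)  # {'A': 4, 'B': 3, 'C': 3}
```

Here plain rounding would give 3 + 3 + 3 = 9, one short of the required 10, so the extra place goes to stratum A, whose exact share of 3.4 has the largest fractional part.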
A town council wants the opinions of residents about a new cycle lane. They stand at a bus stop at 10 a.m. on a Tuesday and ask the first 60 people who pass.
Discuss whether this produces a representative sample.
Solution: The sample is biased for several reasons:
A better method would be a stratified random sample taken from the full electoral roll, using categories such as age band and mode of transport so that each group is proportionally represented.
A biologist catches 80 fish from a lake, tags them, and releases them. One week later she catches 100 fish, of which 16 are tagged. Estimate the total population of fish in the lake.
Solution: Using the capture-recapture formula (tagged in sample ÷ total sample) = (total tagged ÷ population), with N for the population:
16/100 = 80/N
Rearranging: N = (80 × 100) ÷ 16 = 8000 ÷ 16 = 500 fish.
Assumptions: No fish entered or left the lake; tags did not fall off; tagged fish mix uniformly with untagged fish; tagged and untagged fish are equally likely to be caught.
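The fish estimate can be reproduced with a short Python sketch. It is an illustration under the assumptions just listed; the function and parameter names are invented for this example.

```python
def capture_recapture_estimate(total_tagged, second_sample_size, tagged_in_second_sample):
    """Estimate population N from tagged_in_sample / sample = total_tagged / N."""
    if tagged_in_second_sample == 0:
        # with no tagged animals recaptured, the proportion gives no estimate
        raise ValueError("at least one tagged animal must be recaptured")
    return total_tagged * second_sample_size / tagged_in_second_sample

estimate = capture_recapture_estimate(80, 100, 16)
print(estimate)  # 500.0
```

The result is an estimate, not an exact count: a different second catch would usually contain a different number of tagged fish and so give a different value of N.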
Rewrite the question: "Don't you agree that the new uniform is much better than the old one?" to remove bias.
Solution: The question is leading (it suggests the "correct" answer). A neutral version:
"How do you rate the new uniform compared with the old one? Much better / Slightly better / About the same / Slightly worse / Much worse."
Balanced response options with equal positive and negative choices reduce the risk of skewing results.
Exam-style question (4 marks): A gym has 2,400 members. The owner wants a stratified sample of 60 members across age bands. There are 600 under-25s, 1,200 members aged 25–50 and 600 over-50s. Work out how many members from each band should be chosen and explain why stratified sampling is appropriate.
Edexcel alignment: This content is aligned with Edexcel GCSE Mathematics (1MA1) specification — specifically Topic S (S1 Sampling). Assessed on Papers 2 and 3.