This lesson covers how to compare two (or more) data sets using averages and measures of spread — a skill that appears regularly on the AQA GCSE Mathematics exam. Simply calculating values is not enough; you must be able to write meaningful comparison statements that interpret the numbers in context.
In many real-life situations, we need to compare data sets to make decisions or draw conclusions.
To compare distributions, you use two types of measure:
| Type | Purpose | Examples |
|---|---|---|
| Measure of average (central tendency) | Summarises the typical value | Mean, median, mode |
| Measure of spread (dispersion) | Shows how spread out the data is | Range, IQR |
When comparing two distributions, you should always make two statements: one comparing an average and one comparing a measure of spread.
Exam Tip: The AQA mark scheme almost always requires two comparison statements: one about average and one about spread. Writing only one will lose you marks. Each statement must be in context (referring to the actual situation, not just numbers).
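A good comparison statement has three parts: it names the measure, states which group's value is higher or lower (quoting both values), and interprets the difference in context.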
For average: "The [mean/median] for [Group A] is [higher/lower] than for [Group B] ([value A] compared to [value B]), which shows that on average, [Group A] [did better/worse / had more/less / was higher/lower]."
For spread: "The [range/IQR] for [Group A] is [larger/smaller] than for [Group B] ([value A] compared to [value B]), which shows that [Group A's] data is [more/less] consistent (or spread out)."
Two classes took the same test. Here are the results:
| Measure | Class A | Class B |
|---|---|---|
| Mean | 64 | 58 |
| Median | 62 | 60 |
| Range | 45 | 28 |
| IQR | 18 | 12 |
Comparison statements:
"The mean score for Class A (64) is higher than for Class B (58), which shows that on average, Class A performed better in the test."
"The range for Class A (45) is larger than for Class B (28), which shows that the scores in Class A were more spread out and less consistent than in Class B."
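Or using IQR:
"The IQR for Class A (18) is larger than for Class B (12), which shows that the middle 50% of scores in Class A were more spread out and less consistent than in Class B."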
graph TD
A[Comparing Distributions] --> B[Step 1: Compare an Average]
A --> C[Step 2: Compare a Measure of Spread]
B --> D[Mean or Median]
C --> E[Range or IQR]
D --> F[State which is higher/lower]
E --> G[State which is larger/smaller]
F --> H[Interpret in context]
G --> I[Interpret in context]
Exam Tip: Never just write "Class A is higher" without saying what is higher. Always name the measure: "The median for Class A is higher than for Class B." Then explain what this means in the context of the question. Generic statements without context will not earn full marks.
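The two-step process can be sketched in code as a rough illustration. The Python below is a minimal sketch, not part of the AQA mark scheme: the function names `average_statement` and `spread_statement` are hypothetical, the wording is fixed to the test-score context of the Class A and Class B example, and ties between the two groups are not handled.

```python
def average_statement(measure, a, b):
    """Compare an average. a and b are (group name, value) pairs."""
    # Put the group with the higher value first (ties are not handled).
    (hi_name, hi_val), (lo_name, lo_val) = sorted(
        [a, b], key=lambda group: group[1], reverse=True
    )
    return (
        f"The {measure} score for {hi_name} ({hi_val}) is higher than for "
        f"{lo_name} ({lo_val}), which shows that on average, {hi_name} "
        f"performed better in the test."
    )


def spread_statement(measure, a, b):
    """Compare a measure of spread. a and b are (group name, value) pairs."""
    (hi_name, hi_val), (lo_name, lo_val) = sorted(
        [a, b], key=lambda group: group[1], reverse=True
    )
    return (
        f"The {measure} for {hi_name} ({hi_val}) is larger than for "
        f"{lo_name} ({lo_val}), which shows that the scores in {hi_name} "
        f"were more spread out and less consistent than in {lo_name}."
    )


# Values taken from the Class A / Class B table.
print(average_statement("mean", ("Class A", 64), ("Class B", 58)))
print(spread_statement("range", ("Class A", 45), ("Class B", 28)))
```

Each function names the measure, quotes both values, and ends with an interpretation in context, which mirrors the structure of the model answers.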
| Situation | Best Average | Best Spread |
|---|---|---|
| Data with no outliers | Mean | Range |
| Data with outliers or skewed data | Median | IQR |
| Comparing box plots | Median | IQR |
| Comparing from grouped frequency tables | Estimated mean | Cannot find exact range; comment on class intervals |
| Data is categorical | Mode | Not applicable |
The range is affected by extreme values (outliers). A single very high or very low value can make the range misleadingly large.
The IQR only considers the middle 50% of the data, so it is not affected by outliers and gives a more reliable measure of spread.
Exam Tip: If you have a choice between using range and IQR, state that the IQR is preferred because it is "not affected by extreme values" and gives a better representation of the typical spread.
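To see the effect of an outlier on each measure, here is a minimal Python sketch. The eleven scores are made up purely for illustration, and the helper names `data_range` and `iqr` are hypothetical. Quartile conventions vary between textbooks and software; this sketch relies on the default method of `statistics.quantiles`, which for eleven values picks the 3rd and 9th values as Q1 and Q3.

```python
from statistics import quantiles


def data_range(values):
    """Range = largest value minus smallest value."""
    return max(values) - min(values)


def iqr(values):
    """Interquartile range = Q3 minus Q1 (default 'exclusive' method)."""
    q1, _median, q3 = quantiles(values, n=4)
    return q3 - q1


# Hypothetical test scores, invented for this illustration.
scores = [52, 55, 58, 60, 61, 63, 64, 66, 68, 70, 72]

# Same data, but the top score is replaced by an extreme value.
with_outlier = scores[:-1] + [98]

print(data_range(scores), iqr(scores))
print(data_range(with_outlier), iqr(with_outlier))
```

In this made-up data the single extreme value more than doubles the range (20 becomes 46) while the IQR stays at 10, because the quartiles sit in the middle of the data and do not move.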
Box plots are particularly useful for comparing distributions because they display the median, quartiles, IQR, and range visually.
Two box plots show the times (in seconds) for students in two sports clubs to complete a sprint.
| Measure | Club A | Club B |
|---|---|---|
| Minimum | 11.2 | 10.8 |
| Q1 | 12.5 | 12.0 |
| Median | 13.8 | 13.2 |
| Q3 | 15.0 | 13.9 |
| Maximum | 17.5 | 15.2 |
| IQR | 2.5 | 1.9 |
| Range | 6.3 | 4.4 |
Comparison:
"The median time for Club A (13.8 seconds) is higher than for Club B (13.2 seconds), which shows that on average, Club B completed the sprint faster than Club A."
"The IQR for Club A (2.5 seconds) is greater than for Club B (1.9 seconds), which shows that Club B's times were more consistent and less spread out than Club A's."
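The IQR and range rows of the table can be checked from the five values read off each box plot. This is a small Python sketch; the dictionary layout and the helper name `spread` are illustrative choices, and results are rounded to one decimal place to avoid floating-point noise.

```python
# Five-number summaries read from the sprint-time table (seconds).
clubs = {
    "Club A": {"min": 11.2, "q1": 12.5, "median": 13.8, "q3": 15.0, "max": 17.5},
    "Club B": {"min": 10.8, "q1": 12.0, "median": 13.2, "q3": 13.9, "max": 15.2},
}


def spread(summary):
    """Return (IQR, range) for one five-number summary, rounded to 1 d.p."""
    iqr = round(summary["q3"] - summary["q1"], 1)
    data_range = round(summary["max"] - summary["min"], 1)
    return iqr, data_range


for name, summary in clubs.items():
    iqr, data_range = spread(summary)
    print(f"{name}: median {summary['median']}, IQR {iqr}, range {data_range}")

# For sprint times a lower median means a faster club.
faster = min(clubs, key=lambda name: clubs[name]["median"])
print(f"Faster on average: {faster}")
```

Note that the context matters when interpreting the average: for times, the lower median indicates the better performance.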
When comparing distributions from grouped frequency tables, use the estimated mean and comment on the modal class and the class containing the median.
Two factories recorded the weights (in grams) of their products.
Factory X:
| Weight (w grams) | Frequency |
|---|---|
| 95 < w ≤ 100 | 5 |
| 100 < w ≤ 105 | 18 |
| 105 < w ≤ 110 | 12 |
| 110 < w ≤ 115 | 5 |
Estimated mean for Factory X = (97.5 × 5 + 102.5 × 18 + 107.5 × 12 + 112.5 × 5) / 40 = (487.5 + 1845 + 1290 + 562.5) / 40 = 4185 / 40 = 104.625 g
Factory Y:
| Weight (w grams) | Frequency |
|---|---|
| 95 < w ≤ 100 | 2 |
| 100 < w ≤ 105 | 10 |
| 105 < w ≤ 110 | 22 |
| 110 < w ≤ 115 | 6 |
Estimated mean for Factory Y = (97.5 × 2 + 102.5 × 10 + 107.5 × 22 + 112.5 × 6) / 40 = (195 + 1025 + 2365 + 675) / 40 = 4260 / 40 = 106.5 g
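The two estimated means can be reproduced with a short Python sketch. Each class is stored as a (lower bound, upper bound, frequency) tuple, which is just one convenient layout, and the function names `estimated_mean` and `modal_class` are hypothetical.

```python
def estimated_mean(table):
    """Estimate the mean from (lower, upper, frequency) rows using midpoints."""
    total = sum((lower + upper) / 2 * freq for lower, upper, freq in table)
    count = sum(freq for _lower, _upper, freq in table)
    return total / count


def modal_class(table):
    """Return the (lower, upper) bounds of the class with the highest frequency."""
    lower, upper, _freq = max(table, key=lambda row: row[2])
    return lower, upper


# Weights in grams; each row means lower < w <= upper.
factory_x = [(95, 100, 5), (100, 105, 18), (105, 110, 12), (110, 115, 5)]
factory_y = [(95, 100, 2), (100, 105, 10), (105, 110, 22), (110, 115, 6)]

print(estimated_mean(factory_x), modal_class(factory_x))
print(estimated_mean(factory_y), modal_class(factory_y))
```

The midpoint of each class stands in for every value in that class, which is why the result is only an estimate of the mean.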
Comparison: