An average is a single number that summarises a whole set of data — a "typical" value. GCSE Mathematics uses three different averages, the mean, the median and the mode, plus the range to describe how spread out the data is. Each average has strengths and weaknesses, and a large part of the skill at OCR GCSE Mathematics (J560) is choosing the right one and justifying that choice. This lesson covers calculating all three from lists and frequency tables, finding the range, dealing with outliers, and explaining which average best represents a data set.
The topic is heavily assessed across all the papers. It blends AO1 (carrying out the calculations) with AO2 (choosing and justifying an appropriate average) and AO3 (working backwards from an average to a missing value). OCR command words here include "Work out", "Calculate", "Write down" and "Give a reason for your answer".
| Term | Meaning |
|---|---|
| Mean | The sum of all values divided by how many values there are |
| Median | The middle value when the data is put in order |
| Mode | The value that occurs most often |
| Range | The largest value minus the smallest value |
| Outlier | A value that is much larger or smaller than the rest |
| Bimodal | Having two modes |
| $\bar{x}$ | The symbol for the mean of a data set |
The mean uses every value: add them all and divide by how many there are.
$$\bar{x} = \frac{\text{sum of values}}{\text{number of values}}$$
The median is the middle value once the data is in order. For $n$ values, the median is in position $\frac{n+1}{2}$. If $n$ is even this lands between two values, so you take their mean.
The mode is the most common value. A data set can have no mode (all values different), one mode, or two modes (bimodal). The mode is the only average that works for qualitative data such as favourite colour.
The range measures spread: range=largest−smallest. A small range means consistent data; a large range means widely spread data. The range is not an average — it describes variability, not the centre.
Find the mean, median, mode and range of: 4, 7, 7, 9, 13.
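Solution: Mean $= \frac{4+7+7+9+13}{5} = \frac{40}{5} = 8$. Median = 3rd of 5 ordered values $= 7$. Mode $= 7$ (occurs twice). Range $= 13 - 4 = 9$.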
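The four measures for this list can also be checked by computer. The following is a minimal sketch using Python's standard `statistics` module; the variable names are illustrative only, and the range is computed by hand because the module has no range function.

```python
from statistics import mean, median, multimode

data = [4, 7, 7, 9, 13]  # the list from the question above

data_mean = mean(data)              # sum of values / number of values
data_median = median(data)          # middle value once the data is sorted
data_modes = multimode(data)        # list of the most common value(s)
data_range = max(data) - min(data)  # largest minus smallest

print("mean:", data_mean)
print("median:", data_median)
print("mode(s):", data_modes)
print("range:", data_range)
```

`multimode` returns a list, so a bimodal data set would show two entries rather than one.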
Find the median of: 12, 5, 9, 3, 14, 8.
Solution: First order the data: 3, 5, 8, 9, 12, 14. There are $n = 6$ values, so the median is between positions 3 and 4: the mean of 8 and 9 is $\frac{8+9}{2} = 8.5$.
Common error: finding the "middle" of the unordered list. You must always order the data first.
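To see why ordering matters, here is a short Python sketch of the position rule, where the median sits at position (n + 1) / 2 of the ordered data. The function name `median_by_position` is made up for this illustration; it also computes the "middle" of the unordered list to show the common error.

```python
def median_by_position(values):
    """Median using the (n + 1) / 2 position rule (positions count from 1)."""
    if not values:
        raise ValueError("the median of an empty list is undefined")
    ordered = sorted(values)  # always order the data first
    n = len(ordered)
    if n % 2 == 1:
        # odd n: (n + 1) / 2 is a whole number, so there is one middle value
        return ordered[(n + 1) // 2 - 1]
    # even n: the position lies halfway between positions n/2 and n/2 + 1
    lower = ordered[n // 2 - 1]
    upper = ordered[n // 2]
    return (lower + upper) / 2


unordered = [12, 5, 9, 3, 14, 8]
correct_median = median_by_position(unordered)

# The common error: taking the middle pair without ordering first
wrong_median = (unordered[2] + unordered[3]) / 2

print("correct:", correct_median)
print("wrong (unordered):", wrong_median)
```

The two results differ, which is exactly the mistake the common error describes.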
The shoe sizes of a team are: 8, 9, 9, 10, 10, 10, 11. Write down the mode and the range.
Solution: Mode =10 (occurs three times). Range =11−8=3.
A data set is: 6, 6, 9, 9, 12. State the mode.
Solution: Both 6 and 9 occur twice, so the data is bimodal: the modes are 6 and 9.
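Finding modes is a counting exercise, which the sketch below makes explicit. The helper `modes` is a hypothetical name, and it assumes the convention used in this lesson: if every value occurs only once, the data has no mode, shown here as an empty list.

```python
from collections import Counter


def modes(values):
    """Return the mode(s) as a sorted list; an empty list means 'no mode'."""
    if not values:
        return []
    counts = Counter(values)        # how many times each value occurs
    highest = max(counts.values())  # the largest frequency
    if highest == 1 and len(counts) > 1:
        return []                   # all values different: no mode
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([8, 9, 9, 10, 10, 10, 11]))  # shoe sizes: one mode
print(modes([6, 6, 9, 9, 12]))           # bimodal data
print(modes([3, 5, 8, 9, 12, 14]))       # all values different
```

Because the function only counts occurrences, it works equally well on qualitative data such as a list of colour names.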
When data is in a frequency table, you do not list every value. To find the mean, add an f×x column (frequency multiplied by value), total it, and divide by the total frequency:
$$\bar{x} = \frac{\sum fx}{\sum f}$$
The table shows the number of pets owned by 30 households. Find the mean, median and mode.
| Pets (x) | Frequency (f) | f×x |
|---|---|---|
| 0 | 5 | 0 |
| 1 | 11 | 11 |
| 2 | 8 | 16 |
| 3 | 4 | 12 |
| 4 | 2 | 8 |
| Total | 30 | 47 |
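Solution: Mean $= \frac{\sum fx}{\sum f} = \frac{47}{30} = 1.57$ pets (2 d.p.). Median: $n = 30$, so the median is the mean of the 15th and 16th values. Cumulative frequencies: 5, 16, 24, 28, 30. Both the 15th and 16th values fall in the "1" row, so the median is 1 pet. Mode $= 1$ pet (highest frequency, 11).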
Common error: dividing ∑fx by the number of rows (5) instead of by ∑f (30).
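The frequency-table method translates directly into a few lines of Python. This is only a sketch: the table is stored as (value, frequency) pairs, and `mean_from_table` is an illustrative name. It also evaluates the common error so the two results can be compared.

```python
# (value, frequency) pairs from the pets table
pets_table = [(0, 5), (1, 11), (2, 8), (3, 4), (4, 2)]


def mean_from_table(table):
    """Mean of a frequency table: total of f×x divided by total frequency."""
    total_fx = sum(x * f for x, f in table)  # total of the f×x column
    total_f = sum(f for _, f in table)       # total frequency
    return total_fx / total_f


pets_mean = mean_from_table(pets_table)

# The common error: dividing by the number of rows instead of by total f
wrong_mean = sum(x * f for x, f in pets_table) / len(pets_table)

print("mean:", round(pets_mean, 2))
print("wrong mean:", wrong_mean)
```

The wrong answer is larger than the biggest value in the table (4 pets), which is a quick sanity check that something has gone astray: a mean can never lie outside the range of the data.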
The table shows the number of matches in 40 boxes.
| Matches | 48 | 49 | 50 | 51 | 52 |
|---|---|---|---|---|---|
| Frequency | 3 | 9 | 16 | 8 | 4 |
Work out the mean number of matches per box.
Solution: $\sum fx = 48 \times 3 + 49 \times 9 + 50 \times 16 + 51 \times 8 + 52 \times 4 = 144 + 441 + 800 + 408 + 208 = 2001$. Then $\bar{x} = \frac{2001}{40} = 50.025$ matches.
For the match data in Worked Example 6, find the median and the mode.
Solution: n=40, so the median is the mean of the 20th and 21st values. Cumulative frequencies: 3, 12, 28, 36, 40. Both the 20th and 21st values fall in the "50" row, so the median is 50 matches. The mode is the most frequent value, 50 matches (frequency 16).
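The cumulative-frequency search can be sketched in Python as follows. The function names are hypothetical, and the sketch assumes the table rows are already listed in increasing order of value, as they are in the matches table.

```python
from itertools import accumulate

# (value, frequency) pairs from the matches table, in increasing order
matches_table = [(48, 3), (49, 9), (50, 16), (51, 8), (52, 4)]


def value_at_position(table, position):
    """Return the data value at a given position (counting from 1)."""
    running_totals = accumulate(f for _, f in table)  # cumulative frequencies
    for (x, _), cumulative in zip(table, running_totals):
        if position <= cumulative:
            return x
    raise ValueError("position is beyond the total frequency")


def median_from_table(table):
    """Median of a frequency table using the (n + 1) / 2 position rule."""
    n = sum(f for _, f in table)
    if n % 2 == 1:
        return value_at_position(table, (n + 1) // 2)
    lower = value_at_position(table, n // 2)
    upper = value_at_position(table, n // 2 + 1)
    return (lower + upper) / 2


matches_median = median_from_table(matches_table)
# The mode is the value in the row with the highest frequency
matches_mode = max(matches_table, key=lambda row: row[1])[0]

print("median:", matches_median)
print("mode:", matches_mode)
```

The same two functions work for any ungrouped frequency table, for example the pets table stored as `[(0, 5), (1, 11), (2, 8), (3, 4), (4, 2)]`.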
No single average is always best. Use the table below.
| Average | Best when … | Weakness |
|---|---|---|
| Mean | data is numerical with no extreme values | distorted by outliers |
| Median | data has outliers or is skewed | ignores the actual sizes of most values |
| Mode | data is qualitative, or you want the most popular choice | may not exist, or may not be central |
The salaries (in £000s) of seven employees in a small firm are: 19, 21, 22, 24, 25, 27, 96. (a) Find the mean and the median. (b) Which average better describes a typical salary? Give a reason.
Solution: (a) Mean $= \frac{19+21+22+24+25+27+96}{7} = \frac{234}{7} = 33.43$ (2 d.p.), i.e. £33,428.57. Median = 4th of 7 ordered values $= 24$, i.e. £24,000. (b) The median (£24,000) is better. The value £96,000 is an outlier (probably the owner) that pulls the mean far above what most employees actually earn, so the mean of £33,429 is not typical.
The reasoning in part (b) is exactly what OCR is looking for when a question says "Give a reason for your answer". A bare answer of "median" earns little; the marks come from explaining that the £96,000 figure is unusually high, that it drags the mean upward, and therefore the median better represents a typical employee. Notice how all three averages tell a different story here: the mode would not even exist (every salary is different), the mean is distorted, and only the median sits sensibly in the middle of what most people earn. Whenever a data set has one value standing well apart from the rest, expect the median to be the fairer summary, and be ready to justify it in words.
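One way to back up that justification is to recalculate both averages with the £96,000 salary left out and see how far each one moves. The sketch below does this with Python's `statistics` module; removing the value is purely an illustration of its influence, not part of the exam method.

```python
from statistics import mean, median

salaries = [19, 21, 22, 24, 25, 27, 96]  # in £000s
without_outlier = [s for s in salaries if s != 96]

mean_with = mean(salaries)
median_with = median(salaries)
mean_without = mean(without_outlier)
median_without = median(without_outlier)

print("mean:  ", round(mean_with, 2), "->", mean_without)
print("median:", median_with, "->", median_without)
```

The mean falls by more than £10,000 when the outlier is removed, while the median moves by only £1,000, which is the numerical evidence behind choosing the median.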
The numbers of goals scored by a player in eleven matches are: 0, 0, 1, 1, 1, 2, 2, 2, 2, 3, 14. (a) Find the mean, median and mode. (b) Which average would the player's agent quote to make the player look best, and which gives the fairest picture? Explain.
Solution: (a) Mean $= \frac{0+0+1+1+1+2+2+2+2+3+14}{11} = \frac{28}{11} = 2.55$ goals (2 d.p.). Median = 6th of 11 ordered values $= 2$ goals. Mode $= 2$ goals (occurs four times). (b) The agent would quote the mean (2.55) because it is the highest, but it is inflated by the single 14-goal match (an outlier). The median (2) gives a fairer picture of a typical match, and here the mode also agrees at 2, so 2 goals is the most representative figure.
An outlier is a value far from the rest of the data. Outliers can be genuine extreme readings or mistakes in recording. They greatly affect the mean and the range, but barely affect the median and the mode. When an outlier is present, the median is usually the more representative average.
The times (in minutes) for eight students to complete a task are: 12, 13, 13, 14, 15, 15, 16, 58. (a) Identify the outlier. (b) Find the range with and without it, and comment.
Solution: (a) The outlier is 58 minutes, far above the cluster 12–16. (b) Range with the outlier $= 58 - 12 = 46$ minutes; without it $= 16 - 12 = 4$ minutes. The single outlier makes the range 11.5 times larger ($46 \div 4 = 11.5$), hugely overstating the true spread. This shows why the range is unreliable when outliers are present.
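The effect of the outlier on the range can be confirmed with a short Python sketch. The helper name `data_range` is illustrative, and the outlier is removed by value simply to compare the two ranges.

```python
times = [12, 13, 13, 14, 15, 15, 16, 58]  # minutes


def data_range(values):
    """Range = largest value minus smallest value."""
    return max(values) - min(values)


range_with = data_range(times)
range_without = data_range([t for t in times if t != 58])
ratio = range_with / range_without  # how many times larger the range becomes

print("range with outlier:", range_with)
print("range without outlier:", range_without)
print("ratio:", ratio)
```

Because the range depends only on the two most extreme values, a single unusual reading is enough to change it completely.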