Statistical Skills: Mean, Median, Mode & Quartiles

Statistical techniques allow geographers to summarise data, identify patterns and test whether results are significant. At GCSE you need to be able to calculate and interpret a range of statistical measures. This lesson covers all the statistical skills required for AQA GCSE Geography.

Measures of Central Tendency

Central tendency measures tell you the typical or average value in a dataset.

Mean

The mean is the sum of all values divided by the number of values.

Formula: Mean = Σx / n

where Σx = sum of all values and n = number of values

Example: River pebble sizes (mm): 12, 15, 18, 22, 25, 30, 35

Mean = (12 + 15 + 18 + 22 + 25 + 30 + 35) / 7 = 157 / 7 = 22.4 mm
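The same calculation can be checked with a few lines of Python. This is a minimal sketch using the pebble sizes from the example; the names `pebble_sizes`, `mean` and `pebble_mean` are chosen here for illustration.

```python
# Pebble sizes (mm) from the worked example above
pebble_sizes = [12, 15, 18, 22, 25, 30, 35]


def mean(values):
    """Return the sum of all values divided by the number of values."""
    return sum(values) / len(values)


pebble_mean = mean(pebble_sizes)
print(round(pebble_mean, 1))  # 22.4
```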

Advantages: Uses all the data; widely understood.

Disadvantages: Can be distorted by extreme values (outliers).

Median

The median is the middle value when the data is arranged in order from smallest to largest.

How to find it:

Arrange data in ascending order
If there is an odd number of values, the median is the middle one
If there is an even number of values, the median is the mean of the two middle values

Example: The same data in order: 12, 15, 18, 22, 25, 30, 35

Median = 22 mm (the 4th value out of 7)
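The odd and even rules above can be sketched in Python as follows. The function name `median` is an illustrative choice, and the second call drops the largest pebble purely to show the even-numbered case; it is not part of the original dataset.

```python
def median(values):
    """Return the middle value of the data once it is sorted."""
    ordered = sorted(values)          # arrange in ascending order
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:                    # odd count: take the middle value
        return ordered[middle]
    # even count: take the mean of the two middle values
    return (ordered[middle - 1] + ordered[middle]) / 2


print(median([12, 15, 18, 22, 25, 30, 35]))  # 22
print(median([12, 15, 18, 22, 25, 30]))      # 20.0
```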

Advantages: Not affected by extreme values; easy to find.

Disadvantages: Does not use all the data.

Mode

The mode is the most frequently occurring value.

Example: Pedestrian counts: 5, 8, 12, 12, 15, 18, 12, 20

Mode = 12 (appears 3 times)

A dataset can be:

Unimodal — one mode
Bimodal — two modes
No mode — if all values appear the same number of times
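One way to find the mode, and to tell these three cases apart, is sketched below. The function name `modes` is an illustrative choice; the first call uses the pedestrian counts from the example, while the second and third lists are made up only to show the bimodal and no-mode cases.

```python
from collections import Counter


def modes(values):
    """Return a sorted list of the most frequent value(s).

    Returns an empty list when every value appears equally often
    (treated here as "no mode"). Assumes the list is not empty.
    """
    counts = Counter(values)
    highest = max(counts.values())
    all_equal = all(c == highest for c in counts.values())
    if len(counts) > 1 and all_equal:
        return []
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([5, 8, 12, 12, 15, 18, 12, 20]))  # [12]      unimodal
print(modes([5, 8, 8, 12, 12]))               # [8, 12]   bimodal
print(modes([5, 8, 12]))                      # []        no mode
```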

Advantages: Easy to find; useful for categorical data.

Disadvantages: May not be representative; there may be no mode or several modes.

Exam Tip: The exam may ask you to choose the most appropriate measure of central tendency. The median is usually best when there are outliers. The mean is best when there are no extreme values to distort it. The mode is best for categorical data.

Measures of Dispersion

Dispersion measures tell you how spread out the data is.

Range

The range is the difference between the highest and lowest values.

Formula: Range = maximum value − minimum value

Example: Maximum pebble size = 35 mm, minimum = 12 mm

Range = 35 − 12 = 23 mm

Advantages: Simple to calculate.

Disadvantages: Only uses two values; heavily affected by outliers.
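A short sketch of the range calculation, and of how one outlier changes it, is given below. The 90 mm pebble is a made-up value added only to illustrate the disadvantage; it does not come from the example data.

```python
# Pebble sizes (mm) from the worked example above
pebble_sizes = [12, 15, 18, 22, 25, 30, 35]

data_range = max(pebble_sizes) - min(pebble_sizes)
print(data_range)  # 23

# Hypothetical outlier: a single 90 mm pebble added for illustration
with_outlier = pebble_sizes + [90]
range_with_outlier = max(with_outlier) - min(with_outlier)
print(range_with_outlier)  # 78
```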

Interquartile Range (IQR)

The IQR measures the spread of the middle 50% of the data, making it less affected by extreme values.

How to find it:

Arrange data in ascending order
Find the lower quartile (Q1), the median of the lower half (the values below the overall median)
Find the upper quartile (Q3), the median of the upper half (the values above the overall median)
IQR = Q3 − Q1

Example: Data (n=7): 12, 15, 18, 22, 25, 30, 35

Q1 (median of 12, 15, 18) = 15
Q3 (median of 25, 30, 35) = 30
IQR = 30 − 15 = 15 mm
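The median-of-halves method used in this example can be sketched in Python as below. The function names `median` and `quartiles` are illustrative choices. Other quartile conventions exist and can give slightly different answers for some datasets, so treat this as one possible approach rather than the only one.

```python
def median(values):
    """Return the middle value of the sorted data."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    For an odd number of values the overall median is left out of
    both halves, as in the worked example.
    """
    if len(values) < 2:
        raise ValueError("need at least two values")
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]
    upper = ordered[len(ordered) - half:]
    return median(lower), median(upper)


pebble_sizes = [12, 15, 18, 22, 25, 30, 35]
q1, q3 = quartiles(pebble_sizes)
iqr = q3 - q1
print(q1, q3, iqr)  # 15 30 15
```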

Exam Tip: The IQR is a better measure of spread than the range because it ignores extreme values at either end. If asked to justify your choice of measure, this is the key point to make.

Percentiles and Quartiles

Term	Definition
Lower quartile (Q1)	The value below which 25% of the data falls
Median (Q2)	The value below which 50% of the data falls
Upper quartile (Q3)	The value below which 75% of the data falls
Interquartile range	Q3 − Q1 (the middle 50% of the data)

Standard Deviation

The standard deviation measures how far values are spread from the mean. A small standard deviation means data is closely clustered around the mean; a large standard deviation means data is widely spread.

Formula:

SD = √(Σ(x − x̄)² / n)

where x = each value, x̄ = mean, n = number of values

Step-by-step method:

Calculate the mean (x̄)
Subtract the mean from each value (x − x̄)
Square each result (x − x̄)²
Sum all the squared differences: Σ(x − x̄)²
Divide by n
Take the square root

Example:

Value (x)	x − x̄	(x − x̄)²
12	−10.4	108.16
15	−7.4	54.76
18	−4.4	19.36
22	−0.4	0.16
25	2.6	6.76
30	7.6	57.76
35	12.6	158.76

Mean (x̄) = 22.4 mm. Sum of (x − x̄)² = 405.72. 405.72 / 7 = 57.96. √57.96 = 7.61, so the standard deviation is 7.61 mm.
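The six steps can be followed directly in Python, as in this minimal sketch. Variable names are illustrative. Because the code keeps the unrounded mean (22.43 rather than 22.4), its sum of squared differences is very slightly different from the table, but the final answer agrees to two decimal places. The last line cross-checks against `statistics.pstdev` from the standard library, which also divides by n.

```python
import math
import statistics

pebble_sizes = [12, 15, 18, 22, 25, 30, 35]

n = len(pebble_sizes)
mean = sum(pebble_sizes) / n                       # step 1: the mean
differences = [x - mean for x in pebble_sizes]     # step 2: x minus the mean
squared = [d ** 2 for d in differences]            # step 3: square each result
sum_squared = sum(squared)                         # step 4: sum the squares
divided = sum_squared / n                          # step 5: divide by n
sd = math.sqrt(divided)                            # step 6: square root

print(round(sd, 2))                                # 7.61
print(round(statistics.pstdev(pebble_sizes), 2))   # 7.61
```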

Exam Tip: You may be asked to calculate standard deviation in the exam. Always set up a table like the one above — it keeps your working clear and helps you avoid errors. The examiner can award method marks even if your final answer is wrong.

Spearman's Rank Correlation Coefficient

Spearman's rank measures the strength and direction of the relationship between two sets of ranked data, and lets you test whether that relationship is statistically significant.

Formula:

rs = 1 − (6Σd² / (n(n² − 1)))

where d = difference between the two ranks for each item, n = number of items

Steps:

Rank both sets of data (1 = highest or lowest — be consistent)
Calculate the difference between the ranks (d) for each pair
Square each difference (d²)
Sum all d² values
Substitute into the formula
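The steps above can be sketched in Python as follows. The data is invented purely for illustration: `distances` and `sizes` are hypothetical measurements at seven river sites and do not come from this lesson. The function names are also illustrative, and the ranking helper assumes there are no tied values.

```python
def rank_descending(values):
    """Rank values so that 1 = highest. Assumes no tied values."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]


def spearman_rs(xs, ys):
    """Spearman's rank correlation: 1 - 6*sum(d^2) / (n*(n^2 - 1))."""
    n = len(xs)
    rank_x = rank_descending(xs)
    rank_y = rank_descending(ys)
    # d is the difference between the two ranks for each pair
    sum_d_squared = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - (6 * sum_d_squared) / (n * (n ** 2 - 1))


# Hypothetical data: distance downstream (km) and pebble size (mm)
distances = [1, 2, 3, 4, 5, 6, 7]
sizes = [35, 25, 30, 22, 15, 18, 12]

rs = spearman_rs(distances, sizes)
print(round(rs, 2))  # -0.93
```

For this made-up data the sum of d² is 108, giving rs = 1 − 648/336, which rounds to −0.93.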

Interpreting the result:

rs value	Interpretation
+1.0	Perfect positive correlation
+0.5 to +1.0	Strong positive correlation
0 to +0.5	Weak positive correlation
0	No correlation
0 to −0.5	Weak negative correlation
−0.5 to −1.0	Strong negative correlation
−1.0	Perfect negative correlation

You then compare your rs value against the critical value for your sample size at the 0.05 significance level. If your rs value (ignoring its sign) exceeds the critical value, the result is statistically significant: there is less than a 5% probability that the relationship occurred by chance.

Exam Tip: You will be given the formula and a table of critical values in the exam. Focus on understanding the steps and interpreting the result — do not try to memorise the formula.

Chi-Squared Test

The chi-squared test checks whether there is a significant difference between observed and expected frequencies.

Formula:

χ² = Σ((O − E)² / E)

where O = observed frequency, E = expected frequency
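The formula can be applied in Python as in this minimal sketch. The observed counts are hypothetical (for example, pedestrians counted at four sites), and the expected frequencies assume an even spread across the sites. Neither comes from this lesson, and the function name `chi_squared` is an illustrative choice.

```python
def chi_squared(observed, expected):
    """Sum of (O - E)^2 / E across all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical observed counts at four sites
observed = [20, 30, 25, 25]

# Assumption: if there were no difference between sites, each would
# be expected to have the same share of the total (25 each)
expected = [sum(observed) / len(observed)] * len(observed)

chi2 = chi_squared(observed, expected)
print(chi2)  # 2.0
```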
