Statistics is a powerful tool in geographical investigation. Instead of relying on subjective observation (for example, "the pebbles seem to get smaller downstream"), you can use statistical techniques to measure, describe and test relationships in your data with mathematical precision. In the Edexcel B exam, you need to know how to calculate and interpret several key statistical measures, including Spearman's rank correlation coefficient.
This lesson covers the statistical techniques required for the specification, from basic measures of central tendency through to significance testing.
Central tendency tells you the typical or average value in a dataset. There are three measures you need to know:
The mean is the arithmetic average. Add up all the values and divide by the number of values.
Formula: Mean = Sum of all values / Number of values
Example: River velocities at a site: 0.3, 0.4, 0.5, 0.4, 0.6 m/s
Mean = (0.3 + 0.4 + 0.5 + 0.4 + 0.6) / 5 = 2.2 / 5 = 0.44 m/s
Advantages: Uses all the data; gives a precise single value. Limitations: Can be distorted by extreme values (outliers). For example, if the 0.5 m/s reading were mis-recorded as 2.5 m/s, the mean would jump to 0.84 m/s, which is unrepresentative of the site.
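The calculation above can be checked with a short Python sketch. The function name `mean` and the choice of which reading is replaced by the 2.5 m/s error are assumptions made for illustration; replacing the 0.5 m/s reading is the case that reproduces the 0.84 m/s figure.

```python
def mean(values):
    """Arithmetic mean: sum of the values divided by how many there are."""
    return sum(values) / len(values)

# River velocities (m/s) from the example above.
velocities = [0.3, 0.4, 0.5, 0.4, 0.6]
site_mean = mean(velocities)
print(round(site_mean, 2))  # 0.44

# Assumption: the 0.5 m/s reading is the one mis-recorded as 2.5 m/s.
with_error = [0.3, 0.4, 2.5, 0.4, 0.6]
error_mean = mean(with_error)
print(round(error_mean, 2))  # 0.84
```

A single faulty reading almost doubles the mean here, which is why outliers should be checked before the mean is quoted.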
The median is the middle value when all data is arranged in order from smallest to largest.
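Method:
1. Arrange the values in order from smallest to largest.
2. Find the middle value, which sits at position (n + 1) / 2 in a set of n values.
3. If n is even, that position falls between two values, so take the mean of those two.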
Example: Pebble sizes (mm): 12, 18, 24, 31, 45, 52, 88
Median = 31 mm (the 4th value in a set of 7)
Advantages: Not affected by extreme values; easy to find. Limitations: Does not use all the data; can be unrepresentative if data is clustered at one end.
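As a minimal sketch, the median can be found in Python by sorting the values and picking the middle one. The function name `median` is illustrative, and taking the mean of the two middle values for an even-sized dataset is the usual convention, assumed here.

```python
def median(values):
    """Middle value of the data once it is sorted."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd number of values: a single middle value exists.
        return ordered[mid]
    # Even number of values: average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2

# Pebble sizes (mm) from the example above.
pebbles_mm = [12, 18, 24, 31, 45, 52, 88]
print(median(pebbles_mm))  # 31
```

The 88 mm pebble could be replaced by any larger value and the median would still be 31 mm, which illustrates why the median resists outliers.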
The mode is the most frequently occurring value in a dataset.
Example: Environmental quality scores: 3, 4, 4, 4, 5, 5, 6, 7
Mode = 4 (appears 3 times)
Advantages: Shows the most common value; useful for categorical data. Limitations: There may be no mode (all values different) or multiple modes; does not use all the data.
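A short Python sketch can show both the mode and the two awkward cases mentioned above (no mode, or several modes). The function name `modes` and the decision to return an empty list when every value occurs only once are assumptions chosen for this illustration.

```python
from collections import Counter

def modes(values):
    """Return every value that shares the highest frequency.

    Returns an empty list if there is no data or no value repeats.
    """
    if not values:
        return []
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []  # all values different, so no mode
    return sorted(v for v, c in counts.items() if c == highest)

# Environmental quality scores from the example above.
scores = [3, 4, 4, 4, 5, 5, 6, 7]
print(modes(scores))  # [4]
```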
| Measure | Best For | Limitation |
|---|---|---|
| Mean | Normally distributed data without outliers | Distorted by extreme values |
| Median | Skewed data or data with outliers | Does not use all values |
| Mode | Categorical data or identifying the most common value | May not exist or may be misleading |
Exam Tip: If asked which measure of central tendency is most appropriate, consider whether there are outliers in the data. If there are, the median is usually better than the mean because it is not distorted by extreme values. Always explain your choice.
Dispersion (or spread) tells you how spread out the data is around the central value.
The range is the simplest measure of spread.
Range = highest value - lowest value
Example: Pebble sizes: 12, 18, 24, 31, 45, 52, 88 mm
Range = 88 - 12 = 76 mm
Advantages: Very easy to calculate. Limitations: Only uses the two most extreme values, so a single outlier can make the range misleadingly large.
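The sensitivity of the range to one extreme value can be seen in a small Python sketch. The function name `value_range` is illustrative, and dropping the largest pebble is simply a way to show the effect, not a recommended way to treat data.

```python
def value_range(values):
    """Range: highest value minus lowest value."""
    return max(values) - min(values)

# Pebble sizes (mm) from the example above.
pebbles_mm = [12, 18, 24, 31, 45, 52, 88]
print(value_range(pebbles_mm))  # 76

# Same data without the single largest pebble (88 mm).
without_largest = sorted(pebbles_mm)[:-1]
print(value_range(without_largest))  # 40
```

Removing one pebble cuts the range from 76 mm to 40 mm, so the range on its own says little about how the bulk of the data is spread.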
The interquartile range measures the spread of the middle 50% of the data, removing the influence of extreme values.
Method:
Example: Data set (already ordered): 5, 8, 12, 15, 18, 22, 25, 30, 35, 42, 48, 55
Advantages: Not affected by extreme values; more representative than the range. Limitations: More complex to calculate; ignores data outside the middle 50%.
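One possible way to work through the dataset above in Python is sketched below. It assumes the "median of each half" convention: the ordered data is split into a lower and an upper half, and the quartiles are the medians of those halves. Other conventions, such as taking the (n + 1) / 4 th value, give slightly different quartiles, so the numbers here are tied to that assumption. The function names are illustrative.

```python
def median(ordered):
    """Median of a list that is already sorted."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

def quartiles(values):
    """Lower and upper quartile using the median-of-halves convention.

    Assumption: with an odd number of values, the middle value is
    left out of both halves.
    """
    ordered = sorted(values)
    if len(ordered) < 2:
        raise ValueError("need at least two values")
    half = len(ordered) // 2
    lower = ordered[:half]
    upper = ordered[len(ordered) - half:]
    return median(lower), median(upper)

data = [5, 8, 12, 15, 18, 22, 25, 30, 35, 42, 48, 55]
q1, q3 = quartiles(data)
iqr = q3 - q1
print(q1, q3, iqr)  # 13.5 38.5 25.0
```

Under this convention the middle 50% of the data spans 25 units, compared with a full range of 50 (55 - 5).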
| Measure | Uses | Sensitivity to Outliers |
|---|---|---|
| Range | Quick overview of spread | Very sensitive — one outlier changes it dramatically |
| IQR | Robust measure of spread | Not sensitive — ignores the top and bottom 25% |
Spearman's rank correlation coefficient (rs) is the most important statistical test for GCSE Geography. It measures the strength and direction of a relationship between two variables using ranked data.
Use Spearman's rank when:
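rs = 1 - (6 × Σd²) / (n³ - n)
Where:
- d is the difference between the two ranks of each pair of values
- Σd² is the sum of the squared differences
- n is the number of pairs of values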
Example: Testing whether pebble size decreases with distance downstream
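As a hedged illustration of how the formula is applied, the Python sketch below ranks two variables, squares the rank differences and substitutes them into the formula. The distances and pebble sizes are hypothetical values invented for this sketch, not survey data, and the function names are illustrative. The simple formula is exact only when there are no tied values; the ranking helper gives tied values their mean rank, which is the usual approximation.

```python
def rank_descending(values):
    """Rank values so that the largest gets rank 1.

    Tied values share the mean of the ranks they would occupy.
    """
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]

def sum_squared_rank_differences(xs, ys):
    """Sum of d squared, where d is the rank difference for each pair."""
    rx, ry = rank_descending(xs), rank_descending(ys)
    return sum((a - b) ** 2 for a, b in zip(rx, ry))

def spearman_rs(xs, ys):
    """rs = 1 - (6 * sum of d squared) / (n cubed - n)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 pairs")
    n = len(xs)
    return 1 - (6 * sum_squared_rank_differences(xs, ys)) / (n ** 3 - n)

# HYPOTHETICAL data for illustration only.
distance_km = [1, 2, 3, 4, 5, 6, 7, 8]
pebble_mm = [90, 75, 80, 60, 45, 50, 30, 20]

sum_d2 = sum_squared_rank_differences(distance_km, pebble_mm)
rs = spearman_rs(distance_km, pebble_mm)
print(sum_d2)        # 164.0
print(round(rs, 2))  # -0.95
```

For these made-up values, rs is close to -1, which would indicate a strong negative relationship: as distance downstream increases, pebble size tends to decrease. Whether such a result is statistically significant depends on the number of pairs and must be checked separately.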