This lesson covers scatter graphs — how to plot them, identify the type of correlation, draw a line of best fit, and use it to make predictions. Scatter graphs are tested frequently on the AQA GCSE Mathematics paper and are a key topic for understanding the relationship between two variables.
A scatter graph (also called a scatter diagram or scatter plot) is used to show the relationship between two variables. Each point on the graph represents one piece of data with two values — one plotted on the x-axis and one on the y-axis.
Scatter graphs are used when you want to investigate whether there is a relationship (correlation) between two quantitative variables. For example:
A teacher recorded the number of hours of revision and the test score for 10 students.
| Hours of Revision | Test Score (%) |
|---|---|
| 2 | 35 |
| 3 | 42 |
| 4 | 50 |
| 5 | 55 |
| 5 | 60 |
| 6 | 62 |
| 7 | 70 |
| 8 | 75 |
| 9 | 82 |
| 10 | 90 |
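To plot the scatter graph, put hours of revision on the x-axis and test score on the y-axis, then plot each row of the table as one point. For example, the first row becomes the point (2, 35).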
Exam Tip: Plot each point with a small cross (×) or a small dot, not a large circle, and plot it accurately. Examiners check individual points and can deduct marks for inaccurate plotting.
Correlation describes the relationship between two variables on a scatter graph.
| Type | Description | What It Looks Like |
|---|---|---|
| Positive correlation | As one variable increases, the other also increases | Points slope upwards from left to right |
| Negative correlation | As one variable increases, the other decreases | Points slope downwards from left to right |
| No correlation | There is no clear relationship between the variables | Points are scattered randomly |
```mermaid
graph LR
    A[Correlation Types] --> B[Positive]
    A --> C[Negative]
    A --> D[No Correlation]
    B --> B1[Both variables increase together]
    C --> C1[One increases while other decreases]
    D --> D1[No clear pattern or relationship]
```
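Correlation can also be described by its strength. In a strong correlation the points lie close to a straight line. In a weak correlation the points are more spread out but still follow a general trend.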
Exam Tip: When describing a correlation, state both the type (positive or negative) and the strength (strong or weak). Writing "positive correlation" is good, but "strong positive correlation" is better and shows fuller understanding. If the points show no pattern, write "no correlation".
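Type and strength can also be checked numerically. The sketch below is an illustration only: it computes the Pearson correlation coefficient, which is not something the lesson asks you to calculate, and the cut-off values used to label a correlation "strong", "weak", or "none" are assumptions chosen for this example, not exam definitions.

```python
import math

# Revision data from the table above.
hours = [2, 3, 4, 5, 5, 6, 7, 8, 9, 10]
scores = [35, 42, 50, 55, 60, 62, 70, 75, 82, 90]


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def describe(r, strong_cutoff=0.7, none_cutoff=0.2):
    """Turn r into a description. Both cut-offs are illustrative assumptions."""
    if abs(r) < none_cutoff:
        return "no correlation"
    strength = "strong" if abs(r) >= strong_cutoff else "weak"
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction} correlation"


r = pearson_r(hours, scores)
print(round(r, 3), describe(r))  # 0.996 strong positive correlation
```

A value of r close to 1 matches what the scatter graph shows: the points lie close to a straight line that slopes upwards.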
A line of best fit is a straight line drawn through the data points on a scatter graph that best represents the overall trend. It does not have to pass through every point.
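The mean point is the point whose coordinates are the mean of the x-values and the mean of the y-values. For the revision data above, the mean number of hours is 59 ÷ 10 = 5.9 and the mean test score is 621 ÷ 10 = 62.1.
The mean point is (5.9, 62.1). Your line of best fit should pass through or very close to this point.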
Exam Tip: In the exam, you may be asked to calculate and plot the mean point. Always mark it clearly with a labelled cross. Then draw your line of best fit so it passes through this point. Use a ruler for a straight line.
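The mean point calculation can be checked with a few lines of code. This is a minimal sketch; the function name `mean_point` is made up for this example.

```python
# Revision data from the table above.
hours = [2, 3, 4, 5, 5, 6, 7, 8, 9, 10]
scores = [35, 42, 50, 55, 60, 62, 70, 75, 82, 90]


def mean_point(xs, ys):
    """Return (mean of x-values, mean of y-values)."""
    return sum(xs) / len(xs), sum(ys) / len(ys)


mean_hours, mean_score = mean_point(hours, scores)
print(mean_hours, mean_score)  # 5.9 62.1
```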
Once you have a line of best fit, you can use it to estimate the value of one variable given the other.
Interpolation means using the line of best fit to estimate a value within the range of the data. This is generally reliable because you are predicting within known data.
Using the revision data: How many marks would a student who revised for 6.5 hours expect to score?
Find 6.5 on the x-axis, draw a vertical line up to the line of best fit, then draw a horizontal line across to the y-axis. Read off the value (approximately 67%).
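The same estimate can be sketched in code. In the exam the line of best fit is drawn by eye, so the example below makes an assumption: it uses a least-squares line, which is one specific way of choosing a line through the mean point. A hand-drawn line will give a slightly different reading, which is why a value of about 66% or 67% is acceptable.

```python
# Revision data from the table above.
hours = [2, 3, 4, 5, 5, 6, 7, 8, 9, 10]
scores = [35, 42, 50, 55, 60, 62, 70, 75, 82, 90]


def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    # Choosing the intercept this way forces the line through the mean point.
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(hours, scores)

# 6.5 hours lies inside the data range (2 to 10 hours), so this is interpolation.
estimate = slope * 6.5 + intercept
print(round(estimate, 1))  # 66.1
```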
Extrapolation means using the line of best fit to estimate a value outside the range of the data. This is generally unreliable because the trend may not continue beyond the data range.
Using the revision data: How many marks would a student who revised for 15 hours expect to score?
This is extrapolation because 15 hours is outside the range of the data (2–10 hours). The estimate would be unreliable because we do not know if the pattern continues — it is unlikely that test scores keep rising indefinitely (a score above 100% is impossible).
```mermaid
graph TD
    A[Using Line of Best Fit] --> B{Is the value within the data range?}
    B -- Yes --> C[Interpolation]
    B -- No --> D[Extrapolation]
    C --> E[Generally reliable]
    D --> F[Generally unreliable]
    F --> G[The trend may not continue beyond the data]
```
Exam Tip: The difference between interpolation and extrapolation is a common exam question. If asked to comment on the reliability of an estimate, check whether it is within the data range. If it is outside the range, say the estimate is "unreliable because it involves extrapolation and the trend may not continue beyond the given data".
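The range check behind that decision can be written as a short sketch. The function names are made up for this example, and the least-squares line is an assumption standing in for a line of best fit drawn by eye.

```python
# Revision data from the table above.
hours = [2, 3, 4, 5, 5, 6, 7, 8, 9, 10]
scores = [35, 42, 50, 55, 60, 62, 70, 75, 82, 90]


def classify_estimate(x, xs):
    """Say whether estimating at x is interpolation or extrapolation."""
    if min(xs) <= x <= max(xs):
        return "interpolation"
    return "extrapolation"


def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    return slope, mean_y - slope * mean_x


slope, intercept = fit_line(hours, scores)

for x in (6.5, 15):
    kind = classify_estimate(x, hours)
    predicted = slope * x + intercept
    print(x, kind, round(predicted, 1))

# The prediction for 15 hours comes out above 100%, which is impossible.
predicted_15 = slope * 15 + intercept
```

The impossible prediction for 15 hours shows why an extrapolated estimate should be described as unreliable.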