This lesson draws together two related ideas: using relative frequency to estimate a probability from data, and working with a probability distribution — a table that lists every outcome alongside its probability. The single unifying principle is that probabilities must add to 1. You will estimate probabilities from experiments, find a missing probability in a distribution, and use distributions to predict outcomes.
This lesson covers AO1 (calculating relative frequencies and missing probabilities), AO2 (judging the reliability of an estimate) and AO3 (setting up and solving for an unknown probability, sometimes algebraically). OCR command words include Work out, Calculate, Estimate, Show that and Find.
| Term | Definition |
|---|---|
| Relative frequency | An estimate of probability from data: $\frac{\text{frequency}}{\text{total trials}}$. |
| Probability distribution | A table listing all outcomes and their probabilities. |
| Sum to 1 | The probabilities in a distribution always total 1. |
| Missing probability | An unknown probability found by subtracting the others from 1. |
| Expected frequency | A predicted count: P×n. |
| Reliable estimate | An estimate based on a large number of trials. |
When outcomes are not equally likely, we estimate a probability using relative frequency:
$$\text{estimated probability} = \frac{\text{frequency of the outcome}}{\text{total number of trials}}$$
This is our best estimate from the data, and it becomes more reliable as the number of trials grows. The first practical step in many questions is to find the total number of trials by adding up all the frequencies in the table, because the total is the denominator for every relative frequency you calculate. Once you have it, each estimate is just that outcome's frequency divided by the total, and as a useful check the relative frequencies of all the outcomes should themselves add up to 1.
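The steps above can be sketched in Python. This is a minimal illustration, not a required method: the helper name `relative_frequencies` is hypothetical, and the red and green counts are made-up data chosen so that the total is 250 trials.

```python
from fractions import Fraction

def relative_frequencies(frequencies):
    """Return each outcome's relative frequency: frequency / total trials."""
    total = sum(frequencies.values())  # the denominator for every estimate
    return {outcome: Fraction(count, total)
            for outcome, count in frequencies.items()}

# Illustrative data only: a spinner spun 250 times in total.
spins = {"blue": 90, "red": 100, "green": 60}
estimates = relative_frequencies(spins)

for colour, p in estimates.items():
    print(colour, p, float(p))

# Check: the relative frequencies of all outcomes add up to 1.
print(sum(estimates.values()) == 1)  # True
```

Exact fractions are used here so that the "adds up to 1" check is not disturbed by decimal rounding.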
Relative frequency and probability distributions are two sides of one coin, which is why they sit together in this lesson. A relative frequency is a probability estimated from data; a probability distribution is a complete list of outcomes with their probabilities. The single rule that ties them together — and the rule you will lean on again and again — is that the probabilities of all the outcomes of an experiment must add up to exactly 1. Whether you are filling in a missing relative frequency or a missing probability in a distribution, you are really just using "everything sums to 1".
A spinner is spun 250 times and lands on blue 90 times. Estimate the probability of blue.
Solution: $\frac{90}{250} = \frac{9}{25} = 0.36$
A six-sided dice is rolled 400 times and shows a 6 on 84 occasions. Estimate P(6) and comment on whether the dice may be biased.
Solution: $\frac{84}{400} = 0.21$. A fair dice would give about $\frac{1}{6} \approx 0.167$, so 0.21 is a little higher. There is some evidence of bias towards 6, but more trials would be needed to be confident.
Common error: Declaring the dice "biased" from a single moderate difference. Use cautious language.
Once you have an estimated probability, you can predict counts in a larger experiment using expected frequency = P × n, where P is the estimated probability and n is the number of trials.
A factory finds that, of 600 items, 24 are faulty. (a) Estimate the probability an item is faulty. (b) Estimate the number of faulty items in a batch of 5000.
Solution: (a) $\frac{24}{600} = \frac{1}{25} = 0.04$. (b) $0.04 \times 5000 = 200$ faulty items.
A bus is recorded as late on 36 of 120 days. (a) Estimate the probability the bus is late. (b) Estimate the number of late days in a term of 75 school days.
Solution: (a) $\frac{36}{120} = \frac{3}{10} = 0.3$. (b) $0.3 \times 75 = 22.5$, so about 22 or 23 days.
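The two expected-frequency examples can be checked with a short Python sketch. The function name `expected_frequency` is a hypothetical helper; the numbers are those of the factory and bus questions.

```python
from fractions import Fraction

def expected_frequency(frequency, trials, n):
    """Estimate P as frequency / trials, then predict the count in n new trials."""
    p = Fraction(frequency, trials)  # relative frequency as an exact fraction
    return p * n                     # expected frequency = P x n

faulty = expected_frequency(24, 600, 5000)  # factory example
late = expected_frequency(36, 120, 75)      # bus example

print(float(faulty))  # 200.0
print(float(late))    # 22.5
```

An expected frequency need not be a whole number, which is why the bus answer is reported as "about 22 or 23 days".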
A probability distribution lists every possible outcome with its probability. Because the outcomes are exhaustive and mutually exclusive, the probabilities add up to 1.
Two conditions make a table a valid probability distribution. First, every probability must lie between 0 and 1, because no individual outcome can be more likely than certain or less likely than impossible. Second, the probabilities must total exactly 1, because the outcomes cover every possibility with no overlap, so collectively they are certain. If a table breaks either condition — a negative entry, an entry above 1, or a total that is not 1 — it is not a valid distribution. Checking these two conditions is often a question in its own right, usually phrased as "Show that this is a probability distribution".
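As a rough sketch, the two conditions can be written as a small Python check. The function name `is_valid_distribution` and the sample tables are illustrative assumptions; probabilities are passed as strings so they convert to exact fractions.

```python
from fractions import Fraction

def is_valid_distribution(probabilities):
    """Check both conditions: each value in [0, 1] and the total exactly 1."""
    values = [Fraction(p) for p in probabilities]
    in_range = all(0 <= p <= 1 for p in values)  # condition 1
    return in_range and sum(values) == 1         # condition 2

print(is_valid_distribution(["0.5", "0.3", "0.2"]))   # True
print(is_valid_distribution(["0.5", "0.3", "0.3"]))   # False: total is 1.1
print(is_valid_distribution(["0.7", "0.4", "-0.1"]))  # False: negative entry
```

The last table totals 1 but still fails, which shows why both conditions need checking.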
A distribution is more than a list: it is a complete summary of how a random outcome behaves, and it lets you answer any probability question about that outcome. You can read a single probability straight off the table; you can find the probability of a group of outcomes (such as "an odd number" or "at least 4") by adding the relevant entries, because those outcomes are mutually exclusive; and you can predict how often an outcome occurs over many trials using the expected-frequency formula. Everything you have learned about mutually exclusive events and expected outcomes comes together when working with a distribution.
A biased spinner has this distribution.
| Score | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| Probability | 0.2 | 0.35 | 0.15 | 0.3 |
Show that this is a valid probability distribution.
Solution: Every probability lies between 0 and 1, and 0.2+0.35+0.15+0.3=1
Both conditions hold, so it is a valid probability distribution.
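The same biased spinner can be used to illustrate reading grouped probabilities and expected frequencies off a distribution. The sketch below is only an illustration: `probability_of` is a hypothetical helper, and the figure of 200 spins is an assumed value, not part of the question.

```python
from fractions import Fraction

# Biased spinner distribution from the example (score -> probability).
spinner = {1: Fraction("0.2"), 2: Fraction("0.35"),
           3: Fraction("0.15"), 4: Fraction("0.3")}

def probability_of(distribution, condition):
    """Add the probabilities of every outcome that satisfies the condition."""
    return sum(p for outcome, p in distribution.items() if condition(outcome))

p_odd = probability_of(spinner, lambda score: score % 2 == 1)
p_at_least_3 = probability_of(spinner, lambda score: score >= 3)
expected_twos = spinner[2] * 200  # 200 spins is an assumed figure

print(float(p_odd))          # 0.35
print(float(p_at_least_3))   # 0.45
print(float(expected_twos))  # 70.0
```

Adding the entries is justified because the scores are mutually exclusive: the spinner cannot land on two scores at once.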
A spinner has this distribution.
| Colour | Red | Blue | Green | Yellow |
|---|---|---|---|---|
| Probability | 0.3 | 0.25 | 0.15 | ? |
Work out the missing probability.
Solution: P(yellow)=1−(0.3+0.25+0.15)=1−0.7=0.3
The probabilities in a distribution sum to 1, so any single missing probability is 1 minus the sum of the rest. When more than one value is unknown, you usually need extra information or algebra.
The "one unknown" case is the most common and the simplest: add up all the probabilities you are given, then subtract that total from 1. The result is the missing probability. It is worth a moment's sense-check that your answer is itself a valid probability between 0 and 1 — if it comes out negative or above 1, one of the given values has been mis-read.
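The "one unknown" method, including the sense-check, might look like this in Python. The function name `missing_probability` is hypothetical, and the values are those of the red, blue and green spinner example.

```python
from fractions import Fraction

def missing_probability(known):
    """Return 1 minus the sum of the known probabilities, with a sense-check."""
    missing = 1 - sum(Fraction(p) for p in known)
    if not 0 <= missing <= 1:
        # A result outside [0, 1] suggests a given value was mis-read.
        raise ValueError("not a valid probability; re-check the given values")
    return missing

p_yellow = missing_probability(["0.3", "0.25", "0.15"])
print(float(p_yellow))  # 0.3
```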
The "two or more unknowns" case calls for algebra, and OCR sets these regularly. If two outcomes are given as expressions such as x and 2x, or 3k and 2k, you cannot find them by a single subtraction; instead you form the equation "(sum of all the probabilities) =1", collect the like terms, and solve. This turns a probability question into a routine linear equation, and writing down that equation explicitly is what earns the method mark. Once you have the value of the unknown letter, substitute back to state each individual probability.
A biased dice has P(1)=0.1, P(2)=0.15, P(3)=0.2, P(4)=0.2, P(5)=0.15. Work out P(6).
Solution: P(6)=1−(0.1+0.15+0.2+0.2+0.15)=1−0.8=0.2
A four-sided spinner has P(1)=x, P(2)=x, P(3)=0.3 and P(4)=0.1. Work out the value of x.
Solution: x+x+0.3+0.1=1⟹2x+0.4=1⟹2x=0.6⟹x=0.3
A spinner has P(red)=3k, P(blue)=2k, P(green)=k and P(white)=0.1. Show that k=0.15.
Solution: 3k+2k+k+0.1=1⟹6k=0.9⟹k=0.15
This shows k=0.15 as required, so P(red)=0.45, P(blue)=0.3, P(green)=0.15.
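When every unknown is a multiple of one letter, the equation "(sum of all the probabilities) = 1" collapses to a single linear equation, which the following Python sketch solves. The helper `solve_unknown` is a hypothetical name, and the approach assumes all unknown terms share the same letter.

```python
from fractions import Fraction

def solve_unknown(coefficients, constants):
    """Solve (sum of coefficients) * k + (sum of constants) = 1 for k."""
    total_coefficient = sum(Fraction(c) for c in coefficients)
    total_constant = sum(Fraction(c) for c in constants)
    return (1 - total_constant) / total_coefficient

k = solve_unknown([3, 2, 1], ["0.1"])      # 3k + 2k + k + 0.1 = 1
x = solve_unknown([1, 1], ["0.3", "0.1"])  # x + x + 0.3 + 0.1 = 1

print(float(k))  # 0.15
print(float(x))  # 0.3

# Substitute back and confirm the four probabilities total 1.
probabilities = [3 * k, 2 * k, k, Fraction("0.1")]
print(sum(probabilities) == 1)  # True
```

In an exam the written equation is still what earns the method mark; the code only confirms the arithmetic.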