Conditional Probability

This lesson covers conditional probability at A-Level, extending the probability concepts from earlier lessons. Conditional probability is the probability of an event occurring given that another event has already occurred. It is essential for solving problems involving dependent events and for understanding Bayes' theorem.

Definition of Conditional Probability

The conditional probability of event $A$ given event $B$ has occurred is:

$P(A | B) = \frac{P(A \cap B)}{P(B)} \quad \text{provided } P(B) > 0$

This can be rearranged to give the multiplication rule:

$P(A \cap B) = P(A | B) \times P(B)$

Example: In a group of 100 students, 60 study maths and 30 study both maths and physics. What is the probability a student studies physics given they study maths?

$P(\text{Physics} | \text{Maths}) = \frac{P(\text{Maths} \cap \text{Physics})}{P(\text{Maths})} = \frac{30/100}{60/100} = \frac{30}{60} = 0.5$
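The same calculation can be sketched in Python. This is only an illustration: the helper name `conditional` is an assumption made here, not part of the lesson, and exact fractions are used so the result matches the hand working.

```python
from fractions import Fraction

def conditional(p_a_and_b, p_b):
    """Return P(A | B) = P(A and B) / P(B); undefined when P(B) = 0."""
    if p_b == 0:
        raise ValueError("P(B) must be greater than 0")
    return p_a_and_b / p_b

total = 100                      # students in the group
p_maths = Fraction(60, total)    # P(Maths)
p_both = Fraction(30, total)     # P(Maths and Physics)

p_physics_given_maths = conditional(p_both, p_maths)
print(p_physics_given_maths)     # 1/2
```

The guard on `p_b` mirrors the condition $P(B) > 0$ in the definition.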

Exam Tip: The notation $P(A|B)$ reads as "the probability of $A$ given $B$". The event after the vertical bar is the condition: it is what you know has happened. Always identify clearly which event is the condition.

Two-Way Tables and Conditional Probability

Two-way tables (contingency tables) are useful for organising data and finding conditional probabilities.

Example:

	Pass	Fail	Total
Male	40	10	50
Female	35	15	50
Total	75	25	100

$P(\text{Pass} | \text{Female}) = \frac{35}{50} = 0.7$

$P(\text{Male} | \text{Fail}) = \frac{10}{25} = 0.4$
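Reading a conditional probability from a two-way table amounts to restricting attention to one row or one column. A short Python sketch of that idea, using the counts from the table above (the function names are illustrative assumptions):

```python
from fractions import Fraction

# Counts from the two-way table above: table[row][column]
table = {
    "Male": {"Pass": 40, "Fail": 10},
    "Female": {"Pass": 35, "Fail": 15},
}

def p_col_given_row(table, col, row):
    """P(col | row): restrict to one row and divide by the row total."""
    return Fraction(table[row][col], sum(table[row].values()))

def p_row_given_col(table, row, col):
    """P(row | col): restrict to one column and divide by the column total."""
    col_total = sum(cells[col] for cells in table.values())
    return Fraction(table[row][col], col_total)

p_pass_given_female = p_col_given_row(table, "Pass", "Female")  # 35/50
p_male_given_fail = p_row_given_col(table, "Male", "Fail")      # 10/25
print(float(p_pass_given_female), float(p_male_given_fail))     # 0.7 0.4
```

In both cases the denominator is the total of the conditioning row or column, not the grand total of 100.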

Tree Diagrams and Conditional Probability

Tree diagrams naturally incorporate conditional probabilities. The second set of branches shows $P(\text{second event} | \text{first event})$ .

Reading probabilities from a tree:

Along a branch: multiply probabilities (gives $P(A \cap B)$ ).
Across branches: add probabilities (gives $P(A)$ via the total probability rule).
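The two rules above can be sketched in Python. The tree here is a hypothetical one chosen purely for illustration (a bag of 3 red and 2 blue counters, two drawn without replacement); it does not come from the lesson.

```python
from fractions import Fraction

# Hypothetical tree: 3 red (R) and 2 blue (B) counters, two drawn without replacement.
p_first = {"R": Fraction(3, 5), "B": Fraction(2, 5)}

# Second-stage branches are conditional on the outcome of the first draw.
p_second_given_first = {
    "R": {"R": Fraction(2, 4), "B": Fraction(2, 4)},
    "B": {"R": Fraction(3, 4), "B": Fraction(1, 4)},
}

# Along a branch: multiply, giving P(first and second) for each path.
paths = {
    (first, second): p_first[first] * p_second_given_first[first][second]
    for first in p_first
    for second in p_second_given_first[first]
}

# Across branches: add the paths that end in the required outcome.
p_second_red = sum(p for (first, second), p in paths.items() if second == "R")
print(p_second_red)  # 3/5
```

A useful check on any tree is that the probabilities of all complete paths add up to 1.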

Independence Revisited

Events $A$ and $B$, with $P(B) > 0$, are independent if and only if:

$P(A | B) = P(A)$

Equivalently: $P(A \cap B) = P(A) \times P(B)$ .

If $P(A | B) \neq P(A)$ , the events are dependent.
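The product test for independence can be checked by direct enumeration. The sketch below assumes a hypothetical experiment (one roll of a fair die) and illustrative event names; exact fractions avoid floating-point comparisons.

```python
from fractions import Fraction

outcomes = {1, 2, 3, 4, 5, 6}  # hypothetical experiment: one roll of a fair die

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event & outcomes), len(outcomes))

def are_independent(a, b):
    """Check P(A and B) = P(A) * P(B) exactly."""
    return prob(a & b) == prob(a) * prob(b)

even = {2, 4, 6}
at_most_4 = {1, 2, 3, 4}
at_least_4 = {4, 5, 6}

print(are_independent(even, at_most_4))   # True:  1/3 equals 1/2 * 2/3
print(are_independent(even, at_least_4))  # False: 1/3 differs from 1/2 * 1/2
```

The second pair is dependent: knowing the roll is at least 4 raises the probability of an even number from 1/2 to 2/3.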

The Law of Total Probability

If events $B_1, B_2, \ldots, B_n$ are mutually exclusive and exhaustive (they cover the entire sample space), then:

$P(A) = \sum_{i=1}^{n} P(A | B_i) \times P(B_i)$

Example: A factory has three machines producing items. Machine 1 produces 50% of items with a 2% defect rate, Machine 2 produces 30% with a 3% defect rate, and Machine 3 produces 20% with a 5% defect rate. The probability that a randomly chosen item is defective is:

$P(\text{Defective}) = 0.5 \times 0.02 + 0.3 \times 0.03 + 0.2 \times 0.05 = 0.01 + 0.009 + 0.01 = 0.029$
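As a rough sketch, the factory calculation is a weighted sum over the partition of machines. The dictionary layout below is an assumption made for illustration.

```python
from fractions import Fraction

# (share of production, defect rate) for each machine, from the example above
machines = {
    "Machine 1": (Fraction(50, 100), Fraction(2, 100)),
    "Machine 2": (Fraction(30, 100), Fraction(3, 100)),
    "Machine 3": (Fraction(20, 100), Fraction(5, 100)),
}

# The partition must be exhaustive: the production shares add up to 1.
assert sum(share for share, _ in machines.values()) == 1

# Law of total probability: sum of P(Defective | machine) * P(machine).
p_defective = sum(share * rate for share, rate in machines.values())
print(float(p_defective))  # 0.029
```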

Bayes' Theorem (Extension)

Bayes' theorem allows us to "reverse" conditional probabilities:

$P(B | A) = \frac{P(A | B) \times P(B)}{P(A)}$

Using the factory example above:

$P(\text{Machine 3} | \text{Defective}) = \frac{P(\text{Defective} | \text{Machine 3}) \times P(\text{Machine 3})}{P(\text{Defective})} = \frac{0.05 \times 0.2}{0.029} = \frac{0.01}{0.029} \approx 0.345$
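The same reversal can be applied to every machine at once. The sketch below is one possible way to organise the working, with illustrative variable names; it divides each joint probability by the total $P(\text{Defective})$.

```python
from fractions import Fraction

# (share of production, defect rate) for each machine, from the factory example
machines = {
    "Machine 1": (Fraction(50, 100), Fraction(2, 100)),
    "Machine 2": (Fraction(30, 100), Fraction(3, 100)),
    "Machine 3": (Fraction(20, 100), Fraction(5, 100)),
}

# Numerators of Bayes' theorem: P(Defective | machine) * P(machine)
joint = {name: share * rate for name, (share, rate) in machines.items()}

# Denominator: P(Defective), by the law of total probability
p_defective = sum(joint.values())

# P(machine | Defective) for each machine
posterior = {name: p / p_defective for name, p in joint.items()}
for name, p in posterior.items():
    print(name, p, round(float(p), 3))
```

Machine 1 and Machine 3 each account for 10/29 of defective items, even though Machine 3 makes far fewer items overall, because its higher defect rate offsets its smaller share.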

Summary

$P(A | B) = \frac{P(A \cap B)}{P(B)}$: the probability of $A$ given $B$.
The multiplication rule: $P(A \cap B) = P(A | B) \times P(B)$.
Events are independent if $P(A | B) = P(A)$.
Use two-way tables and tree diagrams to organise conditional probability calculations.
The law of total probability combines conditional probabilities across all possible conditions.
Bayes' theorem reverses conditional probabilities.

Exam Tip: When solving conditional probability problems, always write down $P(A \cap B)$ and $P(B)$ separately before dividing. This makes your working clear and ensures you gain method marks.

A-Level Deep Dive: Conditional Probability

Spec mapping

In the AQA 7357 specification (Paper 3, Statistics; Section Q, Probability), the Year 2 content covers: conditional probability, including the use of tree diagrams, Venn diagrams and two-way tables; understanding and using the conditional probability formula $P(A|B) = P(A \cap B)/P(B)$; and modelling with probability, including critiquing assumptions made and the likely effect of more realistic assumptions (refer to the official specification document for exact wording). This sub-strand sits at the heart of Year 2 statistics and pulls together the Year 1 probability axioms, the multiplication rule, and the language of independence.

Conditional probability is a high-stakes Paper 3 topic because it is also a synoptic gateway: examiners use it to test whether candidates can interpret a real-world claim about evidence (a medical test, a screening procedure, a quality-control result) and reason carefully about what the conditioning event actually is. The AQA formula booklet does not list the conditional probability formula in a directly usable form, so $P(A|B) = P(A \cap B)/P(B)$ must be memorised, along with its rearrangement (the multiplication rule $P(A \cap B) = P(A|B)P(B)$) and the form of Bayes' theorem.

Worked example with full mark scheme

Question (8 marks):

A screening test for a rare disease has the following properties. The disease has prevalence $0.5\%$ in the screened population (i.e. $P(D) = 0.005$ ). The test has sensitivity $0.98$ (i.e. $P(\text{positive}|D) = 0.98$ ) and specificity $0.95$ (i.e. $P(\text{negative}|D') = 0.95$ ).

(a) A randomly selected person tests positive. Find the probability that they actually have the disease, giving your answer to 3 significant figures. (6)

(b) Comment on the appropriateness of using this screening test as the sole basis for a clinical diagnosis. (2)

Solution with mark scheme:

(a) Step 1 — identify the events and required quantities.

Let $D$ = "person has the disease" and $T$ = "person tests positive". We are given $P(D) = 0.005$ , $P(T|D) = 0.98$ , $P(T'|D') = 0.95$ . We require $P(D|T)$ .

M1 — correct identification of the conditioning event (we want $P(D|T)$ , not $P(T|D)$ ). This single mark is the gateway to the entire question. Candidates who confuse $P(A|B)$ with $P(B|A)$ — the "prosecutor's fallacy" — will compute the wrong quantity throughout and lose every subsequent mark.

Step 2 — derive complementary probabilities.

$P(D') = 1 - 0.005 = 0.995$ and $P(T|D') = 1 - 0.95 = 0.05$ (the false-positive rate).

B1 — both complements correct.

Step 3 — apply the total probability theorem to find $P(T)$ .

$P(T) = P(T|D)P(D) + P(T|D')P(D') = 0.98 \times 0.005 + 0.05 \times 0.995 = 0.00490 + 0.04975 = 0.05465$

M1 — correct partition of $P(T)$ across the disease status. A1 — accurate value $0.05465$ (or $0.0547$ to 3 s.f.).

Step 4 — apply Bayes' theorem.

$P(D|T) = \dfrac{P(T|D)P(D)}{P(T)} = \dfrac{0.98 \times 0.005}{0.05465} = \dfrac{0.00490}{0.05465}$

M1 — correct application of Bayes' theorem with the right numerator and denominator.

A1 — final answer $P(D|T) \approx 0.0897$ to 3 s.f.
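Steps 2 to 4 can be checked with a few lines of Python. This is a sketch of the arithmetic only, with illustrative variable names; exact fractions are used so that no rounding happens before the final step.

```python
from fractions import Fraction

p_d = Fraction(5, 1000)           # prevalence, P(D)
sensitivity = Fraction(98, 100)   # P(T | D)
specificity = Fraction(95, 100)   # P(T' | D')

# Step 2: complements
p_not_d = 1 - p_d
p_t_given_not_d = 1 - specificity         # false-positive rate

# Step 3: total probability of a positive test
p_t = sensitivity * p_d + p_t_given_not_d * p_not_d

# Step 4: Bayes' theorem
p_d_given_t = sensitivity * p_d / p_t

print(float(p_t))                     # 0.05465
print(round(float(p_d_given_t), 4))   # 0.0897
```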

(b) B1 — observation that despite the positive test result, the probability of actually having the disease is only about $8.97\%$ , because the disease is rare and false positives outnumber true positives by roughly ten to one in absolute terms.

B1 — appropriate critique: a positive result on this screening test should not be the sole basis for a diagnosis; a confirmatory follow-up test is needed. Equivalent points (e.g. discussion of the dependence of $P(D|T)$ on the prior $P(D)$ , or that screening is most effective in higher-prevalence sub-populations) earn the mark.

Total: 8 marks (M3 A2 B3, split as shown).
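The dependence of $P(D|T)$ on the prior $P(D)$, mentioned as an equivalent point in part (b), can be explored numerically. In the sketch below, only the prevalence 0.005 comes from the question; the other prevalences and the function name are hypothetical values chosen for illustration.

```python
def positive_predictive_value(prevalence, sensitivity=0.98, specificity=0.95):
    """P(D | T) from Bayes' theorem for a given prevalence P(D)."""
    true_pos = sensitivity * prevalence                # P(T and D)
    false_pos = (1 - specificity) * (1 - prevalence)   # P(T and D')
    return true_pos / (true_pos + false_pos)

# 0.005 is the prevalence in the question; 0.05 and 0.2 are hypothetical.
for prevalence in (0.005, 0.05, 0.2):
    ppv = positive_predictive_value(prevalence)
    print(f"P(D) = {prevalence:.3f}  ->  P(D|T) = {ppv:.3f}")
```

With the test's sensitivity and specificity held fixed, a higher prevalence gives a higher probability that a positive result is a true positive, which is the reasoning behind screening higher-prevalence sub-populations.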

Specimen question modelled on the AQA 7357 Paper 3 format

Question (6 marks): Events $A$ and $B$ satisfy $P(A) = 0.4$ , $P(B) = 0.5$ and $P(A \cup B) = 0.7$ .

(a) Find $P(A \cap B)$ . (2)

(b) Find $P(A|B)$. (2)

(c) Determine, with justification, whether $A$ and $B$ are independent. (2)

Mark scheme decomposition by AO:

(a)

M1 (AO1.1a) — applying the addition rule $P(A \cup B) = P(A) + P(B) - P(A \cap B)$ , so $0.7 = 0.4 + 0.5 - P(A \cap B)$ .
A1 (AO1.1b) — $P(A \cap B) = 0.2$ .

(b)

M1 (AO1.1b) — applying $P(A|B) = P(A \cap B)/P(B) = 0.2/0.5$ .
A1 (AO1.1b) — $P(A|B) = 0.4$ .

(c)

M1 (AO2.1) — comparing $P(A|B)$ with $P(A)$ , or equivalently checking whether $P(A \cap B) = P(A)P(B)$ .
A1 (AO2.4) — concluding that since $P(A|B) = 0.4 = P(A)$ , the events are independent, with the justification phrased in terms of "knowing $B$ has occurred does not change the probability of $A$ ".

Total: 6 marks split AO1 = 4, AO2 = 2. Conditional probability questions on AQA 7357 use AO2 marks specifically to test whether candidates can interpret a numerical equality as a statement about independence — the reasoning step matters as much as the arithmetic.
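The three parts of the specimen question follow one another mechanically, which a short Python check makes visible. This is a sketch for verifying the arithmetic, not a substitute for the written justification that earns the AO2 marks.

```python
from fractions import Fraction

p_a, p_b, p_a_or_b = Fraction(2, 5), Fraction(1, 2), Fraction(7, 10)

# (a) Addition rule rearranged: P(A and B) = P(A) + P(B) - P(A or B)
p_a_and_b = p_a + p_b - p_a_or_b

# (b) Conditional probability formula
p_a_given_b = p_a_and_b / p_b

# (c) Independence: compare P(A | B) with P(A)
independent = p_a_given_b == p_a

print(p_a_and_b, p_a_given_b, independent)  # 1/5 2/5 True
```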

Synoptic links

Connects to:

Pure Mathematics 1 — Probability foundations: the axioms $0 \le P(A) \le 1$ , $P(\Omega) = 1$ , and the addition rule for mutually exclusive events all underpin conditional probability. The conditional measure $P(\cdot | B)$ is itself a probability measure on the restricted sample space $B$ , satisfying the same axioms — a structural insight that becomes formal in undergraduate measure theory.
Statistics — Binomial distribution: the binomial assumes independence between trials. Conditional probability is the language used to check this assumption: trials are independent iff $P(\text{success on trial } n | \text{outcome of earlier trials}) = P(\text{success on trial } n)$ . When sampling without replacement, this fails — and the binomial model is no longer appropriate (the hypergeometric distribution applies instead).
Statistics — Normal distribution: conditioning a bivariate normal on one component yields another normal distribution, with mean and variance modified by the correlation. While the explicit formula is not on the A-Level syllabus, the idea that conditioning narrows uncertainty appears informally whenever students reason about "given that the height is above 180 cm, what is the conditional distribution of weight?".
Statistics — Hypothesis testing: the $p$ -value is itself a conditional probability — $P(\text{observed data or more extreme} | H_0 \text{ true})$ . Confusing this with $P(H_0 \text{ true} | \text{data})$ is the most common interpretive error in introductory inference, and is exactly the same fallacy as confusing $P(D|T)$ with $P(T|D)$ in the worked example above.
Mechanics — Modelling: assumptions about independence (e.g. that two collisions are independent events) are critiqued in the same way conditional probability questions critique screening assumptions. The methodological habit of asking "is the independence assumption realistic?" transfers across modules.
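The binomial link above can be illustrated with a small numerical check. The box of items below is hypothetical (10 items, 3 defective, two sampled); the point is only that removing an item changes the conditional probability for the next draw.

```python
from fractions import Fraction

# Hypothetical box: 10 items, 3 of them defective; two items are sampled.
total, defective = 10, 3
p_first_def = Fraction(defective, total)

# Without replacement: the second-draw probability depends on the first draw.
p_second_given_first_def = Fraction(defective - 1, total - 1)   # 2/9
p_second_given_first_ok = Fraction(defective, total - 1)        # 3/9

# Unconditional P(second defective), by the law of total probability.
p_second_def = (p_second_given_first_def * p_first_def
                + p_second_given_first_ok * (1 - p_first_def))

independent_without_replacement = p_second_given_first_def == p_second_def

# With replacement the box is restored, so the conditional is unchanged.
p_second_given_first_def_wr = Fraction(defective, total)
independent_with_replacement = p_second_given_first_def_wr == p_first_def

print(p_second_given_first_def, p_second_def)  # 2/9 3/10
print(independent_without_replacement, independent_with_replacement)  # False True
```

Since 2/9 differs from 3/10, the trials are dependent when sampling without replacement, so the binomial model's independence assumption does not hold.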

Mark-scheme literacy

Conditional probability questions on AQA 7357 Paper 3 split AO marks across all three categories more evenly than most Pure topics:

AO	Typical share	Earned by
AO1 (knowledge / procedure)	40–55%	Applying the conditional probability formula, drawing tree diagrams, computing intersections, using the multiplication rule
AO2 (reasoning / interpretation)	25–40%	Identifying which conditional is required, justifying independence claims, interpreting the meaning of a numerical answer in context
AO3 (problem-solving / modelling)	15–25%	Critiquing modelling assumptions, recognising when sample-space restriction is appropriate, evaluating real-world implications

Examiner-rewarded phrasing: "since the events are independent, $P(A \cap B) = P(A) \times P(B)$"; "conditioning on $B$ restricts the sample space to outcomes where $B$ occurs"; "by the total probability theorem, partitioning over the disease-status events". Phrases that lose marks: "the probability of having the disease given a positive test is $0.98$" (this confuses $P(D|T)$ with $P(T|D)$); "since $A$ and $B$ are mutually exclusive, they are independent" (mutually exclusive events with non-zero probabilities are never independent, because $P(A \cap B) = 0$ while $P(A)P(B) > 0$).

A specific AQA pattern to watch: questions that introduce a tree diagram and then ask for a probability conditional on a "downstream" event, i.e. you are asked for $P(\text{first branch} | \text{second branch outcome})$. These require Bayes-style reasoning even when the word "Bayes" never appears. Whenever a "given that" clause refers to a later stage of the tree, treat it as an instruction to reverse the natural direction of the tree.
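A sketch of this "downstream" pattern in Python, using a hypothetical tree (3 red and 2 blue counters, two drawn without replacement) that is not taken from any AQA paper:

```python
from fractions import Fraction

# Hypothetical two-stage tree: 3 red and 2 blue counters, two drawn without replacement.
p_first_red = Fraction(3, 5)
p_second_red_given_first_red = Fraction(2, 4)
p_second_red_given_first_blue = Fraction(3, 4)

# Joint probabilities along the two paths that end in "second is red".
p_red_red = p_first_red * p_second_red_given_first_red
p_blue_red = (1 - p_first_red) * p_second_red_given_first_blue

# Reverse the tree: condition the first-stage event on the second-stage outcome.
p_first_red_given_second_red = p_red_red / (p_red_red + p_blue_red)
print(p_first_red_given_second_red)  # 1/2
```

The denominator is the sum of every path ending in the conditioning outcome, and the numerator is the single path that also passes through the first-stage event of interest.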

Grade-band model answers

3-mark question

Question: Events $A$ and $B$ satisfy $P(A) = 0.3$ , $P(B) = 0.6$ and $P(A \cap B) = 0.18$ . Find $P(A|B)$ and state whether $A$ and $B$ are independent.

Grade C response (~180 words):

Using the formula:

$P(A|B) = \dfrac{P(A \cap B)}{P(B)} = \dfrac{0.18}{0.6} = 0.3$ .

Since $P(A|B) = P(A) = 0.3$ , the events are independent.

Examiner commentary: Full marks (3/3). The candidate quotes the formula correctly, computes the conditional probability accurately, and identifies the independence condition by comparing $P(A|B)$ with $P(A)$ . The answer is brief but every step is verifiable. This is the standard Grade C response for a procedural conditional probability question — efficient and correct, with the independence justification stated clearly. Some candidates lose a mark by computing $P(A|B)$ correctly but failing to justify the independence claim (just writing "yes" without comparing to $P(A)$ ).

Grade A response (~220 words):

Applying the conditional probability formula:

$P(A|B) = \dfrac{P(A \cap B)}{P(B)} = \dfrac{0.18}{0.6} = 0.3$

To check independence, compare $P(A|B)$ with $P(A)$ : we have $P(A|B) = 0.3 = P(A)$ .

Equivalently, $P(A) \times P(B) = 0.3 \times 0.6 = 0.18 = P(A \cap B)$ , confirming the multiplicative independence condition.

Therefore $A$ and $B$ are independent: knowing that $B$ has occurred does not change the probability of $A$ .
