Approximations: Binomial, Poisson, and Normal

In Further Statistics you must know when one distribution can stand in for another, how to set up the approximating distribution, and why the conditions matter. These approximations turn intractable binomial sums into one-line calculations. This lesson covers the three examined approximations — Poisson-to-binomial, normal-to-binomial and normal-to-Poisson — together with the all-important continuity correction.

1. Where this sits in AQA 7367

This is Paper 3 Statistics option (7367/3S) content (AO weighting AO1 40% / AO2 25% / AO3 35%). Selecting the correct approximation and justifying it against the conditions is squarely AO2/AO3; the calculation itself is AO1. The prerequisites are the binomial distribution $B(n,p)$ and the normal distribution $N(\mu,\sigma^2)$ with standardisation $Z = \frac{X-\mu}{\sigma}$ , both from A-Level Mathematics, plus the Poisson distribution from the previous lesson.

2. Core theory

(a) Poisson approximation to the binomial

If $X \sim B(n,p)$ with $n$ large and $p$ small (a common rule of thumb is $n > 50,\ np < 5$ ), then

$X \approx \text{Po}(\lambda), \qquad \lambda = np.$

	Binomial	Poisson approximation
Mean	$np$	$\lambda = np$
Variance	$np(1-p)$	$\lambda = np$

The means match exactly. The variances match because, when $p$ is small, $1-p \approx 1$ , so $np(1-p) \approx np$ . The approximation is exact in the limit $n\to\infty,\ p\to 0$ with $np = \lambda$ fixed (proved in the Poisson lesson). No continuity correction is needed — both distributions are discrete on $\{0,1,2,\ldots\}$ .

(b) Normal approximation to the binomial

If $X \sim B(n,p)$ with $np > 5$ and $n(1-p) > 5$ (so $p$ is not extreme and $n$ is large), then

$X \approx N\big(np,\ np(1-p)\big).$

Here a continuity correction is essential because a discrete variable is being replaced by a continuous one.

(c) Normal approximation to the Poisson

If $X \sim \text{Po}(\lambda)$ with $\lambda$ large (rule of thumb $\lambda > 15$ ), then

$X \approx N(\lambda,\ \lambda),$

again with a continuity correction. (This follows by chaining: a large- $\lambda$ Poisson is a limit of binomials, which tend to normal.)

The continuity correction

A discrete integer $k$ "occupies" the interval $(k-0.5,\ k+0.5)$ under the continuous approximation. Translate the inequality accordingly:

Binomial / Poisson probability	Normal approximation
$P(X = k)$	$P(k-0.5 < Y < k+0.5)$
$P(X \leq k)$	$P(Y < k+0.5)$
$P(X < k)$	$P(Y < k-0.5)$
$P(X \geq k)$	$P(Y > k-0.5)$
$P(X > k)$	$P(Y > k+0.5)$

The reliable mental rule: expand the region to include the half-integers belonging to the integers you want. " $\geq 25$ " keeps 25, so go down to $24.5$ ; " $> 25$ " excludes 25, so go up to $25.5$ .
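The table above can be expressed as a small helper. This is a minimal sketch in Python using only the standard library; the function name `corrected_probability` and the illustrative values (mean 20, variance 16) are assumptions made for the example, not part of the lesson.

```python
from statistics import NormalDist


def corrected_probability(relation: str, k: int, mean: float, variance: float) -> float:
    """Normal approximation to P(X relation k) for an integer-valued X,
    with the continuity correction applied as in the table."""
    y = NormalDist(mu=mean, sigma=variance ** 0.5)  # sigma is the standard deviation
    if relation == "==":
        return y.cdf(k + 0.5) - y.cdf(k - 0.5)
    if relation == "<=":
        return y.cdf(k + 0.5)
    if relation == "<":
        return y.cdf(k - 0.5)
    if relation == ">=":
        return 1 - y.cdf(k - 0.5)
    if relation == ">":
        return 1 - y.cdf(k + 0.5)
    raise ValueError(f"unsupported relation: {relation!r}")


# Illustrative values only: Y ~ N(20, 16)
print(corrected_probability(">=", 25, 20, 16))  # boundary at 24.5, keeps 25
print(corrected_probability(">", 25, 20, 16))   # boundary at 25.5, excludes 25
```

The first printed value should be the larger of the two, since " $\geq 25$ " includes the block of probability belonging to 25 and " $> 25$ " does not.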

Why these conditions, and how good is the approximation?

It is worth understanding why each rule of thumb exists, because the exam can ask you to justify a choice or to comment on reliability.

The Poisson approximation to the binomial needs $p$ small for a structural reason: the Poisson has equal mean and variance, whereas the binomial has variance $np(1-p)$. These agree only when $1-p \approx 1$, i.e. $p$ near 0. If $p$ were, say, $0.4$, the binomial variance $np(0.6)$ would be far below its mean $np$, and forcing a Poisson (mean = variance) on it would badly misstate the spread. The large-$n$ requirement ensures there are "many opportunities" for the rare event: a binomial count can never exceed $n$, whereas a Poisson count has no upper limit, so $n$ must be large enough that the probability the Poisson places above $n$ is negligible.

The normal approximation needs both $np > 5$ and $n(1-p) > 5$ so that the binomial is not too skewed. A binomial with small $np$ piles probability against the lower boundary at 0 (a long right tail); a binomial with small $n(1-p)$ piles against the upper boundary at $n$ . The normal curve is symmetric, so it only fits well when the binomial is roughly symmetric — which happens when both tails have room, i.e. both $np$ and $n(1-p)$ are comfortably above a small threshold. The approximation is best near $p = 0.5$ (perfect symmetry) and converges fastest there.

How accurate are these approximations? Consider $X \sim B(50, 0.1)$ , where $np = 5$ sits on the Poisson boundary. The exact value $P(X = 5) = \binom{50}{5}(0.1)^5(0.9)^{45}$ is approximately $0.1849$ ; the Poisson approximation $\text{Po}(5)$ gives $\frac{e^{-5}5^5}{5!} = 0.1755$ — within about 5%, acceptable for many purposes but visibly imperfect because $p = 0.1$ is not that small. By contrast $X \sim B(1000, 0.003)$ (with $np = 3$ , $p$ tiny) matches $\text{Po}(3)$ to three decimal places at every value. The lesson is general: an approximation is most trustworthy near the centre of the distribution and when its conditions are comfortably (not marginally) satisfied, and least trustworthy in the far tails. If a question asks you to estimate a very small tail probability, flag that the approximation may be unreliable there — a perceptive AO2/AO3 remark.
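The $B(50, 0.1)$ comparison can be checked numerically. The sketch below uses only Python's standard library; the helper names `binomial_pmf` and `poisson_pmf` are illustrative choices.

```python
from math import comb, exp, factorial


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Exact P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Po(lam)."""
    return exp(-lam) * lam**k / factorial(k)


exact = binomial_pmf(5, 50, 0.1)
approx = poisson_pmf(5, 50 * 0.1)  # lambda = np = 5
relative_error = abs(approx - exact) / exact

print(f"exact = {exact:.4f}, Poisson = {approx:.4f}, relative error = {relative_error:.1%}")
```

Running it should reproduce the two figures quoted above and a relative error of roughly 5%. Changing the arguments to `(1000, 0.003)` gives a quick way to see how the gap shrinks when $p$ is much smaller.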

A decision walkthrough across one scenario

To see how the choice of approximation depends entirely on the numbers, consider the same underlying situation examined at three different scales. A manufacturing line produces items with a defect probability $p$ , and a sample of $n$ is inspected; let $X$ be the number of defectives.

Scale 1: $n = 500,\ p = 0.004$ . Here $n$ is large and $p$ is tiny, with $np = 2$ moderate. The conditions for the Poisson approximation are comfortably met, so $X \approx \text{Po}(2)$ . A normal approximation would be poor because $np = 2 < 5$ : the distribution is strongly right-skewed (it cannot go below 0 but has a long upper tail), and the symmetric normal would misfit. Choose Poisson, $\lambda = 2$ , no continuity correction.
Scale 2: $n = 500,\ p = 0.5$ . Now $p$ is far from 0, so Poisson is ruled out (its mean-equals-variance property would clash badly with $np(1-p) = 125$ versus $np = 250$ ). But $np = 250 > 5$ and $n(1-p) = 250 > 5$ , so the normal approximation applies: $X \approx N(250, 125)$ . Choose normal, with continuity correction.
Scale 3: $n = 12,\ p = 0.3$ . Here $n$ is small and $np = 3.6 < 5$ , while $p = 0.3$ is not small either. Neither set of conditions is satisfied, so the honest answer is to use the exact binomial — for instance $P(X \leq 2) = \sum_{r=0}^{2}\binom{12}{r}(0.3)^r(0.7)^{12-r}$ , computed directly. Forcing an approximation here would be wrong, and saying so earns credit.

The single most useful exam discipline is therefore to write down the numbers — $np$ , $n(1-p)$ , the size of $p$ — before committing to a method, and to be willing to conclude "no approximation is appropriate; use the exact distribution" when that is the truth. Examiners deliberately set borderline cases to test exactly this judgement.
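The decision process in the three scales can be sketched as a short function. This is only an illustration of the lesson's rules of thumb ( $n > 50,\ np < 5$ for Poisson; $np > 5$ and $n(1-p) > 5$ for normal); the name `choose_method` and the order in which the checks are made are assumptions, and borderline cases still need the written judgement described above.

```python
def choose_method(n: int, p: float) -> str:
    """Suggest a method for X ~ B(n, p) using the lesson's rules of thumb."""
    mean = n * p
    failures = n * (1 - p)
    if n > 50 and mean < 5:
        # Large n, small np: rare-event regime
        return f"Poisson, lambda = {mean:g}, no continuity correction"
    if mean > 5 and failures > 5:
        # Both tails have room, so the binomial is roughly symmetric
        return f"Normal N({mean:g}, {mean * (1 - p):g}), with continuity correction"
    return "exact binomial"


# The three scales from the walkthrough
for n, p in [(500, 0.004), (500, 0.5), (12, 0.3)]:
    print(f"n = {n}, p = {p}: {choose_method(n, p)}")
```

Writing the checks out like this mirrors the exam discipline: compute $np$ and $n(1-p)$ first, then commit to a method.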

A worked normal approximation with interpretation

Putting the method together on a full problem: a fair die is rolled 180 times; estimate the probability of obtaining a six on at least 35 occasions. The number of sixes is $X \sim B(180, \tfrac16)$ , with $np = 30$ and $n(1-p) = 150$ , both far above 5, so a normal approximation is well justified: $Y \sim N(30,\ 25)$ , since $np(1-p) = 180\cdot\tfrac16\cdot\tfrac56 = 25$ and hence $\sigma = 5$ . The event "at least 35" includes 35, so the continuity correction takes the boundary down to $34.5$ :

$P(X \geq 35) \approx P(Y > 34.5) = P\!\left(Z > \frac{34.5 - 30}{5}\right) = P(Z > 0.9) = 1 - \Phi(0.9) = 1 - 0.8159 = 0.1841.$

So there is roughly an 18% chance of getting 35 or more sixes — noticeably more than the "expected" 30, but not extraordinary, sitting at $0.9$ standard deviations above the mean. Two interpretive remarks elevate such an answer. First, the approximation is trustworthy here: $p = \tfrac16$ keeps the binomial only mildly skewed and the requested value lies near the centre, not deep in a tail. Second, had the question instead asked for "at least 60 sixes" — about $6$ standard deviations out — the normal approximation in the extreme tail would be far less reliable, and one would treat the tiny estimate with caution. Quoting the $Z$ -value and commenting on the quality of the fit turns a mechanical calculation into a top-band response.
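The quality of the fit in the die example can be checked by computing the exact binomial tail alongside the normal estimate. The following is a sketch using Python's standard library (`statistics.NormalDist` for the normal cumulative probability); the variable names are illustrative.

```python
from math import comb
from statistics import NormalDist

n, p = 180, 1 / 6
mean = n * p                 # np = 30
variance = n * p * (1 - p)   # np(1-p) = 25

# Normal approximation with continuity correction: P(X >= 35) ~ P(Y > 34.5)
approx = 1 - NormalDist(mu=mean, sigma=variance ** 0.5).cdf(34.5)

# Exact binomial tail: sum of P(X = r) for r = 35, ..., 180
exact = sum(comb(n, r) * p**r * (1 - p) ** (n - r) for r in range(35, n + 1))

print(f"normal approximation = {approx:.4f}")
print(f"exact binomial       = {exact:.4f}")
```

The normal figure should agree with the hand calculation above; the exact figure shows how much the mild skew of $B(180, \tfrac16)$ costs at this point in the distribution.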

Approximating a Poisson, and the two-step chain

The normal approximation to the Poisson is handled identically — same continuity correction, variance equal to $\lambda$ in the bracket — and a worked example cements the pattern. The number of particles detected by a counter in one minute follows $\text{Po}(36)$ ; estimate the probability of detecting between 30 and 40 particles inclusive. Since $\lambda = 36 > 15$ , use $Y \sim N(36, 36)$ with $\sigma = 6$ . The inclusive range $30 \leq X \leq 40$ becomes, after continuity correction, $P(29.5 < Y < 40.5)$ :

$Z_1 = \frac{29.5 - 36}{6} = -1.0833, \qquad Z_2 = \frac{40.5 - 36}{6} = 0.75,$

$P(30 \leq X \leq 40) \approx \Phi(0.75) - \Phi(-1.0833) = 0.7734 - (1 - 0.8607) = 0.7734 - 0.1393 = 0.6341.$

Two ideas are worth drawing out. First, the variance going into the normal is $\lambda = 36$, not $\sqrt{36}$; the standard deviation $\sigma = \sqrt{36} = 6$ only appears when standardising. Second, this approximation rests on the same limiting idea as the normal approximation to the binomial: a large-$\lambda$ Poisson is a sum of many independent small-$\lambda$ Poissons (by the additivity result), and sums of many independent contributions tend to normal by the Central Limit Theorem. In some problems both approximations chain together: a binomial $B(n, p)$ with very large $n$ and small $p$ can be taken to $\text{Po}(np)$, and if $np$ is itself large, on to $N(np, np)$. Recognising which single step (or chain) a given set of numbers calls for is the whole art of this lesson.
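The particle-counter calculation can be reproduced, and compared with the exact Poisson sum, in a few lines. This is a sketch using Python's standard library; the variable names are illustrative. Because the code does not round $Z$ to table precision, its normal estimate may differ from the table-based $0.6341$ in the fourth decimal place.

```python
from math import exp, factorial
from statistics import NormalDist

lam = 36
# Variance is lambda; NormalDist takes the standard deviation sqrt(lambda) = 6
y = NormalDist(mu=lam, sigma=lam ** 0.5)

# Continuity correction: P(30 <= X <= 40) ~ P(29.5 < Y < 40.5)
approx = y.cdf(40.5) - y.cdf(29.5)

# Exact Poisson probability: sum of P(X = k) for k = 30, ..., 40
exact = sum(exp(-lam) * lam**k / factorial(k) for k in range(30, 41))

print(f"normal approximation = {approx:.4f}")
print(f"exact Poisson        = {exact:.4f}")
```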

Seeing the error directly

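It is illuminating to put an exact value beside its approximation and read off the discrepancy. Take $X \sim B(20, 0.1)$, a case with $np = 2$ where Poisson is the natural candidate but $n = 20$ is only modestly large (below the $n > 50$ rule of thumb) and $p = 0.1$ is not especially small. The exact probability of no successes is $P(X = 0) = 0.9^{20} = 0.1216$, while $\text{Po}(2)$ gives $e^{-2} = 0.1353$: an absolute error of about $0.014$, or roughly 11%. For $P(X = 1)$: exact $20(0.1)(0.9^{19}) = 0.2702$ versus $2e^{-2} = 0.2707$, now within $0.0005$. That close agreement does not extend to the whole body, however: for $P(X = 2)$ the exact value $\binom{20}{2}(0.1)^2(0.9)^{18} = 0.2852$ sits about $0.015$ above the Poisson value $2e^{-2} = 0.2707$. The pattern follows from the variances. $\text{Po}(2)$ has variance $2$ against the binomial's $np(1-p) = 1.8$, so it is slightly too spread out: it overstates the probability at $X = 0$ and in the upper tail, understates it around the mode, and the two agree closely only near the values where they cross, such as $X = 1$.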

Compare this with the genuinely tiny-$p$ case $B(2000, 0.001)$, also with $np = 2$: here the exact and Poisson probabilities differ by less than $0.0002$ at every value, because $p = 0.001$ is so small that $1 - p \approx 1$ to high precision and the binomial variance $np(1-p) = 1.998$ is almost indistinguishable from the Poisson variance $2$. The moral, repeated because it is so often examined: the same $np$ can give an excellent or a merely adequate Poisson approximation depending on how small $p$ actually is. When a question supplies the exact figures, a one-line comparison ("the Poisson estimate $0.135$ is within about 11% of the exact $0.122$") demonstrates real understanding and is the kind of evaluative comment that secures AO3 marks.
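A comparison of this kind can be automated across a whole range of values rather than one at a time. The sketch below is plain Python; the function `largest_gap` and the cut-off `k_max = 10` are illustrative choices, not part of the lesson. It reports where the absolute gap between the exact binomial and the Poisson probability is largest for the two $np = 2$ cases.

```python
from math import comb, exp, factorial


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Exact P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Po(lam)."""
    return exp(-lam) * lam**k / factorial(k)


def largest_gap(n: int, p: float, k_max: int = 10) -> tuple[int, float]:
    """Return (k, gap): the k in 0..k_max where |binomial - Poisson| is largest."""
    lam = n * p
    gaps = {k: abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(k_max + 1)}
    worst = max(gaps, key=gaps.get)
    return worst, gaps[worst]


for n, p in [(20, 0.1), (2000, 0.001)]:
    k, gap = largest_gap(n, p)
    print(f"B({n}, {p}) vs Po({n * p:g}): largest gap {gap:.5f} at k = {k}")
```

Absolute and relative error can tell different stories (a small absolute gap in a far tail may still be a large percentage), so it is worth stating which one a comment refers to.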

3. Worked examples with M1/A1 mark schemes

Example 1 — Poisson approximation to a binomial

A component has a 2% defect rate. In a batch of 200, find $P(\text{exactly }3\text{ defective})$ .

$X \sim B(200, 0.02)$ . Since $n = 200$ is large and $p = 0.02$ is small, use $\text{Po}(\lambda)$ with $\lambda = np = 4$ . (B1 state and justify the approximation; M1 $\lambda = np = 4$ .)

$P(X = 3) \approx \frac{e^{-4}\times 4^3}{3!} = \frac{0.018316 \times 64}{6} = 0.1954.$ (M1 apply Poisson formula; A1 $0.1954$ .)
