In Further Statistics you must know when one distribution can stand in for another, how to set up the approximating distribution, and why the conditions matter. These approximations turn intractable binomial sums into one-line calculations. This lesson covers the three examined approximations — Poisson-to-binomial, normal-to-binomial and normal-to-Poisson — together with the all-important continuity correction.
This is Paper 3 Statistics option (7367/3S) content (AO weighting AO1 40% / AO2 25% / AO3 35%). Selecting the correct approximation and justifying it against the conditions is squarely AO2/AO3; the calculation itself is AO1. The prerequisites are the binomial distribution B(n,p) and the normal distribution N(μ,σ²) with standardisation Z=(X−μ)/σ, both from A-Level Mathematics, plus the Poisson distribution from the previous lesson.
If X∼B(n,p) with n large and p small (a common rule of thumb is n>50, np<5), then
X≈Po(λ),λ=np.
| | Binomial | Poisson approximation |
|---|---|---|
| Mean | np | λ=np |
| Variance | np(1−p) | λ=np |
The means match exactly. The variances match because, when p is small, 1−p≈1, so np(1−p)≈np. The approximation is exact in the limit n→∞, p→0 with np=λ fixed (proved in the Poisson lesson). No continuity correction is needed — both distributions are discrete on {0,1,2,…}.
If X∼B(n,p) with np>5 and n(1−p)>5 (so p is not extreme and n is large), then
X≈N(np, np(1−p)).
Here a continuity correction is essential because a discrete variable is being replaced by a continuous one.
If X∼Po(λ) with λ large (rule of thumb λ>15), then
X≈N(λ, λ),
again with a continuity correction. (This follows by chaining: a large-λ Poisson is a limit of binomials, which tend to normal.)
A discrete integer k "occupies" the interval (k−0.5, k+0.5) under the continuous approximation. Translate the inequality accordingly:
| Binomial / Poisson probability | Normal approximation |
|---|---|
| P(X=k) | P(k−0.5<Y<k+0.5) |
| P(X≤k) | P(Y<k+0.5) |
| P(X<k) | P(Y<k−0.5) |
| P(X≥k) | P(Y>k−0.5) |
| P(X>k) | P(Y>k+0.5) |
The reliable mental rule: expand the region to include the half-integers belonging to the integers you want. "≥25" keeps 25, so go down to 24.5; ">25" excludes 25, so go up to 25.5.
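The table can be written as a small helper. This is a minimal Python sketch rather than part of the course material: the function name `normal_approx` and its `relation` argument are hypothetical, Φ is taken from `statistics.NormalDist` in the standard library, and the values n=100, p=0.5, k=55 are chosen purely for illustration.

```python
from math import sqrt
from statistics import NormalDist

def normal_approx(relation, k, mean, variance):
    """Approximate P(X relation k) for an integer-valued X by Y ~ N(mean, variance),
    applying the continuity correction from the table above."""
    y = NormalDist(mean, sqrt(variance))
    if relation == "==":
        return y.cdf(k + 0.5) - y.cdf(k - 0.5)
    if relation == "<=":
        return y.cdf(k + 0.5)
    if relation == "<":
        return y.cdf(k - 0.5)
    if relation == ">=":
        return 1 - y.cdf(k - 0.5)
    if relation == ">":
        return 1 - y.cdf(k + 0.5)
    raise ValueError(f"unsupported relation: {relation!r}")

# Illustrative values (assumed): X ~ B(100, 0.5), so np = 50 and np(1-p) = 25.
n, p = 100, 0.5
print(normal_approx("<=", 55, n * p, n * p * (1 - p)))  # P(X <= 55) ~ P(Y < 55.5)
print(normal_approx(">", 55, n * p, n * p * (1 - p)))   # P(X > 55)  ~ P(Y > 55.5)
```

Because P(X≤55) and P(X>55) share the boundary 55.5, the two approximations sum to 1, which is a quick sanity check on any continuity correction.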
It is worth understanding why each rule of thumb exists, because the exam can ask you to justify a choice or to comment on reliability.
The Poisson approximation to the binomial needs p small for a structural reason: the Poisson has equal mean and variance, whereas the binomial has variance np(1−p). These agree only when 1−p≈1, i.e. p near 0. If p were, say, 0.4, the binomial variance np(0.6) would be far below its mean np, and forcing a Poisson (mean = variance) on it would badly misstate the spread. The large-n requirement ensures there are "many opportunities" for the rare event, so the discrete count has room to spread.
The normal approximation needs both np>5 and n(1−p)>5 so that the binomial is not too skewed. A binomial with small np piles probability against the lower boundary at 0 (a long right tail); a binomial with small n(1−p) piles against the upper boundary at n. The normal curve is symmetric, so it only fits well when the binomial is roughly symmetric — which happens when both tails have room, i.e. both np and n(1−p) are comfortably above a small threshold. The approximation is best near p=0.5 (perfect symmetry) and converges fastest there.
How accurate are these approximations? Consider X∼B(50,0.1), where np=5 sits on the Poisson boundary. The exact value P(X=5)=⁵⁰C₅(0.1)⁵(0.9)⁴⁵ is approximately 0.1849; the Poisson approximation Po(5) gives e⁻⁵5⁵/5!=0.1755. That is within about 5%: acceptable for many purposes but visibly imperfect because p=0.1 is not that small. By contrast X∼B(1000,0.003) (with np=3, p tiny) matches Po(3) to three decimal places at every value. The lesson is general: an approximation is most trustworthy near the centre of the distribution and when its conditions are comfortably (not marginally) satisfied, and least trustworthy in the far tails. If a question asks you to estimate a very small tail probability, flag that the approximation may be unreliable there; this is a perceptive AO2/AO3 remark.
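The B(50,0.1) figures quoted above can be checked numerically. The following is a minimal sketch using only Python's standard library; the helper names `binomial_pmf` and `poisson_pmf` are illustrative and not taken from the lesson.

```python
from math import comb, exp, factorial

def binomial_pmf(k, n, p):
    """Exact P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Po(lam)."""
    return exp(-lam) * lam**k / factorial(k)

exact = binomial_pmf(5, 50, 0.1)   # B(50, 0.1) at k = 5
approx = poisson_pmf(5, 5)         # Po(5) at k = 5, with lambda = np = 5
rel_error = abs(approx - exact) / exact

print(f"exact = {exact:.4f}, Poisson = {approx:.4f}, relative error = {rel_error:.1%}")
```

Changing the arguments to a case with much smaller p and the same np shows how quickly the relative error shrinks.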
To see how the choice of approximation depends entirely on the numbers, consider the same underlying situation examined at three different scales. A manufacturing line produces items with a defect probability p, and a sample of n is inspected; let X be the number of defectives.
The single most useful exam discipline is therefore to write down the numbers — np, n(1−p), the size of p — before committing to a method, and to be willing to conclude "no approximation is appropriate; use the exact distribution" when that is the truth. Examiners deliberately set borderline cases to test exactly this judgement.
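The "write down the numbers first" discipline can be sketched as a small decision helper. This is only an illustration under the rules of thumb stated earlier (n>50 with np<5 for Poisson; np>5 and n(1−p)>5 for normal). The function name `choose_approximation` and the three test inputs are hypothetical, and the thresholds are guidelines, so borderline cases still need judgement.

```python
def choose_approximation(n, p):
    """Suggest an approximation for X ~ B(n, p) using the lesson's rules of thumb.

    The thresholds are rules of thumb, not sharp boundaries.
    """
    np_, nq = n * p, n * (1 - p)   # write these down before choosing a method
    if n > 50 and np_ < 5:
        return f"Poisson: Po({np_:g})"
    if np_ > 5 and nq > 5:
        return f"Normal: N({np_:g}, {np_ * (1 - p):g}) with continuity correction"
    return "No approximation: use the exact binomial"

# Hypothetical inputs chosen for illustration (not taken from the lesson).
for n, p in [(200, 0.01), (200, 0.3), (10, 0.5)]:
    print(f"n={n}, p={p}: np={n * p:g}, n(1-p)={n * (1 - p):g} -> {choose_approximation(n, p)}")
```

The final branch matters as much as the first two: when neither set of conditions holds, the helper declines to approximate.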
Putting the method together on a full problem: a fair die is rolled 180 times; estimate the probability of obtaining a six on at least 35 occasions. The number of sixes is X∼B(180, 1/6), with np=30 and n(1−p)=150, both far above 5, so a normal approximation is well justified: Y∼N(30, 25), since np(1−p)=180⋅(1/6)⋅(5/6)=25 and hence σ=5. The event "at least 35" includes 35, so the continuity correction takes the boundary down to 34.5:
P(X≥35)≈P(Y>34.5)=P(Z>(34.5−30)/5)=P(Z>0.9)=1−Φ(0.9)=1−0.8159=0.1841.
So there is roughly an 18% chance of getting 35 or more sixes: noticeably more than the "expected" 30, but not extraordinary, sitting at 0.9 standard deviations above the mean. Two interpretive remarks elevate such an answer. First, the approximation is trustworthy here: p=1/6 keeps the binomial only mildly skewed and the requested value lies near the centre, not deep in a tail. Second, had the question instead asked for "at least 60 sixes" (about 6 standard deviations out), the normal approximation in the extreme tail would be far less reliable, and one would treat the tiny estimate with caution. Quoting the Z-value and commenting on the quality of the fit turns a mechanical calculation into a top-band response.
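To comment on the quality of the fit, it helps to see the exact tail beside the estimate. A minimal sketch, assuming Python's standard library (`math.comb` for the exact binomial sum and `statistics.NormalDist` for Φ); the variable names are illustrative only.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 180, 1 / 6
mean, var = n * p, n * p * (1 - p)   # np = 30 and np(1-p) = 25

# Normal approximation with continuity correction: P(X >= 35) ~ P(Y > 34.5)
z = (34.5 - mean) / sqrt(var)
approx = 1 - NormalDist().cdf(z)

# Exact binomial tail, summed term by term for comparison
exact = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(35, n + 1))

print(f"z = {z:.2f}, normal approx = {approx:.4f}, exact = {exact:.4f}")
```

Moving the lower limit of the exact sum out to 60, with the boundary at 59.5, gives a direct look at how the agreement behaves in the far tail.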
The normal approximation to the Poisson is handled identically — same continuity correction, variance equal to λ in the bracket — and a worked example cements the pattern. The number of particles detected by a counter in one minute follows Po(36); estimate the probability of detecting between 30 and 40 particles inclusive. Since λ=36>15, use Y∼N(36,36) with σ=6. The inclusive range 30≤X≤40 becomes, after continuity correction, P(29.5<Y<40.5):
Z₁=(29.5−36)/6=−1.0833, Z₂=(40.5−36)/6=0.75,
P(30≤X≤40)≈Φ(0.75)−Φ(−1.0833)=0.7734−(1−0.8607)=0.7734−0.1393=0.6341.
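The same calculation can be checked against the exact Poisson sum. This is a minimal sketch using only the standard library, not a prescribed method. Φ is evaluated at full precision here, so the estimate may differ in the last digit from a hand calculation that uses four-figure tables.

```python
from math import exp, factorial, sqrt
from statistics import NormalDist

lam = 36
y = NormalDist(mu=lam, sigma=sqrt(lam))   # N(36, 36): variance 36, sd 6

# Continuity-corrected normal estimate of P(30 <= X <= 40)
approx = y.cdf(40.5) - y.cdf(29.5)

# Exact Poisson probability, summed term by term over k = 30..40
exact = sum(exp(-lam) * lam**k / factorial(k) for k in range(30, 41))

print(f"normal approx = {approx:.4f}, exact = {exact:.4f}")
```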
Two ideas are worth drawing out. First, the variance going into the normal is λ=36, not √36; the standard deviation σ=√36=6 only appears when standardising. Second, this approximation is itself a consequence of the binomial story: a large-λ Poisson is a sum of many independent small-λ Poissons (by the additivity result), and sums of many independent contributions tend to normal by the Central Limit Theorem. In some problems both approximations chain together: a binomial B(n,p) with very large n and small p can be taken to Po(np), and if np is itself large, on to N(np,np). Recognising which single step (or chain) a given set of numbers calls for is the whole art of this lesson.
It is illuminating to put an exact value beside its approximation and read off the discrepancy. Take X∼B(20,0.1), a case with np=2 where Poisson is the natural candidate but n=20 is only modestly large (below the n>50 rule of thumb). The exact probability of no successes is P(X=0)=0.9²⁰=0.1216, while Po(2) gives e⁻²=0.1353, an absolute error of about 0.014, or roughly 11%. For P(X=1): exact 20(0.1)(0.9¹⁹)=0.2702 versus 2e⁻²=0.2707, now within 0.0005. The approximation is very close at X=1 and clearly weaker at X=0. The reason is the variance mismatch: the binomial variance np(1−p)=1.8 is smaller than the Poisson variance 2, so the Poisson places slightly too much probability in the tails, including at X=0.
Compare this with the genuinely tiny-p case B(2000,0.001), also with np=2: here the exact and Poisson values differ by less than 0.0002 at every value, because p=0.001 is so small that 1−p≈1 to high precision and the binomial variance np(1−p)=1.998 is almost indistinguishable from the Poisson variance 2. The moral, repeated because it is so often examined: the same np can give an excellent or a merely adequate Poisson approximation depending on how small p actually is. When a question supplies the exact figures, a one-line comparison ("the Poisson estimate 0.135 is within about 11% of the exact 0.122") demonstrates real understanding and is the kind of evaluative comment that secures AO3 marks.
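The contrast between the two np=2 cases can be summarised by the largest gap between the exact and Poisson probabilities. A minimal sketch, assuming only the standard library; the helper `max_abs_error` and its cutoff `k_max=15` are hypothetical choices made for illustration (probabilities beyond k=15 are negligible when the mean is 2).

```python
from math import comb, exp, factorial

def binomial_pmf(k, n, p):
    """Exact P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Po(lam)."""
    return exp(-lam) * lam**k / factorial(k)

def max_abs_error(n, p, k_max=15):
    """Largest |binomial - Poisson| over k = 0..k_max (k_max is an arbitrary cutoff)."""
    lam = n * p
    return max(abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(k_max + 1))

for n, p in [(20, 0.1), (2000, 0.001)]:
    print(f"B({n}, {p}) vs Po(2): max abs error = {max_abs_error(n, p):.5f}")
```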
A component has a 2% defect rate. In a batch of 200, find P(exactly 3 defective).
X∼B(200,0.02). Since n=200 is large and p=0.02 is small, use Po(λ) with λ=np=4. (B1 state and justify the approximation; M1 λ=np=4.)
P(X=3)≈e⁻⁴×4³/3!=(0.018316×64)/6=0.1954. (M1 apply Poisson formula; A1 0.1954.)