The cumulative distribution function (CDF), written F, is the single most versatile object attached to a random variable. Where the density f describes how probability is spread, the CDF describes how it accumulates: F(x) is the total probability sitting at or below x. From it you read off any probability without further integration, locate the median and quartiles by solving a single equation, recover the density by differentiation, and — crucially for 7367/3S — generate the distribution of a transformed variable Y=g(X). This lesson builds the CDF from the ground up and turns it into a complete exam toolkit.
This is Paper 3 Statistics option (7367/3S) content, sitting directly on the continuous-distribution thread (Lessons 4–5). The per-paper AO weighting for the option papers is AO1 40% / AO2 25% / AO3 35%, so the problem-solving (AO3) demand is higher here than in the compulsory pure papers. Routine work (integrating a density to a CDF, or differentiating a CDF back to a density) is AO1. Setting up and solving F(m)=0.5 for a median, or building the CDF of $Y=X^2$ from first principles, is AO3. Justifying why F must be non-decreasing and continuous, or interpreting a percentile in context, is AO2. The prerequisites are A-Level Mathematics integration/differentiation and the PDF definition and validity condition from Lesson 4.
The CDF sits at the hub of the continuous-distribution lessons: Lesson 5 found summary measures by integrating the density, but many of those same quantities — medians, quartiles, percentiles, "more than"/"less than" probabilities — are found more directly through the CDF. For that reason this lesson is among the most heavily examined in 3S: a single question can ask you to derive F, read several probabilities from it, locate the median and quartiles, and then use the CDF to build the distribution of a transformed variable. Mastering the CDF therefore pays back across the whole option, and the methods here recur in Lesson 7 (the exponential CDF) and underpin the simulation idea touched on in the stretch section.
The cumulative distribution function of a random variable X is
$$F(x)=P(X\le x)=\int_{-\infty}^{x} f(t)\,dt.$$
The dummy variable t inside the integral is deliberate: x is the upper limit, so it must not also be the variable of integration. Geometrically, F(x) is the area under the density to the left of x.
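The "area to the left of x" picture can be checked numerically. The sketch below is only an illustration: the helper `cdf_from_density` and the density f(t)=t/2 on [0,2] are assumptions chosen for the demonstration, not part of the lesson's examples, and the midpoint rule is just one convenient way to approximate the integral.

```python
def cdf_from_density(f, x, lower, steps=10_000):
    """Approximate F(x), the integral of f from `lower` to x, by the midpoint rule."""
    if x <= lower:
        return 0.0  # nothing has accumulated below the support
    h = (x - lower) / steps
    return sum(f(lower + (i + 0.5) * h) for i in range(steps)) * h


def f(t):
    # Illustrative density (an assumption for this sketch): f(t) = t/2 on [0, 2].
    return t / 2 if 0 <= t <= 2 else 0.0


area_to_1 = cdf_from_density(f, 1.0, lower=0.0)  # exact value is 1^2/4 = 0.25
area_to_2 = cdf_from_density(f, 2.0, lower=0.0)  # all the probability, so 1
print(area_to_1, area_to_2)
```

Note that the loop variable plays the role of the dummy variable t, while x appears only as the upper limit.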
Any function that is to serve as a CDF of a continuous variable must satisfy:
| Property | Statement | Why |
|---|---|---|
| Limits | F(−∞)=0, F(+∞)=1 | no probability below the support; all probability accounted for |
| Monotonic | a<b⟹F(a)≤F(b) | F′(x)=f(x)≥0, so F never decreases |
| Continuity | F is continuous everywhere | a continuous variable has no point masses, so no jumps |
| Range | 0≤F(x)≤1 | it is a probability |
The monotonicity follows directly from the Fundamental Theorem of Calculus: since F′(x)=f(x) and a valid density is non-negative, F can only stay level or rise. A CDF that decreased anywhere would correspond to a negative density — impossible. These four properties are not arbitrary book-keeping; together they characterise exactly the functions that can serve as a CDF. If you are handed a candidate function and asked "could this be a cumulative distribution function?", you check the four: does it start at 0, end at 1, never decrease, and stay continuous? A function failing any one of them cannot be a CDF, and spotting the failure is a genuine exam skill — for instance, a function that reached 1.2 somewhere, or that dipped then rose, is immediately disqualified.
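As a rough screen for the four properties, a candidate function can be sampled on a grid. This is a sketch under stated assumptions, not a proof: the names `looks_like_cdf`, `good` and `bad` are hypothetical, the support is assumed to be a finite interval [lo, hi], and the continuity test is only a heuristic that flags large jumps between neighbouring grid points. In an exam the same checks are done by hand.

```python
import math


def looks_like_cdf(F, lo, hi, steps=1000, tol=1e-9, max_jump=0.05):
    """Grid check of a candidate CDF on [lo, hi]; a screen, not a proof."""
    xs = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    ys = [F(x) for x in xs]
    starts_at_0 = abs(ys[0]) < tol
    ends_at_1 = abs(ys[-1] - 1) < tol
    in_range = all(-tol <= y <= 1 + tol for y in ys)
    non_decreasing = all(b >= a - tol for a, b in zip(ys, ys[1:]))
    # Heuristic only: a genuine jump would show up as a large step.
    no_big_jumps = all(abs(b - a) < max_jump for a, b in zip(ys, ys[1:]))
    return starts_at_0 and ends_at_1 and in_range and non_decreasing and no_big_jumps


def good(x):
    # Rises steadily from 0 at x = 0 to 1 at x = 2.
    return x**3 / 8


def bad(x):
    # Starts at 0 and ends at 1, but dips between x = 0.5 and x = 1.5.
    return x / 2 + 0.3 * math.sin(math.pi * x)


print(looks_like_cdf(good, 0, 2), looks_like_cdf(bad, 0, 2))
```

The second candidate has the right end values and stays inside [0, 1], so it is disqualified by monotonicity alone.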
It is also worth contrasting the shape of the CDF with that of the density. The density f can do almost anything non-negative: rise, fall, have several humps, even exceed 1 (it is a density, not a probability). The CDF F, by contrast, is always a continuous, weakly increasing S-shaped (or ramp-shaped) curve climbing from the floor 0 to the ceiling 1; it may have a corner where the density jumps, but it never breaks. Where the density is large, the CDF is steep; where the density is zero, the CDF is flat. This visual dictionary, "tall density ⇒ steep CDF", is a fast sanity check on any answer.
The Fundamental Theorem of Calculus ties the two functions together:
$$F(x)=\int_{-\infty}^{x} f(t)\,dt \iff f(x)=\frac{dF}{dx}=F'(x).$$
In words: integrate the density to get the CDF; differentiate the CDF to get the density. This two-way street is the engine of almost every CDF question.
$$P(X\le a)=F(a),\qquad P(X>a)=1-F(a),\qquad P(a\le X\le b)=F(b)-F(a).$$
Because X is continuous, P(X=a)=0, so the inequalities may be strict or weak interchangeably: P(X<a)=P(X≤a). This is the single biggest convenience of the continuous CDF over the discrete case.
Exam Tip: If a question gives you the CDF and asks for a probability, do not re-integrate the density — just subtract two values of F. Re-integrating wastes time and invites slips.
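The three probability rules amount to evaluating F and subtracting. A minimal sketch, assuming an illustrative CDF F(x)=x²/4 on [0,2] that is not one of the lesson's examples:

```python
def F(x):
    # Illustrative CDF (an assumption for this sketch): F(x) = x^2/4 on [0, 2].
    if x < 0:
        return 0.0
    if x > 2:
        return 1.0
    return x * x / 4


p_at_most_1 = F(1.0)             # P(X <= 1)
p_more_than_1 = 1 - F(1.0)       # P(X > 1)
p_between = F(1.5) - F(0.5)      # P(0.5 <= X <= 1.5)
print(p_at_most_1, p_more_than_1, p_between)  # 0.25 0.75 0.5
```

No integration appears anywhere: once F is known, every interval probability is a difference of two values.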
Let $f(x)=\tfrac{3}{8}x^2$ for 0≤x≤2, and zero otherwise. Find F(x) fully, then recover f by differentiating.
For x<0, nothing has accumulated, so F(x)=0. For 0≤x≤2,
$$F(x)=\int_0^x \tfrac{3}{8}t^2\,dt=\tfrac{3}{8}\left[\frac{t^3}{3}\right]_0^x=\frac{x^3}{8}.\qquad\text{(M1 integrate; A1 } x^3/8\text{)}$$
For x>2 all the probability lies below, so F(x)=1. Hence
$$F(x)=\begin{cases}0 & x<0\\[2pt] \dfrac{x^3}{8} & 0\le x\le 2\\[2pt] 1 & x>2\end{cases}\qquad\text{(A1 fully-defined piecewise CDF, all three branches)}$$
Verification (always do this): $F(0)=0$ and $F(2)=\tfrac{8}{8}=1$; the middle branch agrees with the outer branches at both joins, so F is continuous and rises from 0 to 1. Differentiating the middle branch recovers $F'(x)=\tfrac{3}{8}x^2=f(x)$ (B1), confirming consistency.
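The same verification can be sketched numerically. The helper `derivative` is a hypothetical name and uses a central difference as a stand-in for exact differentiation; it is a sanity check, not a replacement for the working above.

```python
def F(x):
    """CDF from the example: 0 below 0, x^3/8 on [0, 2], 1 above 2."""
    if x < 0:
        return 0.0
    if x > 2:
        return 1.0
    return x**3 / 8


def f(x):
    """Density from the example: (3/8) x^2 on [0, 2], zero elsewhere."""
    return 3 * x**2 / 8 if 0 <= x <= 2 else 0.0


def derivative(g, x, h=1e-5):
    # Central-difference approximation to g'(x).
    return (g(x + h) - g(x - h)) / (2 * h)


# Compare F'(x) with f(x) at a few interior points.
for x in (0.5, 1.0, 1.5):
    print(x, derivative(F, x), f(x))

# Join values: F should be 0 at the left end and 1 at the right end.
print(F(0), F(2))
```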
Given
$$F(x)=\begin{cases}0 & x<1\\[2pt] \dfrac{(x-1)^2}{4} & 1\le x\le 3\\[2pt] 1 & x>3,\end{cases}$$
find the density.
$$f(x)=F'(x)=\frac{2(x-1)}{4}=\frac{x-1}{2},\quad 1\le x\le 3,\qquad\text{(M1 differentiate; A1)}$$
and f(x)=0 elsewhere. Check it is a valid density:
$$\int_1^3 \frac{x-1}{2}\,dx=\frac{1}{2}\left[\frac{(x-1)^2}{2}\right]_1^3=\frac{1}{2}\cdot\frac{4}{2}=1.\qquad\text{(B1)}$$
Also f≥0 on [1,3], so this is a legitimate PDF.
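Both validity conditions can also be checked numerically. This is a sketch only: `midpoint_integral` is a hypothetical helper, and sampling the density on a grid illustrates non-negativity rather than proving it.

```python
def f(x):
    """Density from the example: (x - 1)/2 on [1, 3], zero elsewhere."""
    return (x - 1) / 2 if 1 <= x <= 3 else 0.0


def midpoint_integral(g, a, b, steps=10_000):
    # Midpoint-rule approximation to the integral of g over [a, b].
    h = (b - a) / steps
    return sum(g(a + (i + 0.5) * h) for i in range(steps)) * h


total_area = midpoint_integral(f, 1, 3)                           # should be 1
non_negative = all(f(1 + 2 * i / 1000) >= 0 for i in range(1001))  # grid check on [1, 3]
print(total_area, non_negative)
```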
With $F(x)=\tfrac{x^3}{8}$ on [0,2] from Example 1:
$$P(1\le X\le 1.5)=F(1.5)-F(1)=\frac{1.5^3}{8}-\frac{1^3}{8}=\frac{3.375-1}{8}=\frac{2.375}{8}=0.296875.\qquad\text{(M1 A1)}$$
Median m: solve F(m)=0.5:
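$$\frac{m^3}{8}=0.5\implies m^3=4\implies m=\sqrt[3]{4}\approx 1.587.\qquad\text{(M1 set } F=0.5\text{; A1)}$$
Quartiles: $\tfrac{Q_1^3}{8}=0.25\Rightarrow Q_1=\sqrt[3]{2}\approx 1.260$; $\tfrac{Q_3^3}{8}=0.75\Rightarrow Q_3=\sqrt[3]{6}\approx 1.817$, so
$$\text{IQR}=Q_3-Q_1=1.817-1.260=0.557.\qquad\text{(A1)}$$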
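When F(q)=p cannot be solved by hand, the same equation can be solved numerically. The sketch below uses bisection, which works because F is non-decreasing; the function name `quantile` is hypothetical, and the CDF is the x³/8 example so the output can be compared with the exact cube roots above.

```python
def F(x):
    """CDF from Example 1: x^3/8 on [0, 2]."""
    if x < 0:
        return 0.0
    if x > 2:
        return 1.0
    return x**3 / 8


def quantile(F, p, lo, hi, tol=1e-10):
    """Solve F(q) = p on [lo, hi] by bisection (relies on F being non-decreasing)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if F(mid) < p:
            lo = mid   # too little probability accumulated: move right
        else:
            hi = mid   # enough accumulated: move left
    return (lo + hi) / 2


median = quantile(F, 0.50, 0, 2)
q1 = quantile(F, 0.25, 0, 2)
q3 = quantile(F, 0.75, 0, 2)
print(round(median, 3), round(q1, 3), round(q3, 3), round(q3 - q1, 3))
```

Any percentile follows the same pattern: change p and solve again.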
Densities supported on [0,∞) integrate to a CDF that approaches, but never quite reaches, 1. Let $f(x)=3e^{-3x}$ for x≥0. Then for x≥0,
$$F(x)=\int_0^x 3e^{-3t}\,dt=\left[-e^{-3t}\right]_0^x=1-e^{-3x},\qquad\text{(M1 integrate; A1)}$$
and F(x)=0 for x<0. As x→∞, $e^{-3x}\to 0$ so F(x)→1 (the ceiling is a limit, reached only at infinity). The median solves $1-e^{-3m}=0.5$, i.e. $e^{-3m}=0.5$, giving $m=\tfrac{\ln 2}{3}\approx 0.231$ (M1 A1). Note the median is less than the mean $E(X)=\tfrac{1}{3}\approx 0.333$, the signature of the right-skewed exponential, a point we return to in Lesson 7.
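Because this CDF can be inverted in closed form, any quantile follows from solving $1-e^{-3x}=p$ for x. A minimal sketch, with the constant `RATE` and the function name `quantile` as hypothetical labels for this illustration:

```python
import math

RATE = 3.0  # the parameter in f(x) = 3 e^{-3x}


def F(x):
    """Exponential CDF from the example: 1 - e^{-3x} for x >= 0, else 0."""
    return 1 - math.exp(-RATE * x) if x >= 0 else 0.0


def quantile(p):
    """Invert F(x) = p for 0 <= p < 1, giving x = -ln(1 - p) / RATE."""
    return -math.log(1 - p) / RATE


median = quantile(0.5)   # equals ln(2)/3
mean = 1 / RATE          # E(X) = 1/3, as quoted in the text
print(round(median, 3), round(mean, 3))  # 0.231 0.333
```

The input p=1 is deliberately excluded: F only approaches 1, so no finite x satisfies F(x)=1.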
(Specimen-style — not from any real paper.)
A continuous random variable X has probability density function $f(x)=\tfrac{3}{4}x(2-x)$ for 0≤x≤2, and f(x)=0 otherwise. (a) Find the cumulative distribution function F(x). (b) Hence find P(0.5≤X≤1.5). (c) Find the median of X.
(a) For 0≤x≤2,
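$$F(x)=\int_0^x \tfrac{3}{4}t(2-t)\,dt=\tfrac{3}{4}\left[t^2-\frac{t^3}{3}\right]_0^x=\tfrac{3}{4}\left(x^2-\frac{x^3}{3}\right)=\frac{3x^2}{4}-\frac{x^3}{4}.$$
Check $F(2)=\tfrac{3}{4}\cdot 4-\tfrac{8}{4}=3-2=1$ ✓. So
$$F(x)=\begin{cases}0 & x<0\\[2pt] \dfrac{3x^2-x^3}{4} & 0\le x\le 2\\[2pt] 1 & x>2.\end{cases}$$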
(b) $F(1.5)=\tfrac{3(2.25)-3.375}{4}=\tfrac{6.75-3.375}{4}=\tfrac{3.375}{4}=0.84375$; $F(0.5)=\tfrac{3(0.25)-0.125}{4}=\tfrac{0.75-0.125}{4}=\tfrac{0.625}{4}=0.15625$. Hence
$$P(0.5\le X\le 1.5)=0.84375-0.15625=0.6875.$$
(c) By the symmetry of f about x=1 the median is m=1; confirm $F(1)=\tfrac{3-1}{4}=\tfrac{2}{4}=0.5$ ✓.
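The three answers can be cross-checked with a short script. This is a sketch for self-marking only; the variable names are illustrative, and the symmetry test samples a few offsets from x=1 rather than proving symmetry.

```python
def f(x):
    """Density from the question: (3/4) x (2 - x) on [0, 2], zero elsewhere."""
    return 0.75 * x * (2 - x) if 0 <= x <= 2 else 0.0


def F(x):
    """CDF found in part (a): (3x^2 - x^3)/4 on [0, 2]."""
    if x < 0:
        return 0.0
    if x > 2:
        return 1.0
    return (3 * x**2 - x**3) / 4


prob = F(1.5) - F(0.5)    # part (b)
median_value = F(1.0)     # part (c): should equal 0.5
symmetric = all(abs(f(1 - d) - f(1 + d)) < 1e-12 for d in (0.1, 0.3, 0.7, 1.0))
print(prob, median_value, symmetric)
```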
The recurring theme across these links is the duality of f and F: differentiation and integration carry you back and forth, and almost every continuous-distribution result can be phrased in terms of either. When a question feels hard via the density, try the CDF, and vice versa — the two viewpoints are equivalent but one is usually much cleaner for a given task. This flexibility is itself a synoptic skill: the ablest candidates choose the representation that minimises the work, rather than mechanically integrating from scratch every time.