Moment Generating Functions

The moment generating function (MGF) is the most elegant idea in the 7367/3S option. A single function $M_X(t)$ encodes every moment of a distribution at once: differentiate it and set $t = 0$ to read off $E(X)$, $E(X^2)$, $E(X^3)$, … in turn. Better still, two structural theorems, uniqueness (the MGF determines the distribution) and multiplicativity (the MGF of an independent sum is the product of MGFs), turn otherwise hard "prove the sum is also Poisson/normal" questions into a few clean lines. This lesson derives the standard MGFs from scratch, with full mark schemes, and shows all three exam uses: finding moments, proving a distributional result for a sum, and identifying a distribution from its MGF.

1. Where this sits in AQA 7367

This is Paper 3 Statistics option (7367/3S) content (per-paper weighting AO1 40% / AO2 25% / AO3 35%). Deriving an MGF and differentiating it for moments is AO1; the uniqueness and product theorems are reasoning/proof tools (AO2 — proving a sum is Poisson is an AO2 "show that"); applying MGFs to an unfamiliar combination of variables is AO3. The prerequisites are the discrete and continuous expectation definitions (Lessons 1, 5, 7), the Maclaurin series for $e^x$ (A-Level Maths / Further Pure), the product and chain rules, and the geometric/exponential series.

The moment generating function is the conceptual high point of the statistics option, and it is examined precisely because it ties so many earlier skills together: series expansions, differentiation, the named distributions and their parameters, and the algebra of independent sums. A student who is fluent with MGFs can find a mean and variance in two derivatives, prove a distributional result in three lines, and identify an unknown distribution by pattern-matching — feats that would otherwise demand laborious summation or integration. Because the technique is so powerful, examiners reward not just the mechanics but the understanding: knowing why $M_X(0) = 1$ , why independent sums multiply, and why the uniqueness theorem licenses an identification. This lesson builds all three from first principles.

2. Core theory

Definition

The moment generating function of a random variable $X$ is

$M_X(t) = E\big(e^{tX}\big),$

a function of the auxiliary real variable $t$. Concretely,

$\text{discrete: } M_X(t) = \sum_x e^{tx}\,P(X = x), \qquad \text{continuous: } M_X(t) = \int_{-\infty}^{\infty} e^{tx} f(x)\,dx.$

The MGF is said to exist when this expectation is finite on some open interval of $t$-values containing $0$; that holds for every distribution in this lesson, though not for every distribution. Note that $M_X(0) = E(e^0) = E(1) = 1$ always, which gives a useful instant check: if a derived MGF does not give 1 at $t = 0$, you have made an algebra error, and this single substitution catches many slips before they propagate into the moments. The auxiliary variable $t$ has no probabilistic meaning of its own; it is purely a book-keeping device whose powers tag the successive moments. Think of $M_X(t)$ as a "moment dispenser": feed in derivatives and the value $t = 0$, and out come $E(X)$, $E(X^2)$, $E(X^3)$, and so on in order.
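As a rough numerical illustration of the discrete definition, the sketch below evaluates $M_X(t)$ directly from a probability table and applies the $M_X(0) = 1$ check. The fair-die example and the function name `mgf_discrete` are illustrative assumptions, not part of the lesson.

```python
import math

def mgf_discrete(pmf, t):
    """M_X(t) = sum over x of e^{t x} P(X = x), for a finite pmf given as {x: probability}."""
    return sum(math.exp(t * x) * p for x, p in pmf.items())

# Hypothetical example: a fair six-sided die.
die = {x: 1 / 6 for x in range(1, 7)}

print(mgf_discrete(die, 0.0))   # sanity check: should be 1 up to rounding
print(mgf_discrete(die, 0.5))   # the MGF at some other value of t
```

Any probability table whose entries sum to 1 will pass the $t = 0$ check; a table that fails it has a mistake in the probabilities.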

Why "moment generating"? — the Maclaurin argument

Expand $e^{tX}$ as a power series in $t$ :

$e^{tX} = 1 + tX + \frac{(tX)^2}{2!} + \frac{(tX)^3}{3!} + \cdots = \sum_{n=0}^{\infty} \frac{t^n X^n}{n!}.$

Take expectations term by term:

$M_X(t) = 1 + tE(X) + \frac{t^2}{2!}E(X^2) + \frac{t^3}{3!}E(X^3) + \cdots = \sum_{n=0}^{\infty} \frac{t^n}{n!}E(X^n).$

The coefficient of $\tfrac{t^n}{n!}$ is the $n$-th moment $E(X^n)$. Equivalently, differentiating $n$ times and setting $t = 0$ peels off that coefficient:

$\boxed{\,E(X^n) = M_X^{(n)}(0)\,}$

Derivative at $t=0$	Gives	Meaning
$M_X'(0)$	$E(X)$	mean
$M_X''(0)$	$E(X^2)$	second moment
$M_X'''(0)$	$E(X^3)$	third moment

and hence

$\text{Var}(X) = M_X''(0) - \big(M_X'(0)\big)^2.$
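To see the derivative rule at work numerically, the sketch below approximates $M_X'(0)$ and $M_X''(0)$ by central differences and compares them with the mean and variance computed directly. The three-point distribution, the step size `h` and the function names are assumptions chosen for illustration; in an exam you would differentiate exactly.

```python
import math

def moments_from_mgf(mgf, h=1e-3):
    """Approximate E(X) = M'(0) and E(X^2) = M''(0) using central differences."""
    first = (mgf(h) - mgf(-h)) / (2 * h)
    second = (mgf(h) - 2 * mgf(0.0) + mgf(-h)) / (h * h)
    return first, second

# Hypothetical example: X takes values 0, 1, 2 with probabilities 0.2, 0.5, 0.3.
pmf = {0: 0.2, 1: 0.5, 2: 0.3}

def mgf(t):
    return sum(math.exp(t * x) * p for x, p in pmf.items())

mean, second_moment = moments_from_mgf(mgf)
variance = second_moment - mean ** 2      # Var(X) = M''(0) - (M'(0))^2

# Direct calculation from the probability table, for comparison.
direct_mean = sum(x * p for x, p in pmf.items())
direct_var = sum(x * x * p for x, p in pmf.items()) - direct_mean ** 2

print(mean, direct_mean)
print(variance, direct_var)
```

The two routes should agree to several decimal places; the small discrepancy comes from the finite-difference approximation, not from the theory.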

Two structural theorems

Linear transformation: if $Y = aX + b$ then $M_Y(t) = E\big(e^{t(aX+b)}\big) = e^{bt}E\big(e^{(at)X}\big) = e^{bt}M_X(at)$.
Independent sums (multiplicativity): if $X, Y$ are independent then $e^{tX}$ and $e^{tY}$ are independent, so $M_{X+Y}(t) = E\big(e^{t(X+Y)}\big) = E(e^{tX})E(e^{tY}) = M_X(t)\,M_Y(t)$.
Uniqueness: if two variables have the same MGF on an open interval around $0$, they have the same distribution. This is what lets you identify a distribution from its MGF.
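The linear-transformation and multiplicativity rules can be checked on small discrete examples. The sketch below builds the distribution of $X + Y$ and of $aY + b$ explicitly and compares both sides of each identity; the two probability tables, the value of `t` and the helper names are hypothetical choices.

```python
import math
from itertools import product

def mgf(pmf, t):
    """MGF of a finite discrete distribution given as {value: probability}."""
    return sum(math.exp(t * x) * p for x, p in pmf.items())

def pmf_of_sum(pmf_x, pmf_y):
    """pmf of X + Y for independent X, Y: joint probabilities multiply."""
    out = {}
    for (x, px), (y, py) in product(pmf_x.items(), pmf_y.items()):
        out[x + y] = out.get(x + y, 0.0) + px * py
    return out

def pmf_of_linear(pmf_x, a, b):
    """pmf of aX + b, assuming a != 0 so distinct values stay distinct."""
    return {a * x + b: p for x, p in pmf_x.items()}

X = {0: 0.5, 1: 0.5}
Y = {1: 0.2, 2: 0.3, 3: 0.5}
t = 0.4

# Multiplicativity: M_{X+Y}(t) = M_X(t) M_Y(t)
lhs_sum = mgf(pmf_of_sum(X, Y), t)
rhs_sum = mgf(X, t) * mgf(Y, t)

# Linear transformation: M_{aY+b}(t) = e^{bt} M_Y(at)
a, b = 3, -2
lhs_lin = mgf(pmf_of_linear(Y, a, b), t)
rhs_lin = math.exp(b * t) * mgf(Y, a * t)

print(lhs_sum, rhs_sum)
print(lhs_lin, rhs_lin)
```

Note that `pmf_of_sum` relies on independence: it multiplies marginal probabilities, which is exactly the step that fails for dependent variables.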

The uniqueness theorem deserves emphasis because it is what makes the MGF a genuine identifier rather than a mere moment-calculator. Two distributions can share a mean and variance yet be utterly different (a uniform and a normal can both have mean 0 and variance 1, for instance), so matching the first two moments proves nothing. But matching the entire MGF — equivalently, all the moments simultaneously — pins the distribution down completely. That is why an exam question can hand you an unfamiliar MGF such as $(1 - 2t)^{-3}$ and ask "what is the distribution?": you compare it against the known forms, and uniqueness guarantees that a match is conclusive. The product rule for independent sums and the uniqueness theorem work as a team: the product gives you the MGF of a sum, and uniqueness lets you name the resulting distribution. This pairing is the single most powerful technique in the whole option, turning otherwise formidable "prove the sum is also …" questions into a short algebraic recognition.

3. Worked examples with M1/A1 mark scheme

Example 1 — Poisson MGF and its moments

Let $X \sim \text{Po}(\lambda)$. Then

$M_X(t) = \sum_{r=0}^{\infty} e^{tr}\,\frac{e^{-\lambda}\lambda^r}{r!} = e^{-\lambda}\sum_{r=0}^{\infty}\frac{(\lambda e^t)^r}{r!} = e^{-\lambda}\,e^{\lambda e^t} = e^{\lambda(e^t - 1)}. \quad (\textbf{M1}\ \text{factor }e^{-\lambda};\ \textbf{M1}\ \text{recognise }e^x\text{ series};\ \textbf{A1})$

Mean (chain rule):

$M_X'(t) = \lambda e^t\,e^{\lambda(e^t - 1)} \;\Rightarrow\; M_X'(0) = \lambda\cdot 1\cdot e^{0} = \lambda = E(X). \quad (\textbf{M1 A1})$

Second moment (differentiate the product again):

$M_X''(t) = \big(\lambda e^t + \lambda^2 e^{2t}\big)e^{\lambda(e^t - 1)} \;\Rightarrow\; M_X''(0) = \lambda + \lambda^2 = E(X^2). \quad (\textbf{M1 A1})$

Variance: $\text{Var}(X) = (\lambda + \lambda^2) - \lambda^2 = \lambda$ (mean = variance, as it must be for a Poisson). $(\textbf{A1})$
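A quick numerical cross-check of the closed form: the sketch below sums the defining series term by term and compares it with $e^{\lambda(e^t - 1)}$. The values of `lam` and `t` and the number of terms are arbitrary illustrative choices.

```python
import math

def poisson_mgf_series(lam, t, terms=100):
    """Partial sum of sum_r e^{tr} e^{-lam} lam^r / r!, building each term from the previous one."""
    term = math.exp(-lam)               # r = 0 term
    total = term
    for r in range(1, terms):
        term *= lam * math.exp(t) / r   # ratio of consecutive terms is lam * e^t / r
        total += term
    return total

def poisson_mgf_closed(lam, t):
    """Closed form derived above: exp(lam * (e^t - 1))."""
    return math.exp(lam * (math.exp(t) - 1))

lam, t = 2.5, 0.3   # illustrative values
series_value = poisson_mgf_series(lam, t)
closed_value = poisson_mgf_closed(lam, t)
print(series_value, closed_value)
```

The ratio used in the loop is the same grouping $(\lambda e^t)^r / r!$ that earns the second method mark in the derivation.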

Example 2 — Exponential MGF and its moments

Let $X \sim \text{Exp}(\lambda)$, $f(x) = \lambda e^{-\lambda x}$ for $x \ge 0$. For $t < \lambda$,

$M_X(t) = \int_0^{\infty} e^{tx}\lambda e^{-\lambda x}\,dx = \lambda\int_0^{\infty} e^{-(\lambda - t)x}\,dx = \lambda\!\left[\frac{-e^{-(\lambda-t)x}}{\lambda - t}\right]_0^{\infty} = \frac{\lambda}{\lambda - t}. \quad (\textbf{M1}\ \text{combine exponents};\ \textbf{A1};\ \textbf{B1}\ \text{state }t<\lambda)$

The condition $t < \lambda$ is essential: only then does $e^{-(\lambda - t)x} \to 0$ at infinity. Differentiating,

$M_X'(t) = \frac{\lambda}{(\lambda - t)^2} \Rightarrow M_X'(0) = \frac{1}{\lambda}, \qquad M_X''(t) = \frac{2\lambda}{(\lambda - t)^3} \Rightarrow M_X''(0) = \frac{2}{\lambda^2},$

so $E(X) = \tfrac{1}{\lambda}$, $\text{Var}(X) = \tfrac{2}{\lambda^2} - \tfrac{1}{\lambda^2} = \tfrac{1}{\lambda^2}$. $(\textbf{M1 A1 A1})$ This matches Lesson 7 exactly, with no integration of $x^2 f$.
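The role of the condition $t < \lambda$ can be seen numerically. The sketch below approximates the defining integral with a simple midpoint rule on a truncated range: for $t < \lambda$ it settles at $\lambda/(\lambda - t)$, while for $t = \lambda$ the truncated value keeps growing with the upper limit. The truncation point, step count and parameter values are assumptions made for illustration.

```python
import math

def exp_mgf_numeric(lam, t, upper=40.0, steps=200_000):
    """Midpoint-rule approximation of the integral of e^{tx} * lam * e^{-lam x} over [0, upper]."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += lam * math.exp(-(lam - t) * x)   # combined exponent, as in the derivation
    return total * h

lam, t = 2.0, 0.5
numeric = exp_mgf_numeric(lam, t)
closed = lam / (lam - t)
print(numeric, closed)

# With t = lam the integrand is the constant lam, so the truncated integral
# grows in proportion to the upper limit instead of converging.
growing = [exp_mgf_numeric(lam, lam, upper=u, steps=20_000) for u in (10.0, 20.0, 40.0)]
print(growing)
```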

Example 3 — sum of independent Poissons (uniqueness in action)

Let $X \sim \text{Po}(\lambda_1)$, $Y \sim \text{Po}(\lambda_2)$ be independent. Then

$M_{X+Y}(t) = M_X(t)M_Y(t) = e^{\lambda_1(e^t-1)}\,e^{\lambda_2(e^t-1)} = e^{(\lambda_1+\lambda_2)(e^t-1)}. \quad (\textbf{M1}\ \text{product};\ \textbf{A1})$

This is exactly the MGF of $\text{Po}(\lambda_1 + \lambda_2)$; by uniqueness, $X + Y \sim \text{Po}(\lambda_1 + \lambda_2)$. $(\textbf{A1}\ \text{conclude with named theorem})$
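For comparison, here is the longer convolution route that the MGF argument avoids, written as a small check: it computes $P(X + Y = k)$ by summing over the ways of splitting $k$ and compares the result with the $\text{Po}(\lambda_1 + \lambda_2)$ probabilities. The parameter values and function names are illustrative assumptions.

```python
import math

def poisson_pmf(lam, r):
    """P(X = r) for X ~ Po(lam)."""
    return math.exp(-lam) * lam ** r / math.factorial(r)

def pmf_of_sum_at(k, lam1, lam2):
    """P(X + Y = k) for independent Poissons, summing over X = j, Y = k - j."""
    return sum(poisson_pmf(lam1, j) * poisson_pmf(lam2, k - j) for j in range(k + 1))

lam1, lam2 = 1.5, 2.0
max_gap = max(
    abs(pmf_of_sum_at(k, lam1, lam2) - poisson_pmf(lam1 + lam2, k))
    for k in range(15)
)
print(max_gap)   # should be at rounding-error level
```

A numerical match of this kind is reassurance, not proof; the proof is the product of MGFs followed by the uniqueness theorem.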

Example 4 — binomial MGF via the binomial theorem

Let $X \sim B(n, p)$, so $P(X = r) = \binom{n}{r}p^r(1-p)^{n-r}$. Then

$M_X(t) = \sum_{r=0}^{n} e^{tr}\binom{n}{r}p^r(1-p)^{n-r} = \sum_{r=0}^{n}\binom{n}{r}(pe^t)^r(1-p)^{n-r} = \big(pe^t + 1 - p\big)^n, \quad (\textbf{M1}\ \text{group as }(pe^t)^r;\ \textbf{M1}\ \text{binomial theorem};\ \textbf{A1})$

recognising the sum as the binomial expansion of $(pe^t + (1-p))^n$ . Differentiating once,

$M_X'(t) = n\big(1 - p + pe^t\big)^{n-1}\cdot pe^t \;\Rightarrow\; M_X'(0) = n\cdot 1^{\,n-1}\cdot p = np = E(X), \quad (\textbf{M1 A1})$

the familiar binomial mean — derived in two lines rather than by the $\sum rP(X=r)$ calculation. This example is a template for every "derive the MGF and hence the mean" question: identify the probability function, factor the $e^{tr}$ into the existing structure, recognise a standard series (here the binomial theorem; for the Poisson, the exponential series), then differentiate.
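The template can be sanity-checked in a few lines. The sketch below compares the defining sum with the closed form $(1 - p + pe^t)^n$ and then estimates $M_X'(0)$ by a central difference to confirm it is close to $np$. The values of `n`, `p`, `t` and the step `h` are assumptions chosen for illustration.

```python
import math

def binomial_mgf_sum(n, p, t):
    """Defining sum: sum over r of e^{tr} C(n, r) p^r (1 - p)^(n - r)."""
    return sum(
        math.exp(t * r) * math.comb(n, r) * p ** r * (1 - p) ** (n - r)
        for r in range(n + 1)
    )

def binomial_mgf_closed(n, p, t):
    """Closed form from the binomial theorem: (1 - p + p e^t)^n."""
    return (1 - p + p * math.exp(t)) ** n

n, p, t = 8, 0.3, 0.7
sum_value = binomial_mgf_sum(n, p, t)
closed_value = binomial_mgf_closed(n, p, t)

# Central-difference estimate of M'(0), which should be close to n * p.
h = 1e-5
approx_mean = (binomial_mgf_closed(n, p, h) - binomial_mgf_closed(n, p, -h)) / (2 * h)
print(sum_value, closed_value, approx_mean)
```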

4. Specimen-style exam question

(Specimen-style — not from any real paper.)

A discrete random variable $X$ has moment generating function $M_X(t) = \big(\tfrac14 + \tfrac34 e^t\big)^{2}$. (a) Find $M_X'(t)$ and hence $E(X)$. (b) Find $\text{Var}(X)$. (c) Identify the distribution of $X$, giving its parameters.

(a) It is cleanest to expand first: $M_X(t) = \big(\tfrac14 + \tfrac34 e^t\big)^2 = \tfrac{1}{16} + \tfrac{6}{16}e^t + \tfrac{9}{16}e^{2t} = \tfrac{1}{16} + \tfrac{3}{8}e^t + \tfrac{9}{16}e^{2t}$. Then

$M_X'(t) = \tfrac38 e^t + \tfrac98 e^{2t}, \qquad M_X'(0) = \tfrac38 + \tfrac98 = \tfrac{12}{8} = \tfrac32.$

Hence $E(X) = \tfrac32$.

(b) Differentiating again, $M_X''(t) = \tfrac38 e^t + \tfrac94 e^{2t}$, so $M_X''(0) = \tfrac38 + \tfrac94 = \tfrac{3 + 18}{8} = \tfrac{21}{8}$. Thus $E(X^2) = \tfrac{21}{8}$ and

$\text{Var}(X) = \tfrac{21}{8} - \big(\tfrac32\big)^2 = \tfrac{21}{8} - \tfrac{18}{8} = \tfrac{3}{8}.$

(c) The MGF has the binomial form $(1 - p + pe^t)^n$ with $n = 2$ and $p = \tfrac34$ (since $1 - p = \tfrac14$), so by the uniqueness theorem $X \sim B\big(2, \tfrac34\big)$. Check: $E(X) = np = 2\cdot\tfrac34 = \tfrac32$ ✓ and $\text{Var}(X) = np(1-p) = 2\cdot\tfrac34\cdot\tfrac14 = \tfrac{3}{8}$ ✓.
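The expanded MGF in part (a) also shows the probabilities directly: the coefficient of $e^{xt}$ is $P(X = x)$. The sketch below reads those coefficients off, recomputes the mean and variance with exact fractions, and compares the table with the $B\big(2, \tfrac34\big)$ probabilities. The variable names are illustrative.

```python
from fractions import Fraction
from math import comb

# Coefficients of e^{0t}, e^{1t}, e^{2t} in the expanded MGF are P(X=0), P(X=1), P(X=2).
pmf = {0: Fraction(1, 16), 1: Fraction(3, 8), 2: Fraction(9, 16)}

total = sum(pmf.values())
mean = sum(x * p for x, p in pmf.items())
second_moment = sum(x * x * p for x, p in pmf.items())
variance = second_moment - mean ** 2

# Probabilities of B(2, 3/4) for comparison.
n, p = 2, Fraction(3, 4)
binomial = {r: comb(n, r) * p ** r * (1 - p) ** (n - r) for r in range(n + 1)}

print(total, mean, variance)   # 1 3/2 3/8
print(pmf == binomial)         # True
```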

5. Synoptic links

Lessons 1–3 (Poisson/binomial): the MGFs $e^{\lambda(e^t-1)}$ and $(1-p+pe^t)^n$ reproduce the means and variances you already know, and the additivity of independent Poissons (Lesson 2) drops out of multiplicativity in one line — a result that previously required a more laborious convolution argument now follows almost immediately.
Lesson 7 (exponential): the MGF route to $E(X) = 1/\lambda$, $\text{Var}(X) = 1/\lambda^2$ avoids two integrations by parts.
A-Level Maths / Further Pure: the Maclaurin series for $e^x$, the product/chain rules, and the geometric series are the engine; the MGF is essentially a generating function, linking to the sequences-and-series strand. The same generating-function idea recurs across mathematics: ordinary and exponential generating functions encode integer sequences in combinatorics, and probability generating functions $G_X(z) = E(z^X)$ (closely related to the MGF via $z = e^t$) are the natural tool for non-negative integer-valued variables. Recognising the MGF as one member of this family deepens the connection to the wider Further Pure content.
Beyond the spec: the MGF is the bridge to the Central Limit Theorem (the standardised sum of $n$ independent, identically distributed variables has an MGF that converges to $e^{t^2/2}$, the standard-normal MGF, as $n \to \infty$, provided the MGF exists).
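The Central Limit Theorem link can be glimpsed numerically using only the two rules from this lesson. As an assumed example (not taken from the specification), let $S_n$ be a sum of $n$ independent $\text{Exp}(1)$ variables, so $M_{S_n}(t) = (1 - t)^{-n}$ by multiplicativity, and let $Z_n = (S_n - n)/\sqrt{n}$. The linear-transformation rule with $a = 1/\sqrt{n}$, $b = -\sqrt{n}$ gives $M_{Z_n}(t) = e^{-t\sqrt{n}}\big(1 - t/\sqrt{n}\big)^{-n}$ for $t < \sqrt{n}$, which the sketch compares with $e^{t^2/2}$.

```python
import math

def standardised_sum_mgf(n, t):
    """MGF of Z_n = (S_n - n) / sqrt(n), with S_n a sum of n independent Exp(1) variables.

    Combines M_S(t) = (1 - t)^(-n) with the rule M_{aX+b}(t) = e^{bt} M_X(at),
    taking a = 1/sqrt(n) and b = -sqrt(n). Valid only for t < sqrt(n).
    """
    root = math.sqrt(n)
    # Work with the logarithm to avoid overflow for large n.
    log_mgf = -t * root - n * math.log1p(-t / root)
    return math.exp(log_mgf)

t = 1.0
target = math.exp(t * t / 2)   # standard-normal MGF at t
gaps = [abs(standardised_sum_mgf(n, t) - target) for n in (10, 100, 10_000)]
print(gaps)   # the gaps shrink as n grows
```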

The deeper message is that the MGF is a unifying object. Each distribution you have met separately — Bernoulli, binomial, Poisson, exponential, normal — has a characteristic MGF, and the structural operations on random variables correspond to clean operations on those MGFs: adding independent variables multiplies their MGFs, scaling-and-shifting transforms $M_X(t)$ into $e^{bt}M_X(at)$ , and standardising a normal collapses its MGF to the universal $e^{t^2/2}$ . Seeing these correspondences turns a collection of distribution-specific facts into a single calculus of distributions, which is exactly the kind of synoptic mastery the Further Maths qualification is designed to develop and the option paper to test.
