Discrete Random Variables (Further)

This lesson extends your understanding of discrete random variables beyond what is covered in A-Level Mathematics. You will learn to compute expectations of functions of random variables, derive variance using the $E(X^2)$ method, and work with the algebra of expectations. These skills are essential for the rest of the Further Statistics module and underpin many of the distributions you will encounter.

Recap: Probability Distributions

A discrete random variable $X$ takes a countable number of values $x_1, x_2, \ldots$, each with an associated probability $P(X = x_i) = p_i$. The probability distribution must satisfy:

Property	Requirement
Non-negativity	$p_i \geq 0$ for all $i$
Normalisation	$\sum_i p_i = 1$

Example: A random variable $X$ has the following distribution:

$x$	1	2	3	4
$P(X = x)$	0.1	0.3	0.4	0.2

Check: every probability is non-negative and $0.1 + 0.3 + 0.4 + 0.2 = 1$, so this is a valid distribution.
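The two requirements can also be checked mechanically. The sketch below is illustrative only: the function name `is_valid_pmf` and the tolerance `tol` are assumptions made for this example, and the distribution is stored as a dictionary mapping each value to its probability.

```python
import math

def is_valid_pmf(pmf, tol=1e-9):
    """Check non-negativity and normalisation for a dict {value: probability}."""
    non_negative = all(p >= 0 for p in pmf.values())
    # Compare the total with 1 using a tolerance, since floats are inexact.
    normalised = math.isclose(sum(pmf.values()), 1.0, abs_tol=tol)
    return non_negative and normalised

# Distribution from the example above.
pmf = {1: 0.1, 2: 0.3, 3: 0.4, 4: 0.2}
print(is_valid_pmf(pmf))               # True
print(is_valid_pmf({1: 0.5, 2: 0.6}))  # False: probabilities sum to 1.1
```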

Expectation $E(X)$

The expected value (or mean) of a discrete random variable is:

$E(X) = \mu = \sum_x x \cdot P(X = x)$

This gives the long-run average value of $X$ if the experiment were repeated many times.

Example: Using the distribution above:

$E(X) = 1(0.1) + 2(0.3) + 3(0.4) + 4(0.2) = 0.1 + 0.6 + 1.2 + 0.8 = 2.7$
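The same weighted sum can be written as a short function. This is a minimal sketch; the name `expectation` and the dictionary representation of the distribution are choices made for illustration.

```python
def expectation(pmf):
    """Weighted average of the values: the sum of x * P(X = x)."""
    return sum(x * p for x, p in pmf.items())

# Distribution from the example above: value -> probability.
pmf = {1: 0.1, 2: 0.3, 3: 0.4, 4: 0.2}
mean = expectation(pmf)
print(round(mean, 10))  # 2.7 (rounded to hide floating-point noise)
```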

Exam Tip: The expected value does not have to be a value that $X$ can actually take. It is a weighted average, not a mode or a median.

Expectation of a Function of $X$: $E(g(X))$

If $g(X)$ is any function of the random variable $X$, then:

$E(g(X)) = \sum_x g(x) \cdot P(X = x)$

This is one of the most important results in Further Statistics. You do not need to find the distribution of $g(X)$ separately — you apply $g$ to each value and weight by the corresponding probability.

Example: Find $E(X^2)$ for the distribution above.

$E(X^2) = 1^2(0.1) + 2^2(0.3) + 3^2(0.4) + 4^2(0.2) = 0.1 + 1.2 + 3.6 + 3.2 = 8.1$
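The rule "apply $g$ to each value, then weight by the probability" translates directly into code. In this sketch the helper name `expectation_of` is hypothetical, and `g` can be any Python function of one variable.

```python
def expectation_of(g, pmf):
    """E(g(X)): apply g to each value, then weight by P(X = x)."""
    return sum(g(x) * p for x, p in pmf.items())

# Distribution from the example above: value -> probability.
pmf = {1: 0.1, 2: 0.3, 3: 0.4, 4: 0.2}
e_x2 = expectation_of(lambda x: x ** 2, pmf)
print(round(e_x2, 10))  # 8.1
```

Note that the distribution of $X^2$ is never constructed; only the original probabilities are used.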

Example: Find $E(3X + 2)$.

$E(3X + 2) = \sum_x (3x + 2) \cdot P(X = x) = (3 \cdot 1 + 2)(0.1) + (3 \cdot 2 + 2)(0.3) + (3 \cdot 3 + 2)(0.4) + (3 \cdot 4 + 2)(0.2) = 5(0.1) + 8(0.3) + 11(0.4) + 14(0.2) = 0.5 + 2.4 + 4.4 + 2.8 = 10.1$

Note that $3E(X) + 2 = 3(2.7) + 2 = 10.1$, which agrees with the direct calculation, as the linearity of expectation predicts.
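The agreement between the direct calculation and $aE(X) + b$ can be checked numerically for several choices of $a$ and $b$. This is a sketch only: the helper `expectation_of` is a hypothetical name, and the extra $(a, b)$ pairs are arbitrary values chosen for illustration.

```python
import math

def expectation_of(g, pmf):
    """E(g(X)) for a distribution stored as {value: probability}."""
    return sum(g(x) * p for x, p in pmf.items())

pmf = {1: 0.1, 2: 0.3, 3: 0.4, 4: 0.2}
mean = expectation_of(lambda x: x, pmf)

# (3, 2) is the example above; the other pairs are arbitrary.
for a, b in [(3, 2), (-1, 5), (0.5, 0)]:
    direct = expectation_of(lambda x: a * x + b, pmf)  # sum of (ax + b) P(X = x)
    via_linearity = a * mean + b                       # a E(X) + b
    assert math.isclose(direct, via_linearity)
    print(a, b, round(direct, 10))
```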

Properties of Expectation

The expectation operator is linear:

Rule	Formula
Constant multiple	$E(aX) = aE(X)$
Addition of a constant	$E(X + b) = E(X) + b$
Linear combination	$E(aX + b) = aE(X) + b$
Sum of functions	$E(g(X) + h(X)) = E(g(X)) + E(h(X))$

However, in general:

$E(g(X)) \neq g(E(X))$

For example, $E(X^2) \neq (E(X))^2$ in general. The difference between these two quantities is fundamental to the definition of variance.

Exam Tip: A very common error is to assume $E(X^2) = (E(X))^2$. The correct relationship is $\text{Var}(X) = E(X^2) - (E(X))^2$. Because a variance is never negative, this shows that $E(X^2) \geq (E(X))^2$, with equality only when $X$ is constant (it takes a single value with probability 1).

Variance $\text{Var}(X)$

The variance measures the spread of a distribution:

$\text{Var}(X) = E((X - \mu)^2) = \sum_x (x - \mu)^2 P(X = x)$

The computationally easier formula is:

$\text{Var}(X) = E(X^2) - (E(X))^2$
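
The second formula follows from the first by expanding the square and using the linearity of expectation, treating $\mu = E(X)$ as a constant:

$\text{Var}(X) = E((X - \mu)^2) = E(X^2 - 2\mu X + \mu^2) = E(X^2) - 2\mu E(X) + \mu^2 = E(X^2) - 2\mu^2 + \mu^2 = E(X^2) - (E(X))^2$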

Example (continued): From above, $E(X) = 2.7$ and $E(X^2) = 8.1$.

$\text{Var}(X) = 8.1 - 2.7^2 = 8.1 - 7.29 = 0.81$

The standard deviation is $\sigma = \sqrt{\text{Var}(X)} = \sqrt{0.81} = 0.9$.
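As a check, both variance formulas can be evaluated for this distribution and compared. The variable names in this sketch are illustrative; floating-point results are rounded before printing.

```python
import math

# Distribution from the running example: value -> probability.
pmf = {1: 0.1, 2: 0.3, 3: 0.4, 4: 0.2}

mean = sum(x * p for x, p in pmf.items())

# Definition: E((X - mu)^2)
var_definition = sum((x - mean) ** 2 * p for x, p in pmf.items())

# Computational formula: E(X^2) - (E(X))^2
e_x2 = sum(x ** 2 * p for x, p in pmf.items())
var_shortcut = e_x2 - mean ** 2

std_dev = math.sqrt(var_shortcut)
print(round(var_definition, 10), round(var_shortcut, 10), round(std_dev, 10))
# 0.81 0.81 0.9
```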

Properties of Variance

Rule	Formula
Constant multiple	$\text{Var}(aX) = a^2 \text{Var}(X)$
Addition of a constant	$\text{Var}(X + b) = \text{Var}(X)$
Linear transformation	$\text{Var}(aX + b) = a^2 \text{Var}(X)$

Note that adding a constant shifts the distribution but does not change its spread, so the variance is unchanged.

Example: If $\text{Var}(X) = 0.81$, find $\text{Var}(3X + 2)$.

$\text{Var}(3X + 2) = 3^2 \times 0.81 = 9 \times 0.81 = 7.29$
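One way to convince yourself of this rule is to build the distribution of $Y = 3X + 2$ explicitly and compute its variance from scratch. The sketch below assumes $X$ has the distribution used in the earlier examples (where $\text{Var}(X) = 0.81$); the function name `variance` is a choice made for illustration.

```python
def variance(pmf):
    """Var(X) = E(X^2) - (E(X))^2 for a dict {value: probability}."""
    mean = sum(x * p for x, p in pmf.items())
    return sum(x ** 2 * p for x, p in pmf.items()) - mean ** 2

pmf_x = {1: 0.1, 2: 0.3, 3: 0.4, 4: 0.2}

# Distribution of Y = 3X + 2: same probabilities, transformed values.
pmf_y = {3 * x + 2: p for x, p in pmf_x.items()}

var_y_direct = variance(pmf_y)         # computed from Y's own distribution
var_y_rule = 3 ** 2 * variance(pmf_x)  # a^2 Var(X); the +2 has no effect
print(round(var_y_direct, 10), round(var_y_rule, 10))  # 7.29 7.29
```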

Exam Tip: Variance scales by the square of the constant, while expectation scales linearly; adding a constant changes the expectation but not the variance. Confusing these rules is a common source of errors in Further Statistics calculations.

Worked Example: Finding an Unknown Parameter

A random variable $X$ has the following distribution, where $k$ is a constant:

$x$	0	1	2	3
$P(X = x)$	$k$	$2k$	$3k$	$4k$

Step 1: Find $k$. The probabilities must sum to 1:

$k + 2k + 3k + 4k = 1 \implies 10k = 1 \implies k = 0.1$

Step 2: Find $E(X)$.

$E(X) = 0(0.1) + 1(0.2) + 2(0.3) + 3(0.4) = 0 + 0.2 + 0.6 + 1.2 = 2.0$

Step 3: Find $E(X^2)$.

$E(X^2) = 0(0.1) + 1(0.2) + 4(0.3) + 9(0.4) = 0 + 0.2 + 1.2 + 3.6 = 5.0$

Step 4: Find $\text{Var}(X)$.

$\text{Var}(X) = 5.0 - 2.0^2 = 5.0 - 4.0 = 1.0$

Step 5: Find $E(2X^2 - 3X + 1)$, using the linearity of expectation.

$E(2X^2 - 3X + 1) = 2E(X^2) - 3E(X) + 1 = 2(5.0) - 3(2.0) + 1 = 10 - 6 + 1 = 5$
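The five steps can be reproduced exactly using Python's `fractions.Fraction`, which avoids floating-point rounding. This is a sketch under the assumption that the distribution is stored as weights $1, 2, 3, 4$ multiplying $k$; the names `weights` and `expectation_of` are hypothetical.

```python
from fractions import Fraction

# P(X = x) is proportional to these weights: k, 2k, 3k, 4k.
weights = {0: 1, 1: 2, 2: 3, 3: 4}

# Step 1: the probabilities must sum to 1, so k = 1 / (sum of weights).
k = Fraction(1, sum(weights.values()))
pmf = {x: w * k for x, w in weights.items()}

def expectation_of(g, pmf):
    """E(g(X)) for a distribution stored as {value: probability}."""
    return sum(g(x) * p for x, p in pmf.items())

e_x = expectation_of(lambda x: x, pmf)        # Step 2
e_x2 = expectation_of(lambda x: x ** 2, pmf)  # Step 3
var_x = e_x2 - e_x ** 2                       # Step 4
# Step 5, computed directly from the distribution rather than by linearity.
e_poly = expectation_of(lambda x: 2 * x ** 2 - 3 * x + 1, pmf)

print(k, e_x, e_x2, var_x, e_poly)  # 1/10 2 5 1 5
```

Step 5 is computed here from the definition of $E(g(X))$, so it serves as an independent check on the linearity calculation.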

Mode and Median of a Discrete Distribution

The mode is the value of $x$ with the highest probability. A distribution can be multimodal if two or more values share the maximum probability.

The median is a value $m$ such that $P(X \leq m) \geq 0.5$ and $P(X \geq m) \geq 0.5$.

Example: For the distribution with $k = 0.1$ above, the mode is $x = 3$ (probability 0.4). The cumulative probabilities are: $P(X \leq 0) = 0.1$, $P(X \leq 1) = 0.3$, $P(X \leq 2) = 0.6$, $P(X \leq 3) = 1.0$. The cumulative probability first reaches 0.5 at $x = 2$, and $P(X \geq 2) = 0.7 \geq 0.5$, so the median is $m = 2$.
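Both definitions can be applied by direct search over the possible values. In this sketch the function names `modes` and `medians` are hypothetical, and `medians` only tests values that $X$ can actually take; exact fractions are used so that comparisons with 0.5 are reliable.

```python
from fractions import Fraction

# Distribution from the worked example with k = 1/10.
pmf = {x: Fraction(w, 10) for x, w in {0: 1, 1: 2, 2: 3, 3: 4}.items()}

def modes(pmf):
    """All values sharing the highest probability."""
    top = max(pmf.values())
    return [x for x, p in pmf.items() if p == top]

def medians(pmf):
    """Values m of X with P(X <= m) >= 1/2 and P(X >= m) >= 1/2."""
    half = Fraction(1, 2)
    result = []
    for m in sorted(pmf):
        at_most = sum(p for x, p in pmf.items() if x <= m)
        at_least = sum(p for x, p in pmf.items() if x >= m)
        if at_most >= half and at_least >= half:
            result.append(m)
    return result

print(modes(pmf), medians(pmf))  # [3] [2]
```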

The Cumulative Distribution Function (Discrete)

For a discrete random variable, the cumulative distribution function (CDF) is:

$F(x) = P(X \leq x) = \sum_{x_i \leq x} P(X = x_i)$

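The CDF is a non-decreasing step function that rises from 0 to 1, jumping at each value that $X$ can take. To recover the probability mass function (PMF) from the CDF:

$P(X = x) = F(x) - F(x^{-})$

where $F(x^{-})$ is the limit of $F$ from the left, that is, the value of the CDF just before $x$. When $X$ takes only integer values, this is $F(x - 1)$.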
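The relationship between the PMF and the CDF can be illustrated with running totals and differences. This sketch reuses the first distribution from this lesson, written with exact fractions; the variable names are choices made for illustration.

```python
from fractions import Fraction
from itertools import accumulate

pmf = {1: Fraction(1, 10), 2: Fraction(3, 10), 3: Fraction(4, 10), 4: Fraction(2, 10)}
values = sorted(pmf)

# F(x) at each possible value: running total of the probabilities.
cdf = dict(zip(values, accumulate(pmf[x] for x in values)))

# Recover P(X = x) as the jump F(x) - F(x^-); F(x^-) is 0 before the first value.
recovered = {}
previous = Fraction(0)
for x in values:
    recovered[x] = cdf[x] - previous
    previous = cdf[x]

print([str(cdf[x]) for x in values])  # ['1/10', '2/5', '4/5', '1']
print(recovered == pmf)               # True
```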

Covariance and Independence (Introduction)

For two discrete random variables $X$ and $Y$:

$E(X + Y) = E(X) + E(Y) \quad \text{(always)}$

If $X$ and $Y$ are independent:

$E(XY) = E(X)E(Y)$

$\text{Var}(X + Y) = \text{Var}(X) + \text{Var}(Y)$

In general, whether or not they are independent:

$\text{Var}(X + Y) = \text{Var}(X) + \text{Var}(Y) + 2\text{Cov}(X, Y)$

where $\text{Cov}(X, Y) = E(XY) - E(X)E(Y)$. For independent variables the covariance is zero, which recovers the simpler formula.
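
To see the covariance term at work, consider a small joint distribution in which $X$ and $Y$ are not independent. The joint probabilities below are invented purely for illustration, and the helper `expect` is a hypothetical name; the sketch computes $\text{Var}(X + Y)$ directly and compares it with the formula.

```python
from fractions import Fraction

# Hypothetical joint distribution P(X = x, Y = y), invented for illustration.
joint = {
    (0, 0): Fraction(2, 5), (0, 1): Fraction(1, 10),
    (1, 0): Fraction(1, 10), (1, 1): Fraction(2, 5),
}

def expect(g):
    """E(g(X, Y)) under the joint distribution."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

e_x, e_y = expect(lambda x, y: x), expect(lambda x, y: y)
var_x = expect(lambda x, y: x ** 2) - e_x ** 2
var_y = expect(lambda x, y: y ** 2) - e_y ** 2
cov_xy = expect(lambda x, y: x * y) - e_x * e_y

# Var(X + Y) computed directly from the distribution of the sum.
var_sum = expect(lambda x, y: (x + y) ** 2) - (e_x + e_y) ** 2

print(cov_xy, var_sum, var_x + var_y + 2 * cov_xy)  # 3/20 4/5 4/5
```

Here $\text{Var}(X) + \text{Var}(Y) = 1/2$, so adding the variances without the covariance term would give the wrong answer.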

Exam Tip: In exam questions, always check whether the variables are stated to be independent. If they are, you can add variances directly. If not, you need the covariance term.

Summary

$E(X) = \sum x \cdot P(X = x)$ — the weighted average of all possible values.
$E(g(X)) = \sum g(x) \cdot P(X = x)$ — apply the function first, then weight by probabilities.
$E(aX + b) = aE(X) + b$ — expectation is linear.
$\text{Var}(X) = E(X^2) - (E(X))^2$ — the computational formula for variance.
$\text{Var}(aX + b) = a^2 \text{Var}(X)$ — variance scales by the square of the constant.
For independent random variables: $\text{Var}(X + Y) = \text{Var}(X) + \text{Var}(Y)$.

Exam Tip: When finding $\text{Var}(X)$, always compute $E(X^2)$ and $E(X)$ in a clear table. Show your working for each step; this method is less error-prone than using the definition directly.
