This lesson extends your understanding of discrete random variables beyond what is covered in A-Level Mathematics. You will learn to compute expectations of functions of random variables, derive variance using the `E(X²)` method, and work with the algebra of expectations. These skills are essential for the rest of the Further Statistics module and underpin many of the distributions you will encounter.
A discrete random variable $X$ takes a countable number of values $x_1, x_2, \ldots$, each with an associated probability $P(X = x_i) = p_i$. The probability distribution must satisfy:
| Property | Requirement |
|---|---|
| Non-negativity | $p_i \ge 0$ for all $i$ |
| Normalisation | $\sum_i p_i = 1$ |
Example: A random variable X has the following distribution:
| x | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| P(X=x) | 0.1 | 0.3 | 0.4 | 0.2 |
Check: 0.1+0.3+0.4+0.2=1. Valid distribution.
The expected value (or mean) of a discrete random variable is:
$$E(X) = \mu = \sum_x x \cdot P(X = x)$$
This gives the long-run average value of X if the experiment were repeated many times.
Example: Using the distribution above:
E(X)=1(0.1)+2(0.3)+3(0.4)+4(0.2)=0.1+0.6+1.2+0.8=2.7
Exam Tip: The expected value does not have to be a value that X can actually take. It is a weighted average, not a mode or a median.
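The validity check and the expected value calculation can be reproduced in a few lines of Python. This is a minimal sketch: the names `pmf`, `is_valid_pmf` and `expectation` are illustrative choices, not part of any library, and the distribution is stored as a dictionary mapping each value to its probability.

```python
import math

# Distribution from the example above: each value x mapped to P(X = x).
pmf = {1: 0.1, 2: 0.3, 3: 0.4, 4: 0.2}

def is_valid_pmf(pmf):
    """Check non-negativity and that the probabilities sum to 1 (up to rounding)."""
    total = sum(pmf.values())
    return all(p >= 0 for p in pmf.values()) and math.isclose(total, 1.0)

def expectation(pmf):
    """E(X) = sum of x * P(X = x) over every value x."""
    return sum(x * p for x, p in pmf.items())

mean = expectation(pmf)
print(is_valid_pmf(pmf))  # validity check on the probabilities
print(round(mean, 4))     # weighted average of the values
```

The mean comes out at 2.7, which is not one of the values 1, 2, 3, 4 that X can take, in line with the exam tip above.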
If g(X) is any function of the random variable X, then:
$$E(g(X)) = \sum_x g(x) \cdot P(X = x)$$
This is one of the most important results in Further Statistics. You do not need to find the distribution of g(X) separately — you apply g to each value and weight by the corresponding probability.
Example: Find $E(X^2)$ for the distribution above.
$$E(X^2) = 1^2(0.1) + 2^2(0.3) + 3^2(0.4) + 4^2(0.2) = 0.1 + 1.2 + 3.6 + 3.2 = 8.1$$
Example: Find E(3X+2).
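$$\begin{aligned}
E(3X+2) &= \sum_x (3x+2) \cdot P(X = x) \\
&= (3 \cdot 1 + 2)(0.1) + (3 \cdot 2 + 2)(0.3) + (3 \cdot 3 + 2)(0.4) + (3 \cdot 4 + 2)(0.2) \\
&= 5(0.1) + 8(0.3) + 11(0.4) + 14(0.2) \\
&= 0.5 + 2.4 + 4.4 + 2.8 = 10.1
\end{aligned}$$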
Note that 3E(X)+2=3(2.7)+2=10.1, confirming the linearity of expectation.
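The rule for the expectation of a function translates directly into code: apply g to each value and weight by the probability. The sketch below uses a hypothetical helper `expectation_of` (a name chosen here for illustration) to reproduce both examples and the linearity check.

```python
import math

# Distribution from the example above: each value x mapped to P(X = x).
pmf = {1: 0.1, 2: 0.3, 3: 0.4, 4: 0.2}

def expectation_of(g, pmf):
    """E(g(X)) = sum of g(x) * P(X = x); no need for the distribution of g(X)."""
    return sum(g(x) * p for x, p in pmf.items())

e_x = expectation_of(lambda x: x, pmf)            # E(X)
e_x2 = expectation_of(lambda x: x ** 2, pmf)      # E(X^2)
e_lin = expectation_of(lambda x: 3 * x + 2, pmf)  # E(3X + 2), computed directly

# Linearity: the direct value should agree with 3 E(X) + 2 up to rounding.
print(math.isclose(e_lin, 3 * e_x + 2))
```

Floating-point arithmetic introduces tiny rounding errors, so the comparison uses `math.isclose` rather than `==`.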
The expectation operator is linear:
| Rule | Formula |
|---|---|
| Constant multiple | E(aX)=aE(X) |
| Addition of a constant | E(X+b)=E(X)+b |
| Linear combination | E(aX+b)=aE(X)+b |
| Sum of functions | E(g(X)+h(X))=E(g(X))+E(h(X)) |
However, in general:
$$E(g(X)) \ne g(E(X))$$
For example, $E(X^2) \ne (E(X))^2$ in general. The difference between these two quantities is fundamental to the definition of variance.
Exam Tip: A very common error is to assume $E(X^2) = (E(X))^2$. This is almost never true. The correct relationship is $\operatorname{Var}(X) = E(X^2) - (E(X))^2$, which shows that $E(X^2) \ge (E(X))^2$, with equality only when $X$ is constant.
The variance measures the spread of a distribution:
$$\operatorname{Var}(X) = E\big((X - \mu)^2\big) = \sum_x (x - \mu)^2 P(X = x)$$
The computationally easier formula is:
$$\operatorname{Var}(X) = E(X^2) - (E(X))^2$$
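The shortcut follows from expanding the square and applying linearity of expectation. A brief derivation, writing $\mu = E(X)$ and treating $\mu$ as a constant:

$$\begin{aligned}
\operatorname{Var}(X) &= E\big((X - \mu)^2\big) \\
&= E(X^2 - 2\mu X + \mu^2) \\
&= E(X^2) - 2\mu E(X) + \mu^2 \\
&= E(X^2) - 2\mu^2 + \mu^2 \\
&= E(X^2) - (E(X))^2
\end{aligned}$$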
Example (continued): From above, $E(X) = 2.7$ and $E(X^2) = 8.1$.
$$\operatorname{Var}(X) = 8.1 - 2.7^2 = 8.1 - 7.29 = 0.81$$
The standard deviation is $\sigma = \sqrt{\operatorname{Var}(X)} = \sqrt{0.81} = 0.9$.
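As a check, the variance can be computed both from the definition and from the shortcut formula. The sketch below assumes the same dictionary representation of the distribution; `expect` is an illustrative helper name.

```python
import math

# Distribution from the example above: each value x mapped to P(X = x).
pmf = {1: 0.1, 2: 0.3, 3: 0.4, 4: 0.2}

def expect(g, pmf):
    """E(g(X)) = sum of g(x) * P(X = x)."""
    return sum(g(x) * p for x, p in pmf.items())

mu = expect(lambda x: x, pmf)

# Definition: Var(X) = E((X - mu)^2)
var_definition = expect(lambda x: (x - mu) ** 2, pmf)

# Shortcut: Var(X) = E(X^2) - (E(X))^2
var_shortcut = expect(lambda x: x ** 2, pmf) - mu ** 2

sigma = math.sqrt(var_shortcut)
print(math.isclose(var_definition, var_shortcut))  # both routes agree up to rounding
```

Both routes give 0.81 up to floating-point rounding, and the standard deviation is 0.9.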
| Rule | Formula |
|---|---|
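| Constant multiple | $\operatorname{Var}(aX) = a^2 \operatorname{Var}(X)$ |
| Addition of a constant | $\operatorname{Var}(X + b) = \operatorname{Var}(X)$ |
| Linear transformation | $\operatorname{Var}(aX + b) = a^2 \operatorname{Var}(X)$ |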
Note that adding a constant shifts the distribution but does not change its spread, so the variance is unchanged.
Example: If Var(X)=0.81, find Var(3X+2).
$$\operatorname{Var}(3X + 2) = 3^2 \times 0.81 = 9 \times 0.81 = 7.29$$
Exam Tip: Remember that variance scales by the square of the constant, while expectation scales linearly. Confusing the two is a common source of errors in Further Statistics calculations.
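The scaling rule can be verified by building the distribution of Y = 3X + 2 explicitly and computing its variance from scratch. This sketch uses exact fractions to avoid rounding; `variance`, `pmf_x` and `pmf_y` are names chosen here for illustration, and the construction of `pmf_y` assumes a is non-zero so that distinct x values give distinct y values.

```python
from fractions import Fraction

# Distribution of X from the earlier example, as exact fractions.
pmf_x = {1: Fraction(1, 10), 2: Fraction(3, 10), 3: Fraction(4, 10), 4: Fraction(2, 10)}

def variance(pmf):
    """Var = E(X^2) - (E(X))^2 for a distribution given as {value: probability}."""
    mean = sum(x * p for x, p in pmf.items())
    return sum(x * x * p for x, p in pmf.items()) - mean ** 2

a, b = 3, 2
# Y = aX + b: each value moves to a*x + b and keeps its probability (needs a != 0).
pmf_y = {a * x + b: p for x, p in pmf_x.items()}

var_x = variance(pmf_x)
var_y = variance(pmf_y)
print(var_x, var_y, var_y == a ** 2 * var_x)
```

The added constant b has no effect on the result, while the multiplier a enters as a squared factor.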
A random variable X has the following distribution, where k is a constant:
| x | 0 | 1 | 2 | 3 |
|---|---|---|---|---|
| P(X=x) | k | 2k | 3k | 4k |
Step 1: Find k.
k+2k+3k+4k=1⟹10k=1⟹k=0.1
Step 2: Find E(X).
E(X)=0(0.1)+1(0.2)+2(0.3)+3(0.4)=0+0.2+0.6+1.2=2.0
Step 3: Find $E(X^2)$.
$$E(X^2) = 0(0.1) + 1(0.2) + 4(0.3) + 9(0.4) = 0 + 0.2 + 1.2 + 3.6 = 5.0$$
Step 4: Find Var(X).
$$\operatorname{Var}(X) = 5.0 - 2.0^2 = 5.0 - 4.0 = 1.0$$
Step 5: Find $E(2X^2 - 3X + 1)$.
$$E(2X^2 - 3X + 1) = 2E(X^2) - 3E(X) + 1 = 2(5.0) - 3(2.0) + 1 = 10 - 6 + 1 = 5$$
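The five steps of this worked example can be sketched in Python using exact fractions. The variable names are illustrative; the weights 1, 2, 3, 4 are the multiples of k in the table, and k is found by forcing the probabilities to sum to 1.

```python
from fractions import Fraction

values = [0, 1, 2, 3]
weights = [1, 2, 3, 4]  # P(X = x) = weight * k, as in the table

# Step 1: the probabilities must sum to 1, so k = 1 / (sum of weights).
k = Fraction(1, sum(weights))
pmf = {x: w * k for x, w in zip(values, weights)}

def expect(g):
    """E(g(X)) for the distribution stored in pmf."""
    return sum(g(x) * p for x, p in pmf.items())

e_x = expect(lambda x: x)       # Step 2
e_x2 = expect(lambda x: x ** 2) # Step 3
var_x = e_x2 - e_x ** 2         # Step 4

# Step 5, two ways: directly from the definition, and via linearity.
e_poly_direct = expect(lambda x: 2 * x ** 2 - 3 * x + 1)
e_poly_linear = 2 * e_x2 - 3 * e_x + 1

print(k, e_x, e_x2, var_x, e_poly_direct, e_poly_linear)
```

Computing Step 5 both ways is a useful self-check: the direct sum and the linearity shortcut should always agree.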
The mode is the value of x with the highest probability. A distribution can be multimodal if two or more values share the maximum probability.
The median is the value m such that P(X≤m)≥0.5 and P(X≥m)≥0.5.
Example: For the distribution with k=0.1 above, the mode is x=3 (probability 0.4). The cumulative probabilities are: P(X≤0)=0.1, P(X≤1)=0.3, P(X≤2)=0.6, P(X≤3)=1.0. So the median is m=2.
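The mode and median definitions above can be applied mechanically. In this sketch both functions return a list, since either quantity may fail to be unique; `modes` and `medians` are hypothetical helper names, and exact fractions are used so that comparisons with 0.5 are not affected by rounding.

```python
from fractions import Fraction

# Distribution with k = 1/10 from the worked example.
pmf = {0: Fraction(1, 10), 1: Fraction(2, 10), 2: Fraction(3, 10), 3: Fraction(4, 10)}

def modes(pmf):
    """All values sharing the maximum probability."""
    top = max(pmf.values())
    return [x for x in sorted(pmf) if pmf[x] == top]

def medians(pmf):
    """All values m with P(X <= m) >= 1/2 and P(X >= m) >= 1/2."""
    half = Fraction(1, 2)
    result = []
    for m in sorted(pmf):
        at_most = sum(p for x, p in pmf.items() if x <= m)
        at_least = sum(p for x, p in pmf.items() if x >= m)
        if at_most >= half and at_least >= half:
            result.append(m)
    return result

print(modes(pmf), medians(pmf))
```

For this distribution the lists contain the single values 3 and 2 respectively, matching the example.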
For a discrete random variable, the cumulative distribution function (CDF) is:
$$F(x) = P(X \le x) = \sum_{x_i \le x} P(X = x_i)$$
The CDF is a step function that increases from 0 to 1. To recover the probability mass function (PMF) from the CDF:
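$$P(X = x) = F(x) - F(x^-)$$
where $F(x^-)$ is the limit of $F$ from the left, that is, the value of the CDF just below $x$. For an integer-valued random variable this is simply $F(x - 1)$.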
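A short sketch of the CDF as a step function, and of recovering the PMF from its jumps, using the distribution with k = 1/10 from the worked example. The function name `F` mirrors the notation above; the other names are illustrative.

```python
from fractions import Fraction
from itertools import accumulate

values = [0, 1, 2, 3]
probs = [Fraction(1, 10), Fraction(2, 10), Fraction(3, 10), Fraction(4, 10)]

# Cumulative probabilities at each value of X (running totals).
cdf = dict(zip(values, accumulate(probs)))

def F(x):
    """F(x) = P(X <= x) for any real x; constant between the values of X."""
    return sum((p for v, p in zip(values, probs) if v <= x), Fraction(0))

# Recover the PMF: each probability is the size of the jump in F at that value.
recovered = {}
previous = Fraction(0)  # F is 0 to the left of the smallest value
for v in values:
    recovered[v] = cdf[v] - previous
    previous = cdf[v]

print(F(2), F(2.5), recovered == dict(zip(values, probs)))
```

Note that `F(2.5)` equals `F(2)`: the CDF only changes at the values X can actually take.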
For two discrete random variables X and Y:
$$E(X + Y) = E(X) + E(Y) \quad \text{(always)}$$
If X and Y are independent:
$$E(XY) = E(X)E(Y)$$
$$\operatorname{Var}(X + Y) = \operatorname{Var}(X) + \operatorname{Var}(Y)$$
If they are not independent, then:
Var(X+Y)=Var(X)+Var(Y)+2Cov(X,Y)
where Cov(X,Y)=E(XY)−E(X)E(Y).
Exam Tip: In exam questions, always check whether the variables are stated to be independent. If they are, you can add variances directly. If not, you need the covariance term.
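To see the covariance term at work, the sketch below uses a small joint distribution that is invented purely for illustration (it does not come from the lesson). X and Y each take the values 0 and 1 and tend to agree, so they are not independent.

```python
from fractions import Fraction

# Hypothetical joint distribution, made up for this example:
# (x, y) -> P(X = x, Y = y). The four probabilities sum to 1.
joint = {
    (0, 0): Fraction(2, 5), (0, 1): Fraction(1, 10),
    (1, 0): Fraction(1, 10), (1, 1): Fraction(2, 5),
}

def expect(g):
    """E(g(X, Y)) = sum of g(x, y) * P(X = x, Y = y)."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

e_x = expect(lambda x, y: x)
e_y = expect(lambda x, y: y)
var_x = expect(lambda x, y: x * x) - e_x ** 2
var_y = expect(lambda x, y: y * y) - e_y ** 2
cov_xy = expect(lambda x, y: x * y) - e_x * e_y

# Var(X + Y) computed directly, then via the covariance formula.
var_sum_direct = expect(lambda x, y: (x + y) ** 2) - expect(lambda x, y: x + y) ** 2
var_sum_formula = var_x + var_y + 2 * cov_xy

print(cov_xy, var_sum_direct, var_sum_formula, var_x + var_y)
```

Here the covariance is non-zero, so simply adding the two variances would give the wrong answer for Var(X + Y), while the formula with the covariance term matches the direct calculation.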
Exam Tip: When finding $\operatorname{Var}(X)$, always compute $E(X^2)$ and $E(X)$ in a clear table. Show your working for each step; this method is less error-prone than using the definition directly.