Descriptive statistics condense large datasets into simple summaries. They help you understand the shape, centre, and spread of your data before moving on to more complex analyses.
Central tendency describes where the "middle" of a dataset lies.
The mean is the sum of all values divided by the number of values:
x̄ = (x₁ + x₂ + ... + xₙ) / n
Example: For the data set {4, 8, 6, 5, 3, 7, 9}
x̄ = (4 + 8 + 6 + 5 + 3 + 7 + 9) / 7 = 42 / 7 = 6
Note: The mean is sensitive to outliers. A single extreme value can pull the mean far from the typical value.
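The formula and the outlier note above can be checked with a minimal Python sketch. The helper name `arithmetic_mean` and the extra value 100 are assumptions introduced here for illustration; they are not part of the lesson's data.

```python
def arithmetic_mean(values):
    """Sum of all values divided by the number of values."""
    return sum(values) / len(values)

data = [4, 8, 6, 5, 3, 7, 9]
print(arithmetic_mean(data))          # 6.0

# Assumption for illustration: append one extreme value (100) to the same data.
with_outlier = data + [100]
print(arithmetic_mean(with_outlier))  # 17.75
```

One added value moves the mean from 6 to 17.75, well above seven of the eight observations.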
The median is the middle value when the data is sorted in ascending order. With an even number of values, it is the average of the two middle values.
Example: Sorted data {3, 4, 5, 6, 7, 8, 9} → Median = 6
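As a rough illustration, the same example can be reproduced with `median` from Python's standard `statistics` module. The extra value 100 is again a hypothetical outlier, added only to show how little the median moves.

```python
from statistics import median

data = [4, 8, 6, 5, 3, 7, 9]
print(sorted(data))          # [3, 4, 5, 6, 7, 8, 9]
print(median(data))          # 6

# Hypothetical outlier: eight values, so the two middle values (6 and 7) are averaged.
with_outlier = data + [100]
print(median(with_outlier))  # 6.5
```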
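The mode is the value that appears most frequently. A dataset can have one mode (unimodal), two modes (bimodal), several modes (multimodal), or no mode at all when no value repeats.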
| Measure | Best For | Limitation |
|---|---|---|
| Mean | Symmetric data without outliers | Sensitive to extreme values |
| Median | Skewed data or data with outliers | Ignores the magnitude of values |
| Mode | Categorical data or identifying peaks | May not be unique or may not exist |
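The mode row of the table can be sketched with `multimode` from Python's `statistics` module (available from Python 3.8). The small datasets below are made up for illustration. Note one convention of this function: when no value repeats, it returns every value, whereas many textbooks would say such a dataset has no mode.

```python
from statistics import multimode

one_mode = [2, 3, 3, 5, 7]                  # hypothetical data
two_modes = [1, 1, 4, 6, 6]                 # hypothetical data
colours = ["red", "blue", "red", "green"]   # categorical data also works

print(multimode(one_mode))   # [3]
print(multimode(two_modes))  # [1, 6]
print(multimode(colours))    # ['red']

# No value repeats, so all values tie and all are returned.
no_repeats = [4, 8, 6, 5, 3, 7, 9]
print(multimode(no_repeats) == no_repeats)  # True
```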
Knowing the centre alone is not enough. You also need to know how spread out the data is.
Range = Maximum − Minimum
The range is simple to compute but highly affected by outliers, because it depends only on the two most extreme values.
The IQR measures the spread of the middle 50% of the data:
IQR = Q₃ − Q₁
Where Q₁ is the first quartile (25th percentile) and Q₃ is the third quartile (75th percentile).
The IQR is robust against outliers.
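To compare the two measures, here is a small sketch using `quantiles` from Python's `statistics` module. The helper `spread` and the outlier 100 are assumptions made for this example. Quartile conventions differ between textbooks and libraries, so other tools may report slightly different Q₁ and Q₃ values for the same data.

```python
from statistics import quantiles

def spread(values):
    """Return (range, IQR) using the default quartile method of statistics.quantiles."""
    q1, _q2, q3 = quantiles(values, n=4)
    return max(values) - min(values), q3 - q1

data = [4, 8, 6, 5, 3, 7, 9]
print(spread(data))          # (6, 4.0)

# Hypothetical outlier: the range jumps, the IQR barely changes.
print(spread(data + [100]))  # (97, 4.5)
```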
Variance measures the average squared deviation from the mean:
Population variance: σ² = Σ(xᵢ − μ)² / N
Sample variance: s² = Σ(xᵢ − x̄)² / (n − 1)
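Why n − 1 for samples? The deviations are measured from the sample mean x̄, which is itself computed from the data and so sits closer to the sample values than the true mean μ does. Dividing by n would therefore underestimate the population variance on average; dividing by (n − 1) removes this bias. This is known as Bessel's correction.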
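Both formulas can be worked through on the earlier data set {4, 8, 6, 5, 3, 7, 9}. This is a sketch only: the variable names are chosen here for illustration, and `pvariance` and `variance` from Python's `statistics` module serve as a cross-check.

```python
from statistics import pvariance, variance

data = [4, 8, 6, 5, 3, 7, 9]
n = len(data)
mean = sum(data) / n                                 # 6.0

# Squared deviations from the mean: 4, 4, 0, 1, 9, 1, 9
sum_of_squares = sum((x - mean) ** 2 for x in data)  # 28.0

population_var = sum_of_squares / n                  # 28 / 7 = 4.0
sample_var = sum_of_squares / (n - 1)                # 28 / 6, roughly 4.67

print(population_var, pvariance(data))               # both equal 4
print(sample_var, variance(data))                    # both roughly 4.67
```

The sample variance is always the larger of the two, and the gap shrinks as n grows.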
Standard deviation is the square root of the variance. It is in the same units as the original data, making it easier to interpret:
Population standard deviation: σ = √(σ²)
Sample standard deviation: s = √(s²)
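As a short sketch, the standard deviation of the earlier data set {4, 8, 6, 5, 3, 7, 9} can be computed with `pstdev` (population) and `stdev` (sample) from Python's `statistics` module. The centimetre units in the comments are an assumption added to illustrate the point about units.

```python
from statistics import pstdev, stdev

data = [4, 8, 6, 5, 3, 7, 9]  # assume, for illustration, lengths in cm

# The variances are 4 cm² (population) and 28/6 cm² (sample);
# taking the square root brings the result back to cm.
population_sd = pstdev(data)  # sqrt(4) = 2.0
sample_sd = stdev(data)       # sqrt(28 / 6), roughly 2.16

print(population_sd, sample_sd)
```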