You are viewing a free preview of this lesson.
Subscribe to unlock all 10 lessons in this course and every other course on LearningBro.
Merge sort and quick sort are efficient comparison-based sorting algorithms with average-case time complexity of O(nlogn), making them vastly superior to the O(n2) simple sorts for large datasets. Both rest on the divide-and-conquer paradigm: break the problem into smaller sub-problems, solve those, and combine the results. Merge sort is the second sort named explicitly in the AQA specification (alongside bubble sort), so you must be able to trace it precisely; quick sort is studied for comparison and to illustrate why a faster average case can still hide a dangerous worst case.
This lesson addresses AQA A-Level Computer Science (7517), Section 4.3 Fundamentals of algorithms, specifically 4.3.5 Sorting algorithms, where merge sort is the named O(nlogn) sort. It also draws on 4.3.6 (recursion / divide-and-conquer) for the recursive structure, and on 4.4.4 (classification of algorithms) for the complexity analysis, including the distinction between average-case and worst-case orders that separates the two algorithms.
Both algorithms follow the three-stage divide-and-conquer template:
The crucial difference is where the work happens. Merge sort does trivial work to divide (split in the middle) and the real work to combine (the merge). Quick sort does the real work to divide (the partition) and trivial work to combine (nothing — the parts are already in order). This single design choice explains almost every difference in their behaviour.
Divide and conquer is powerful because it converts a problem of size n into several problems of size n/2, and the halving is what introduces the logarithmic factor that beats the quadratic simple sorts. Recall from the simple-sorts lesson that O(n2) arises from comparing essentially every pair of elements. Divide and conquer's trick is to avoid most of those pairwise comparisons: once the list is split, an element in the left half is never directly compared with most elements in the right half — they are only reconciled in bulk during the merge or implied by the partition. By never doing the wasteful all-pairs work, both algorithms break through the O(n2) ceiling to reach O(nlogn). The same divide-and-conquer pattern recurs across the course — in binary search, in the recursive structure of many tree algorithms, and as a general problem-solving strategy in 4.3.6 — so it is worth recognising as a reusable idea, not just a sorting trick.
Merge sort uses divide and conquer as follows:
A single-element list is, by definition, already sorted — this is the base case that stops the recursion. The genius of merge sort is that combining two already-sorted lists is easy: you never have to search, only to compare the two front elements and take the smaller. So the algorithm pushes all the difficulty down to the trivial base case and recovers a sorted whole purely through repeated easy merges. Read bottom-up, it first sorts pairs, then merges pairs into fours, fours into eights, and so on until the entire list is reassembled in order — which is exactly the level-by-level structure that yields the logn factor.
FUNCTION mergeSort(arr)
IF LENGTH(arr) <= 1 THEN
RETURN arr
END IF
mid = LENGTH(arr) DIV 2
left = mergeSort(arr[0..mid-1])
right = mergeSort(arr[mid..end])
RETURN merge(left, right)
END FUNCTION
FUNCTION merge(left, right)
result = []
i = 0
j = 0
WHILE i < LENGTH(left) AND j < LENGTH(right)
IF left[i] <= right[j] THEN
APPEND left[i] TO result
i = i + 1
ELSE
APPEND right[j] TO result
j = j + 1
END IF
END WHILE
APPEND remaining elements of left to result
APPEND remaining elements of right to result
RETURN result
END FUNCTION
The merge step is the heart of the algorithm. Because both inputs are already sorted, it walks two pointers along them, repeatedly taking the smaller front element — a single linear pass that is O(n) in the combined length.
The same algorithm in Python makes the recursion and the two-pointer merge explicit:
def merge_sort(arr):
if len(arr) <= 1:
return arr # base case: already sorted
mid = len(arr) // 2
left = merge_sort(arr[:mid]) # recurse on left half
right = merge_sort(arr[mid:]) # recurse on right half
return merge(left, right)
def merge(left, right):
result, i, j = [], 0, 0
while i < len(left) and j < len(right):
if left[i] <= right[j]: # <= keeps the sort stable
result.append(left[i]); i += 1
else:
result.append(right[j]); j += 1
result.extend(left[i:]) # append whichever side remains
result.extend(right[j:])
return result
The use of <= rather than < in the comparison is what makes merge sort stable: when two elements are equal, the one from the left (earlier) sub-list is taken first, preserving the original relative order. Switching it to < would break stability without affecting correctness — a subtle but examinable detail.
Sorting [38, 27, 43, 3, 9, 82, 10]:
graph TD
N0["38, 27, 43, 3, 9, 82, 10"] --> N1["38, 27, 43"]
N0 --> N2["3, 9, 82, 10"]
N1 --> N3["38"]
N1 --> N4["27, 43"]
N2 --> N5["3, 9"]
N2 --> N6["82, 10"]
N4 --> N7["27"]
N4 --> N8["43"]
N5 --> N9["3"]
N5 --> N10["9"]
N6 --> N11["82"]
N6 --> N12["10"]
Sub-lists are merged back up the tree, each merge producing a sorted list:
| Merge inputs | Result |
|---|---|
| [27] and [43] | [27, 43] |
| [38] and [27, 43] | [27, 38, 43] |
| [3] and [9] | [3, 9] |
| [82] and [10] | [10, 82] |
| [3, 9] and [10, 82] | [3, 9, 10, 82] |
| [27, 38, 43] and [3, 9, 10, 82] | [3, 9, 10, 27, 38, 43, 82] |
To make the merge mechanism explicit, here is the final merge of [27, 38, 43] with [3, 9, 10, 82], tracing the two pointers i (left) and j (right):
| Compare | Smaller taken | result so far |
|---|---|---|
| 27 vs 3 | 3 (right) | [3] |
| 27 vs 9 | 9 (right) | [3, 9] |
| 27 vs 10 | 10 (right) | [3, 9, 10] |
| 27 vs 82 | 27 (left) | [3, 9, 10, 27] |
| 38 vs 82 | 38 (left) | [3, 9, 10, 27, 38] |
| 43 vs 82 | 43 (left) | [3, 9, 10, 27, 38, 43] |
| left exhausted | append 82 | [3, 9, 10, 27, 38, 43, 82] |
| Case | Time | Space |
|---|---|---|
| Best | O(nlogn) | O(n) |
| Worst | O(nlogn) | O(n) |
| Average | O(nlogn) | O(n) |
Why O(nlogn)? The splitting halves the list log2n times, so the recursion tree has log2n levels. At every level, the merges across all sub-lists touch each of the n elements exactly once, costing O(n) per level. Multiplying levels by per-level cost gives O(n)×O(logn)=O(nlogn). Because this reasoning does not depend on the input order, merge sort's best, worst and average cases are all O(nlogn) — it is, in fact, Θ(nlogn).
Why O(n) space? The merge step builds new arrays rather than sorting in place; the total extra storage in use at any one time is proportional to n.
The complexity is easiest to see by thinking about the recursion as a tree of levels:
graph TD
L0["Level 0: 1 list of size n — merge cost n"] --> L1["Level 1: 2 lists of size n/2 — total merge cost n"]
L1 --> L2["Level 2: 4 lists of size n/4 — total merge cost n"]
L2 --> L3["... continues for log n levels ..."]
L3 --> L4["Bottom: n lists of size 1 — base case"]
There are two independent facts to combine. Vertically, the list halves at each level, so the tree has 1+log2n levels — this is where the logn factor comes from. Horizontally, at every single level the merges collectively touch all n elements exactly once, so each level costs O(n) regardless of how many sub-lists it is split across (twice as many lists, each half the size, still sums to n). Multiplying the per-level cost by the number of levels gives the total: n×log2n=O(nlogn). Because this argument never assumed anything about the input order, merge sort's best, worst and average cases are identical — a guarantee no other sort on the specification offers.
Quick sort also uses divide and conquer, but instead of splitting blindly in the middle it chooses a pivot and partitions the list so that:
After partitioning, the pivot is in its final sorted position. The algorithm then recursively sorts the sub-lists either side of it. No combine step is needed — once both sides are sorted, the whole list is sorted.
The single most important fact about a partition is that, once complete, the pivot will never move again: everything smaller is to its left and everything larger to its right, so its position is correct no matter how the two sides are later rearranged. Each partition therefore permanently places exactly one element. When the pivots happen to fall near the median each time, the list is roughly halved at every level and the recursion is about log2n deep — the good case. When the pivots are consistently extreme (the smallest or largest element), the split is maximally lopsided and the recursion degenerates to n levels — the bad case traced below. Everything about quick sort's performance flows from how balanced the partitions turn out to be, which is why pivot choice is so consequential.
FUNCTION quickSort(arr, low, high)
IF low < high THEN
pivotIndex = partition(arr, low, high)
quickSort(arr, low, pivotIndex - 1)
quickSort(arr, pivotIndex + 1, high)
END IF
END FUNCTION
FUNCTION partition(arr, low, high)
pivot = arr[high]
i = low - 1
FOR j = low TO high - 1
IF arr[j] <= pivot THEN
i = i + 1
SWAP arr[i], arr[j]
END IF
END FOR
SWAP arr[i + 1], arr[high]
RETURN i + 1
END FUNCTION
Sorting [10, 7, 8, 9, 1, 5] with the last element as pivot. The first partition (pivot = 5) is shown in full so the pointer mechanism is visible. i tracks the boundary of the "≤ pivot" region:
| j | arr[j] | arr[j] ≤ 5? | Action | Array (boundary i) |
|---|---|---|---|---|
| 0 | 10 | No | — | [10, 7, 8, 9, 1, 5], i = -1 |
| 1 | 7 | No | — | [10, 7, 8, 9, 1, 5], i = -1 |
| 2 | 8 | No | — | [10, 7, 8, 9, 1, 5], i = -1 |
| 3 | 9 | No | — | [10, 7, 8, 9, 1, 5], i = -1 |
| 4 | 1 | Yes | i=0; swap arr[0]↔arr[4] | [1, 7, 8, 9, 10, 5], i = 0 |
| end | — | — | swap arr[1]↔arr[5] (pivot) | [1, 5, 8, 9, 10, 7] |
The pivot 5 lands at index 1 — its final position. Everything to its left (just [1]) is smaller; everything to its right is larger. Recursion then continues on the two sides:
| Sub-list | Pivot | Result of partition |
|---|---|---|
| [8, 9, 10, 7] (indices 2–5) | 7 | 7 placed → [7] [8, 9, 10] |
| [8, 9, 10] | 10 | 10 placed → [8, 9] [10] |
| [8, 9] | 9 | 9 placed → [8] [9] |
Final sorted array: [1, 5, 7, 8, 9, 10].
| Case | Time | Space |
|---|---|---|
| Best | O(nlogn) | O(logn) |
| Worst | O(n2) | O(n) |
| Average | O(nlogn) | O(logn) |
Why O(n2) in the worst case? If the pivot is always the smallest or largest element — which happens with already-sorted (or reverse-sorted) data when the last element is chosen as pivot — then one partition is empty and the other holds all the remaining n−1 elements. The recursion depth becomes n rather than logn, and each level still does an O(n) partition, giving n×n=O(n2). Randomising the pivot, or using a median-of-three rule, makes this worst case vanishingly unlikely in practice.
Why O(logn) space? Quick sort partitions in place (no new arrays), so the only extra memory is the recursion stack, whose depth is O(logn) on average — though it degrades to O(n) in the unbalanced worst case.
Subscribe to continue reading
Get full access to this lesson and all 10 lessons in this course.