You are viewing a free preview of this lesson.
Subscribe to unlock all 10 lessons in this course and every other course on LearningBro.
The two sorts in the previous lesson are O(n2) — fine for tiny lists but hopeless at scale. This lesson covers the two divide-and-conquer sorts that the OCR H446 specification expects you to know in detail: merge sort and quick sort. Both achieve O(nlogn) time on typical data, which is the practical sweet spot for general-purpose sorting and the reason real-world libraries are built around them. We trace each carefully — merge sort with a divide/merge tree, quick sort with a partition walk — derive their complexities (including quick sort's notorious O(n2) worst case), and weigh their differences in stability, memory and consistency. Recursion underpins both, so this lesson also consolidates the divide-and-conquer thinking you will meet again in the recursion lesson. By the end you should be able to trace either algorithm on a small list, state and justify every entry in its complexity table, and argue convincingly for one or the other given a scenario's constraints — the three things the exam asks of this topic.
This lesson supports the sorting algorithms and divide-and-conquer content of OCR H446 section 2.3, drawing on recursion (2.3) and data structures (1.4). Paraphrased, the assessable points are:
Paraphrased throughout; verify against the current OCR H446 specification.
Both algorithms use the divide-and-conquer strategy, a general problem-solving template:
Binary search (searching lesson) is the simplest divide-and-conquer algorithm — it divides but barely combines. Merge sort and quick sort show the full pattern, and the difference between them is mostly about where the work happens: merge sort divides trivially (just cut in half) and does its real work in the combine (merge) step; quick sort does its real work in the divide (partition) step and combines trivially (the pivot is already in place). Keeping that single contrast in mind — "merge sort works on the way up, quick sort works on the way down" — makes both algorithms much easier to remember and to explain in the exam.
Divide-and-conquer is powerful for three reasons. It often turns a quadratic problem into an O(nlogn) one by replacing a single large task with a logarithmic number of levels of linear work. It is naturally expressed with recursion, since each sub-problem is a smaller instance of the original. And because the sub-problems are independent, it parallelises readily across multiple processor cores. You will see the same template again in the recursion lesson and, less obviously, in algorithms such as binary search and certain graph and numerical methods.
flowchart LR
P[Problem of size n] --> D{Small enough?}
D -- Yes --> B[Solve directly: base case]
D -- No --> S[Divide into sub-problems]
S --> R[Conquer each recursively]
R --> C[Combine sub-solutions]
Merge sort divides the list in half repeatedly until each piece holds a single element (which is trivially sorted), then merges pairs of sorted pieces back together. All the real work is in the merge.
# OCR-style pseudocode
function mergeSort(data)
if data.length <= 1 then
return data // base case
endif
mid = data.length DIV 2
left = mergeSort(data[0..mid-1])
right = mergeSort(data[mid..data.length-1])
return merge(left, right)
endfunction
function merge(left, right)
result = []
i = 0
j = 0
while i < left.length AND j < right.length // compare fronts
if left[i] <= right[j] then // <= keeps it stable
result.append(left[i])
i = i + 1
else
result.append(right[j])
j = j + 1
endif
endwhile
while i < left.length // drain leftovers
result.append(left[i])
i = i + 1
endwhile
while j < right.length
result.append(right[j])
j = j + 1
endwhile
return result
endfunction
def merge_sort(data):
if len(data) <= 1:
return data
mid = len(data) // 2
left = merge_sort(data[:mid])
right = merge_sort(data[mid:])
return merge(left, right)
def merge(left, right):
result = []
i = j = 0
while i < len(left) and j < len(right):
if left[i] <= right[j]:
result.append(left[i])
i += 1
else:
result.append(right[j])
j += 1
result.extend(left[i:])
result.extend(right[j:])
return result
Divide phase — recursively halve until every chunk has one element:
flowchart TB
A0["6, 3, 8, 1, 5, 2, 7, 4"] --> A1["6, 3, 8, 1"]
A0 --> A2["5, 2, 7, 4"]
A1 --> B1["6, 3"]
A1 --> B2["8, 1"]
A2 --> B3["5, 2"]
A2 --> B4["7, 4"]
B1 --> C1["6"]
B1 --> C2["3"]
B2 --> C3["8"]
B2 --> C4["1"]
B3 --> C5["5"]
B3 --> C6["2"]
B4 --> C7["7"]
B4 --> C8["4"]
Merge phase — repeatedly merge adjacent sorted chunks, choosing the smaller front element each time:
flowchart TB
C1["6"] & C2["3"] --> D1["3, 6"]
C3["8"] & C4["1"] --> D2["1, 8"]
C5["5"] & C6["2"] --> D3["2, 5"]
C7["7"] & C8["4"] --> D4["4, 7"]
D1 & D2 --> E1["1, 3, 6, 8"]
D3 & D4 --> E2["2, 4, 5, 7"]
E1 & E2 --> F["1, 2, 3, 4, 5, 6, 7, 8"]
The merge is the heart of the algorithm, so it is worth tracing in detail. Merging the sorted lists [1, 3, 6, 8] and [2, 4, 5, 7] — compare the fronts, take the smaller, advance that pointer:
| Step | left front (i) | right front (j) | Take | result so far |
|---|---|---|---|---|
| 1 | 1 | 2 | 1 (left) | [1] |
| 2 | 3 | 2 | 2 (right) | [1, 2] |
| 3 | 3 | 4 | 3 (left) | [1, 2, 3] |
| 4 | 6 | 4 | 4 (right) | [1, 2, 3, 4] |
| 5 | 6 | 5 | 5 (right) | [1, 2, 3, 4, 5] |
| 6 | 6 | 7 | 6 (left) | [1, 2, 3, 4, 5, 6] |
| 7 | 8 | 7 | 7 (right) | [1, 2, 3, 4, 5, 6, 7] |
| 8 | 8 | (empty) | 8 (drain) | [1, 2, 3, 4, 5, 6, 7, 8] |
Each merge of two lists totalling k elements costs at most k comparisons — linear in the number of elements merged. That fact drives the complexity below.
| Case | Complexity | Why |
|---|---|---|
| Best | O(nlogn) | It always fully divides and fully merges |
| Average | O(nlogn) | Same — behaviour is input-independent |
| Worst | O(nlogn) | Same — no bad case |
Space complexity is O(n) — the merge step builds new lists, so merge sort needs auxiliary memory proportional to the input size; it is not in place.
Two facts combine. First, repeatedly halving n elements produces log2n levels of division (8 → 4 → 2 → 1 is 3 levels, and log28=3). Second, at each level the total merging work touches all n elements, costing O(n). Multiplying the per-level cost by the number of levels gives:
levels×work per level=log2n×n=O(nlogn)
Crucially this holds for every input — merge sort's great virtue is its guaranteed O(nlogn) with no degenerate case.
It is worth pausing on what the jump from O(n2) to O(nlogn) buys you, because it is the whole reason this lesson exists. The table contrasts the rough operation counts:
| n | n2 (bubble/insertion) | nlog2n (merge/quick) | Speed-up |
|---|---|---|---|
| 1,000 | 1,000,000 | ~10,000 | ~100× |
| 1,000,000 | 1,000,000,000,000 | ~20,000,000 | ~50,000× |
At a million elements the divide-and-conquer sorts do tens of millions of operations where the quadratic sorts do trillions — the difference between "instant" and "your program appears to have hung". This is why no serious system sorts large data with bubble or insertion sort, and why the O(nlogn) algorithms are the default in every standard library. The speed-up is not a fixed factor either: it widens as the data grows, because the gap between n2 and nlogn itself grows with n. That widening gap is the single most important practical consequence of complexity analysis, and it is why choosing the right algorithm matters more than buying faster hardware.
Quick sort picks a pivot, partitions the list so that everything smaller than the pivot is to its left and everything larger to its right (the pivot is now in its final position), then recursively sorts the two partitions. All the work is in the partition; the "combine" is free because the pivot is already correctly placed.
This is the Lomuto partition scheme, the one most commonly taught:
# OCR-style pseudocode
function quickSort(data, low, high)
if low < high then
pivotIndex = partition(data, low, high)
quickSort(data, low, pivotIndex - 1) // left of pivot
quickSort(data, pivotIndex + 1, high) // right of pivot
endif
endfunction
function partition(data, low, high)
pivot = data[high] // last element as pivot
i = low - 1 // boundary of "<= pivot" region
for j = low to high - 1
if data[j] <= pivot then
i = i + 1
swap data[i] and data[j]
endif
next j
swap data[i + 1] and data[high] // put pivot in place
return i + 1
endfunction
def quick_sort(data, low=0, high=None):
if high is None:
high = len(data) - 1
if low < high:
pivot_index = partition(data, low, high)
quick_sort(data, low, pivot_index - 1)
quick_sort(data, pivot_index + 1, high)
return data
def partition(data, low, high):
pivot = data[high]
i = low - 1
for j in range(low, high):
if data[j] <= pivot:
i += 1
data[i], data[j] = data[j], data[i]
data[i + 1], data[high] = data[high], data[i + 1]
return i + 1
i tracks the right-hand end of the "≤ pivot" region; whenever data[j] <= pivot, increment i and swap data[i] with data[j]:
| j | data[j] | ≤5? | i after | Swap | Array |
|---|---|---|---|---|---|
| start | — | — | -1 | — | [6, 3, 8, 1, 5] |
| 0 | 6 | No | -1 | — | [6, 3, 8, 1, 5] |
| 1 | 3 | Yes | 0 | data[0]↔data[1] | [3, 6, 8, 1, 5] |
| 2 | 8 | No | 0 | — | [3, 6, 8, 1, 5] |
| 3 | 1 | Yes | 1 | data[1]↔data[3] | [3, 1, 8, 6, 5] |
Finally place the pivot: swap data[i+1] = data[2] with data[high] = data[4], giving [3, 1, 5, 6, 8]. The pivot 5 is now at index 2 — its final sorted position. Everything to its left ([3, 1]) is ≤5 and everything to its right ([6, 8]) is >5. Quick sort then recurses on [3, 1] (which partitions to [1, 3]) and on [6, 8] (already in order), yielding the fully sorted [1, 3, 5, 6, 8].
The recursion can be drawn as a tree, where each node is a partition and the pivot (bold) is fixed at that step:
flowchart TB
R["6, 3, 8, 1, [5]"] --> L1["3, 1"]
R --> R1["6, 8"]
L1 --> L2["1"]
L1 --> L3["3"]
R1 --> R2["6"]
R1 --> R3["8"]
| Case | Complexity | When |
|---|---|---|
| Best | O(nlogn) | Pivot splits the segment roughly in half each time |
| Average | O(nlogn) | Random data with a reasonable pivot |
| Worst | O(n2) | Pivot is always the smallest or largest element |
Space complexity is O(logn) on average — quick sort partitions in place (only swaps within the array), so the only extra memory is the recursion call stack, which is logn frames deep when the splits are balanced. In the worst case the stack grows to n frames, giving O(n) space.
This is the most examined subtlety in the whole topic. Quick sort is fast because each partition splits the segment roughly in half, giving logn levels — exactly like merge sort. But if the pivot is always the smallest or largest element, the partition is maximally unbalanced: one side has n−1 elements and the other has none. The recursion then has n levels instead of logn, and with O(n) work per level the total is O(n2).
Subscribe to continue reading
Get full access to this lesson and all 10 lessons in this course.