You are viewing a free preview of this lesson.
Subscribe to unlock all 10 lessons in this course and every other course on LearningBro.
Sorting — arranging data into a defined order — is, alongside searching, one of the two great staple operations of computing, and the OCR H446 specification expects you to know four sorting algorithms in detail. This lesson covers the two simplest: bubble sort and insertion sort. Both are O(n2) "quadratic" sorts, which makes them poor choices for large data, yet they remain essential to learn: they are easy to trace by hand, they illustrate ideas (in-place sorting, stability, adaptivity, best/worst/average analysis) that recur throughout the course, and the exam routinely asks you to show the state of an array after each pass. We trace each algorithm pass by pass, derive its complexity in every case, and finish with the judgement question the exam loves: which algorithm, and why. Treat the tracing as a skill to be practised rather than a fact to be memorised — the marks come from showing the array's evolving state correctly, step by step, exactly as you would on a real exam paper.
This lesson supports the sorting algorithms content of OCR H446 section 2.3 (algorithm design and analysis), with links to 1.4 data structures and to the complexity material of 2.3. Paraphrased, the assessable points are:
We paraphrase throughout; verify exact wording against the current OCR H446 specification.
Putting data into ascending or descending order is rarely the end goal — it is the enabler for many faster operations:
Because sorting is so foundational, computer scientists have studied it exhaustively — there are dozens of named sorting algorithms, each making a different trade-off between time, space, stability and simplicity. The two algorithms in this lesson are the entry point to that study: they are the simplest to understand and to trace, which makes them ideal for learning the vocabulary of sorting (passes, comparisons, swaps, in-place, stable, adaptive, best/worst/average case) that you will reuse when you meet the faster algorithms next. Master the vocabulary here and the more advanced sorts become much easier to reason about.
Bubble sort repeatedly steps through the list comparing adjacent elements and swapping them when they are out of order. On each pass the largest unsorted element "bubbles" to its correct place at the end — hence the name.
The swapped flag is what makes this version adaptive — it lets the algorithm finish early on already-sorted (or nearly sorted) data:
# OCR-style pseudocode
procedure bubbleSort(data, size)
for i = 0 to size - 2
swapped = false
for j = 0 to size - 2 - i // shrinking inner range
if data[j] > data[j + 1] then
temp = data[j]
data[j] = data[j + 1]
data[j + 1] = temp
swapped = true
endif
next j
if swapped == false then
return // already sorted — exit early
endif
next i
endprocedure
def bubble_sort(data):
n = len(data)
for i in range(n - 1):
swapped = False
for j in range(n - 1 - i):
if data[j] > data[j + 1]:
data[j], data[j + 1] = data[j + 1], data[j]
swapped = True
if not swapped:
break # already sorted
return data
Pass 1 (inner loop compares indices 0–3; the bold value reaches its final home):
| Compare | Values | Swap? | Array after |
|---|---|---|---|
| data[0] vs data[1] | 5 > 3 | Yes | [3, 5, 8, 1, 2] |
| data[1] vs data[2] | 5 < 8 | No | [3, 5, 8, 1, 2] |
| data[2] vs data[3] | 8 > 1 | Yes | [3, 5, 1, 8, 2] |
| data[3] vs data[4] | 8 > 2 | Yes | [3, 5, 1, 2, 8] |
Pass 2 (inner loop now only needs indices 0–2):
| Compare | Values | Swap? | Array after |
|---|---|---|---|
| data[0] vs data[1] | 3 < 5 | No | [3, 5, 1, 2, 8] |
| data[1] vs data[2] | 5 > 1 | Yes | [3, 1, 5, 2, 8] |
| data[2] vs data[3] | 5 > 2 | Yes | [3, 1, 2, 5, 8] |
Pass 3 (indices 0–1):
| Compare | Values | Swap? | Array after |
|---|---|---|---|
| data[0] vs data[1] | 3 > 1 | Yes | [1, 3, 2, 5, 8] |
| data[1] vs data[2] | 3 > 2 | Yes | [1, 2, 3, 5, 8] |
Pass 4 (index 0 only): data[0]=1 < data[1]=2 — no swap. Because no swap occurred in this pass, swapped stays false and the algorithm terminates early. The fully sorted array is [1, 2, 3, 5, 8].
Exam Tip: When asked to show bubble sort "after each pass", give the whole array at the end of every pass and mark which element is now fixed. Examiners award marks per correct pass, so a single mis-traced swap costs only one mark if the rest follows correctly from your (wrong) state.
To see why the swap flag matters, trace bubble sort on the nearly sorted array [1, 2, 4, 3, 5], where only one pair is out of place:
Pass 1:
| Compare | Values | Swap? | Array after |
|---|---|---|---|
| data[0] vs data[1] | 1 < 2 | No | [1, 2, 4, 3, 5] |
| data[1] vs data[2] | 2 < 4 | No | [1, 2, 4, 3, 5] |
| data[2] vs data[3] | 4 > 3 | Yes | [1, 2, 3, 4, 5] |
| data[3] vs data[4] | 4 < 5 | No | [1, 2, 3, 4, 5] |
Pass 2: the inner loop makes three comparisons (1<2, 2<3, 3<4) and no swaps, so swapped remains false and the algorithm exits. The list was sorted after a single corrective swap. Without the flag, bubble sort would blindly grind through all four passes regardless — wasting roughly O(n2) work on data that was almost ready. This is exactly why the best case is O(n) only when the flag is present, and it is the most common thing candidates forget to mention.
| Case | When | Comparisons | Complexity |
|---|---|---|---|
| Best | Already sorted (early exit) | n−1 | O(n) |
| Average | Random order | about 4n(n−1) | O(n2) |
| Worst | Reverse sorted | 2n(n−1) | O(n2) |
The worst case arises because the two nested loops each run up to n times: 2n(n−1) comparisons, which is O(n2). Space complexity is O(1) — bubble sort rearranges the array in place, using only a single temporary variable for swaps.
Insertion sort grows a sorted region at the front of the array one element at a time. It takes the next unsorted element (the key) and inserts it into its correct place among the already-sorted elements by shifting larger ones to the right.
It is exactly how most people sort a hand of playing cards: you keep the cards you have arranged, pick up the next card, and slide it left until it sits in the right place.
# OCR-style pseudocode
procedure insertionSort(data, size)
for i = 1 to size - 1
key = data[i]
j = i - 1
while j >= 0 AND data[j] > key
data[j + 1] = data[j] // shift larger element right
j = j - 1
endwhile
data[j + 1] = key // drop key into the gap
next i
endprocedure
def insertion_sort(data):
for i in range(1, len(data)):
key = data[i]
j = i - 1
while j >= 0 and data[j] > key:
data[j + 1] = data[j]
j -= 1
data[j + 1] = key
return data
The vertical bar shows the boundary between the sorted region (left) and the unsorted region (right):
| i | key | Sorted region before | Shifts performed | Array after insertion |
|---|---|---|---|---|
| 1 | 3 | [5] | 5 > 3 → shift 5 | [3, 5 | 8, 1, 2] |
| 2 | 8 | [3, 5] | 8 > 5 → none | [3, 5, 8 | 1, 2] |
| 3 | 1 | [3, 5, 8] | 8,5,3 all > 1 → shift all three | [1, 3, 5, 8 | 2] |
| 4 | 2 | [1, 3, 5, 8] | 8,5,3 > 2; stop at 1 | [1, 2, 3, 5, 8] |
The fully sorted array is [1, 2, 3, 5, 8]. Notice insertion sort performs shifts, not swaps — and the number of shifts for each key equals how far it must travel, which is why nearly sorted data is so cheap to sort this way.
| Case | When | Comparisons | Complexity |
|---|---|---|---|
| Best | Already sorted | n−1 | O(n) |
| Average | Random order | about 4n(n−1) | O(n2) |
| Worst | Reverse sorted | 2n(n−1) | O(n2) |
In the best case each key needs just one comparison (it is already ≥ the element to its left) and the inner while loop exits immediately, giving O(n). In the worst case (reverse order) each key must travel all the way to the front, giving 2n(n−1) operations, i.e. O(n2). Space complexity is O(1) — like bubble sort, insertion sort works in place.
Both algorithms are O(n2) for the same structural reason, and the exam often asks you to explain it: each has a loop nested inside another loop, with both loops scaling with the input size n. The outer loop runs about n times, and for each of those iterations the inner loop runs up to about n times, so the total work is proportional to n×n=n2.
A precise count for the worst case is the sum of an arithmetic series. On the first pass of bubble sort the inner loop does n−1 comparisons, on the next n−2, and so on down to 1:
(n−1)+(n−2)+⋯+2+1=2n(n−1)
Expanding gives 2n2−n. As n grows, the n2 term dominates the −n term and the constant 21 is irrelevant to growth rate, so the complexity is reported as O(n2). This is the canonical worked example you will revisit in the Big-O lesson: a triangular sum of work collapses to a quadratic class.
The practical consequence is severe. Sorting 1,000 items by bubble or insertion sort takes on the order of 21000×999≈500,000 comparisons; sorting 1,000,000 items takes on the order of 5×1011 — half a trillion. A divide-and-conquer sort of complexity O(nlogn) would do the latter in about 2×107 operations, roughly 25,000 times fewer. This gulf — visible only at scale — is precisely why the next lesson's merge sort and quick sort exist, and why these two algorithms are confined to small or nearly sorted data in real systems.
It is worth stressing what the best case tells us. Insertion sort's O(n) best case is not a curiosity: real data is often "nearly sorted" (think of appending a few new records to an already-ordered file), and on such data insertion sort genuinely behaves linearly. The lesson is that worst-case Big-O is not the whole story — the distribution of likely inputs matters, which is why an algorithm's best and average cases are examined alongside its worst case.
| Feature | Bubble sort | Insertion sort |
|---|---|---|
| Best case | O(n) (with swap flag) | O(n) |
| Average case | O(n2) | O(n2) |
| Worst case | O(n2) | O(n2) |
| Space | O(1) in place | O(1) in place |
| Stable? | Yes | Yes |
| Adaptive? | Yes (only with the flag) | Yes (naturally) |
| Main operation | Swaps (3 writes each) | Shifts (1 write each) |
| Typical real-world use | Teaching; tiny lists | Small or nearly sorted lists; as a base case inside bigger sorts |
Subscribe to continue reading
Get full access to this lesson and all 10 lessons in this course.