You are viewing a free preview of this lesson.
Subscribe to unlock all 10 lessons in this course and every other course on LearningBro.
These three algorithms are the simple comparison sorts. All three are O(n2) in the worst case, making them unsuitable for very large datasets, yet they remain essential to understand: they illustrate the fundamental ideas of comparison-based sorting, are straightforward to trace by hand, and appear in A-Level exams every year. Bubble sort is the comparison sort named explicitly in the AQA specification, so you must be able to trace it precisely; insertion and selection sort are studied alongside it for comparison.
Sorting is one of the most studied problems in all of computing, partly because it is needed everywhere — rendering a leaderboard, preparing data for binary search, grouping records, removing duplicates — and partly because it is a perfect vehicle for the central ideas of algorithm design: comparing operation counts, distinguishing best from worst cases, reasoning about stability, and weighing time against space. The three sorts here all belong to the comparison family, meaning they decide the final order purely by comparing pairs of elements; none of them looks inside a value to exploit its structure. That shared mechanism is exactly why they share the same O(n2) ceiling, and understanding why a comparison sort cannot do better than O(n2) with this simple structure motivates the cleverer O(nlogn) sorts you meet in the next lesson.
This lesson addresses AQA A-Level Computer Science (7517), Section 4.3 Fundamentals of algorithms, specifically 4.3.5 Sorting algorithms. The specification names bubble sort and merge sort as the two sorts you must know in detail; insertion and selection sort are included here because they share the O(n2) comparison-sort family and are commonly used as contrast questions. You must be able to trace each algorithm and state its time complexity, which draws on the orders defined in 4.4.4.
Bubble sort makes repeated passes through the list. On each pass it compares adjacent pairs and swaps them when they are out of order. After each pass the largest remaining element "bubbles" to its correct place at the end, so the unsorted region shrinks by one each time. An optimised version stops early using a swapped flag: if a whole pass makes no swaps, the list is already sorted.
The name captures the behaviour well: on each pass a large value, once encountered, keeps being swapped rightward until it meets a still-larger value or reaches the end — like a bubble rising to the surface. A subtle consequence is that large values move toward their final positions quickly (a big value can travel the whole length of the array in a single pass), whereas small values near the end of the list move left only one position per pass. These slow-moving small values are nicknamed "turtles" and are the reason bubble sort is often the slowest of the three in practice despite sharing their order.
PROCEDURE bubbleSort(arr, n)
FOR i = 0 TO n - 2
swapped = FALSE
FOR j = 0 TO n - 2 - i
IF arr[j] > arr[j + 1] THEN
SWAP arr[j], arr[j + 1]
swapped = TRUE
END IF
END FOR
IF NOT swapped THEN
RETURN
END IF
END FOR
END PROCEDURE
Sorting [5, 3, 8, 1, 2]:
Pass 1 (inner loop j=0→3):
| j | Comparison | Swap? | Array after |
|---|---|---|---|
| 0 | 5 > 3? Yes | Swap | [3, 5, 8, 1, 2] |
| 1 | 5 > 8? No | — | [3, 5, 8, 1, 2] |
| 2 | 8 > 1? Yes | Swap | [3, 5, 1, 8, 2] |
| 3 | 8 > 2? Yes | Swap | [3, 5, 1, 2, 8] |
After Pass 1 the largest element (8) is in its final position.
Pass 2 (j=0→2):
| j | Comparison | Swap? | Array after |
|---|---|---|---|
| 0 | 3 > 5? No | — | [3, 5, 1, 2, 8] |
| 1 | 5 > 1? Yes | Swap | [3, 1, 5, 2, 8] |
| 2 | 5 > 2? Yes | Swap | [3, 1, 2, 5, 8] |
Pass 3 (j=0→1):
| j | Comparison | Swap? | Array after |
|---|---|---|---|
| 0 | 3 > 1? Yes | Swap | [1, 3, 2, 5, 8] |
| 1 | 3 > 2? Yes | Swap | [1, 2, 3, 5, 8] |
Pass 4 (j=0):
| j | Comparison | Swap? | Array after |
|---|---|---|---|
| 0 | 1 > 2? No | — | [1, 2, 3, 5, 8] |
No swaps occurred on Pass 4, so the swapped flag stays FALSE and the algorithm terminates. Final result: [1, 2, 3, 5, 8].
The same algorithm in Python, with the early-exit optimisation:
def bubble_sort(arr):
n = len(arr)
for i in range(n - 1):
swapped = False
for j in range(n - 1 - i): # the "- i" skips the sorted tail
if arr[j] > arr[j + 1]:
arr[j], arr[j + 1] = arr[j + 1], arr[j]
swapped = True
if not swapped: # a clean pass means already sorted
break
return arr
Notice the inner range n - 1 - i: after pass i the last i elements are already in their final places, so there is no point comparing them again. This optimisation does not change the O(n2) order but roughly halves the constant factor, and the swapped flag is what gives bubble sort its O(n) best case on already-sorted input.
| Case | Time | Explanation |
|---|---|---|
| Best | O(n) | List already sorted — a single pass makes no swaps and the flag halts it |
| Worst | O(n2) | Reverse-sorted — every comparison swaps; 2n(n−1) comparisons |
| Average | O(n2) | Random order still yields a quadratic number of comparisons |
| Space | O(1) | Sorts in place — only a temporary variable for the swap |
The O(n2) count comes from the same triangular sum seen elsewhere: Pass 1 does n−1 comparisons, Pass 2 does n−2, and so on, giving (n−1)+(n−2)+⋯+1=2n(n−1).
Insertion sort builds the sorted list one element at a time. It takes each element (the key) and inserts it into its correct place within the already-sorted portion to its left, shifting larger elements one position right to make room. It is exactly how most people sort a hand of playing cards: pick up each new card and slide it into position among those already held.
The crucial design point is that the left-hand portion is always sorted: the algorithm maintains this as an invariant, growing the sorted region by one element on each iteration of the outer loop. Because the insertion point is found by walking leftward and stopping at the first element not greater than the key, the amount of work per element is proportional to how far out of place it is — almost none for an element already near its correct position. This adaptivity is precisely what makes insertion sort the fastest of the three on nearly sorted data and the natural choice for incremental sorting where elements arrive over time.
PROCEDURE insertionSort(arr, n)
FOR i = 1 TO n - 1
key = arr[i]
j = i - 1
WHILE j >= 0 AND arr[j] > key
arr[j + 1] = arr[j]
j = j - 1
END WHILE
arr[j + 1] = key
END FOR
END PROCEDURE
Sorting [5, 3, 8, 1, 2]. The shifts inside the inner WHILE loop are shown so the mechanism is visible:
| i | key | Sorted portion before | Shifts performed | Array after |
|---|---|---|---|---|
| 1 | 3 | [5] | 5 shifts right | [3, 5, 8, 1, 2] |
| 2 | 8 | [3, 5] | none (8 > 5) | [3, 5, 8, 1, 2] |
| 3 | 1 | [3, 5, 8] | 8, 5, 3 each shift right | [1, 3, 5, 8, 2] |
| 4 | 2 | [1, 3, 5, 8] | 8, 5, 3 shift right; stop at 1 | [1, 2, 3, 5, 8] |
Notice on the last step that the WHILE loop stops as soon as it meets the 1 (since 1>2 is false), so 2 is inserted directly after it. This early stopping is what makes insertion sort fast on nearly sorted data.
To see the worst case, trace insertion sort on a reverse-sorted list [4, 3, 2, 1], where every new element must shift past all those before it:
| i | key | Sorted portion before | Shifts | Array after |
|---|---|---|---|---|
| 1 | 3 | [4] | 4 shifts right | [3, 4, 2, 1] |
| 2 | 2 | [3, 4] | 4, 3 shift right | [2, 3, 4, 1] |
| 3 | 1 | [2, 3, 4] | 4, 3, 2 shift right | [1, 2, 3, 4] |
Here pass 1 does 1 shift, pass 2 does 2, and pass 3 does 3, totalling 1+2+3=6=2n(n−1) for n=4. This triangular sum is the fingerprint of O(n2), and it confirms that reverse-sorted input is insertion sort's worst case.
| Case | Time | Explanation |
|---|---|---|
| Best | O(n) | Already sorted — the WHILE condition is false immediately, so no shifts |
| Worst | O(n2) | Reverse-sorted — every element shifts past all those before it |
| Average | O(n2) | |
| Space | O(1) | In-place |
Insertion sort is adaptive and very efficient on nearly sorted data: if only a few elements are out of place, it approaches O(n). It is also stable — equal elements keep their original relative order, because an element is only shifted past those strictly greater than it (the condition is arr[j] > key, not >=).
Selection sort divides the list into a sorted prefix (left) and an unsorted suffix (right). On each pass it scans the entire unsorted suffix to find the minimum, then swaps that minimum into the first unsorted position, extending the sorted prefix by one. After k passes the first k positions hold the k smallest elements, in order, and they are never touched again.
The defining feature is that the scan to find each minimum always covers the whole remaining suffix, regardless of how the data is arranged. This is why selection sort cannot exploit a nearly sorted input and has no O(n) best case: even if the list is already sorted, it still scans every remaining element on every pass just to confirm the minimum is where it expects. What it economises on is movement — exactly one swap per pass — which is the trade-off that defines its niche.
PROCEDURE selectionSort(arr, n)
FOR i = 0 TO n - 2
minIndex = i
FOR j = i + 1 TO n - 1
IF arr[j] < arr[minIndex] THEN
minIndex = j
END IF
END FOR
SWAP arr[i], arr[minIndex]
END FOR
END PROCEDURE
Sorting [5, 3, 8, 1, 2]:
| Pass (i) | Unsorted region | Minimum found | Swap | Array after |
|---|---|---|---|---|
| 0 | [5, 3, 8, 1, 2] | 1 at index 3 | arr[0] ↔ arr[3] | [1, 3, 8, 5, 2] |
| 1 | [3, 8, 5, 2] | 2 at index 4 | arr[1] ↔ arr[4] | [1, 2, 8, 5, 3] |
| 2 | [8, 5, 3] | 3 at index 4 | arr[2] ↔ arr[4] | [1, 2, 3, 5, 8] |
| 3 | [5, 8] | 5 at index 3 | no move needed | [1, 2, 3, 5, 8] |
| Case | Time | Explanation |
|---|---|---|
| Best | O(n2) | The inner scan always covers the whole unsorted region |
| Worst | O(n2) | Identical to best — input order is irrelevant |
| Average | O(n2) | |
| Space | O(1) | In-place |
Selection sort always performs 2n(n−1) comparisons regardless of input order — it cannot exploit a partially sorted list. Its one redeeming feature is that it makes at most n−1 swaps (one per pass), which matters when writing data is far more expensive than comparing it (for example writing to flash memory with limited write cycles).
Stability is easiest to grasp through a concrete example. Suppose we sort a list of (grade, name) pairs by grade only, starting from this input order:
| Position | Grade | Name |
|---|---|---|
| 0 | B | Alice |
| 1 | A | Bob |
| 2 | B | Carol |
| 3 | A | Dan |
Subscribe to continue reading
Get full access to this lesson and all 10 lessons in this course.