Bubble Sort, Insertion Sort, and Selection Sort

These three algorithms are the simple comparison sorts. All three are $O(n^2)$ in the worst case, making them unsuitable for very large datasets, yet they remain essential to understand: they illustrate the fundamental ideas of comparison-based sorting, are straightforward to trace by hand, and appear in A-Level exams every year. Bubble sort is the comparison sort named explicitly in the AQA specification, so you must be able to trace it precisely; insertion and selection sort are studied alongside it for comparison.

Sorting is one of the most studied problems in all of computing, partly because it is needed everywhere — rendering a leaderboard, preparing data for binary search, grouping records, removing duplicates — and partly because it is a perfect vehicle for the central ideas of algorithm design: comparing operation counts, distinguishing best from worst cases, reasoning about stability, and weighing time against space. The three sorts here all belong to the comparison family, meaning they decide the final order purely by comparing pairs of elements; none of them looks inside a value to exploit its structure. That shared mechanism is exactly why they share the same $O(n^2)$ ceiling, and understanding why a comparison sort cannot do better than $O(n^2)$ with this simple structure motivates the cleverer $O(n \log n)$ sorts you meet in the next lesson.

Spec Mapping

This lesson addresses AQA A-Level Computer Science (7517), Section 4.3 Fundamentals of algorithms, specifically 4.3.5 Sorting algorithms. The specification names bubble sort and merge sort as the two sorts you must know in detail; insertion and selection sort are included here because they share the $O(n^2)$ comparison-sort family and are commonly used as contrast questions. You must be able to trace each algorithm and state its time complexity, which draws on the orders defined in 4.4.4.

Bubble Sort

How It Works

Bubble sort makes repeated passes through the list. On each pass it compares adjacent pairs and swaps them when they are out of order. After each pass the largest remaining element "bubbles" to its correct place at the end, so the unsorted region shrinks by one each time. An optimised version stops early using a swapped flag: if a whole pass makes no swaps, the list is already sorted.

The name captures the behaviour well: on each pass a large value, once encountered, keeps being swapped rightward until it meets a still-larger value or reaches the end — like a bubble rising to the surface. A subtle consequence is that large values move toward their final positions quickly (a big value can travel the whole length of the array in a single pass), whereas small values near the end of the list move left only one position per pass. These slow-moving small values are nicknamed "turtles" and are the reason bubble sort is often the slowest of the three in practice despite sharing their order.

Pseudocode

PROCEDURE bubbleSort(arr, n)
    FOR i = 0 TO n - 2
        swapped = FALSE
        FOR j = 0 TO n - 2 - i
            IF arr[j] > arr[j + 1] THEN
                SWAP arr[j], arr[j + 1]
                swapped = TRUE
            END IF
        END FOR
        IF NOT swapped THEN
            RETURN
        END IF
    END FOR
END PROCEDURE

Trace Table

Sorting [5, 3, 8, 1, 2]:

Pass 1 (inner loop $j = 0 \to 3$ ):

j	Comparison	Swap?	Array after
0	5 > 3? Yes	Swap	[3, 5, 8, 1, 2]
1	5 > 8? No	—	[3, 5, 8, 1, 2]
2	8 > 1? Yes	Swap	[3, 5, 1, 8, 2]
3	8 > 2? Yes	Swap	[3, 5, 1, 2, 8]

After Pass 1 the largest element (8) is in its final position.

Pass 2 ( $j = 0 \to 2$ ):

j	Comparison	Swap?	Array after
0	3 > 5? No	—	[3, 5, 1, 2, 8]
1	5 > 1? Yes	Swap	[3, 1, 5, 2, 8]
2	5 > 2? Yes	Swap	[3, 1, 2, 5, 8]

Pass 3 ( $j = 0 \to 1$ ):

j	Comparison	Swap?	Array after
0	3 > 1? Yes	Swap	[1, 3, 2, 5, 8]
1	3 > 2? Yes	Swap	[1, 2, 3, 5, 8]

Pass 4 ( $j = 0$ ):

j	Comparison	Swap?	Array after
0	1 > 2? No	—	[1, 2, 3, 5, 8]

No swaps occurred on Pass 4, so the swapped flag stays FALSE and the algorithm terminates. Final result: [1, 2, 3, 5, 8].

The same algorithm in Python, with the early-exit optimisation:

def bubble_sort(arr):
    n = len(arr)
    for i in range(n - 1):
        swapped = False
        for j in range(n - 1 - i):       # the "- i" skips the sorted tail
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
                swapped = True
        if not swapped:                  # a clean pass means already sorted
            break
    return arr

Notice the inner range n - 1 - i: after pass $i$ the last $i$ elements are already in their final places, so there is no point comparing them again. This optimisation does not change the $O(n^2)$ order but roughly halves the constant factor, and the swapped flag is what gives bubble sort its $O(n)$ best case on already-sorted input.

Complexity

Case	Time	Explanation
Best	$O(n)$	List already sorted — a single pass makes no swaps and the flag halts it
Worst	$O(n^2)$	Reverse-sorted — every comparison swaps; $\tfrac{n(n-1)}{2}$ comparisons
Average	$O(n^2)$	Random order still yields a quadratic number of comparisons
Space	$O(1)$	Sorts in place — only a temporary variable for the swap

The $O(n^2)$ count comes from the same triangular sum seen elsewhere: Pass 1 does $n-1$ comparisons, Pass 2 does $n-2$ , and so on, giving $(n-1)+(n-2)+\dots+1 = \tfrac{n(n-1)}{2}$ .

Insertion Sort

How It Works

Insertion sort builds the sorted list one element at a time. It takes each element (the key) and inserts it into its correct place within the already-sorted portion to its left, shifting larger elements one position right to make room. It is exactly how most people sort a hand of playing cards: pick up each new card and slide it into position among those already held.

The crucial design point is that the left-hand portion is always sorted: the algorithm maintains this as an invariant, growing the sorted region by one element on each iteration of the outer loop. Because the insertion point is found by walking leftward and stopping at the first element not greater than the key, the amount of work per element is proportional to how far out of place it is — almost none for an element already near its correct position. This adaptivity is precisely what makes insertion sort the fastest of the three on nearly sorted data and the natural choice for incremental sorting where elements arrive over time.

Pseudocode

PROCEDURE insertionSort(arr, n)
    FOR i = 1 TO n - 1
        key = arr[i]
        j = i - 1
        WHILE j >= 0 AND arr[j] > key
            arr[j + 1] = arr[j]
            j = j - 1
        END WHILE
        arr[j + 1] = key
    END FOR
END PROCEDURE

Trace Table

Sorting [5, 3, 8, 1, 2]. The shifts inside the inner WHILE loop are shown so the mechanism is visible:

i	key	Sorted portion before	Shifts performed	Array after
1	3	[5]	5 shifts right	[3, 5, 8, 1, 2]
2	8	[3, 5]	none (8 > 5)	[3, 5, 8, 1, 2]
3	1	[3, 5, 8]	8, 5, 3 each shift right	[1, 3, 5, 8, 2]
4	2	[1, 3, 5, 8]	8, 5, 3 shift right; stop at 1	[1, 2, 3, 5, 8]

Notice on the last step that the WHILE loop stops as soon as it meets the 1 (since $1 > 2$ is false), so 2 is inserted directly after it. This early stopping is what makes insertion sort fast on nearly sorted data.

To see the worst case, trace insertion sort on a reverse-sorted list [4, 3, 2, 1], where every new element must shift past all those before it:

i	key	Sorted portion before	Shifts	Array after
1	3	[4]	4 shifts right	[3, 4, 2, 1]
2	2	[3, 4]	4, 3 shift right	[2, 3, 4, 1]
3	1	[2, 3, 4]	4, 3, 2 shift right	[1, 2, 3, 4]

Here pass 1 does 1 shift, pass 2 does 2, and pass 3 does 3, totalling $1 + 2 + 3 = 6 = \tfrac{n(n-1)}{2}$ for $n = 4$ . This triangular sum is the fingerprint of $O(n^2)$ , and it confirms that reverse-sorted input is insertion sort's worst case.

Complexity

Case	Time	Explanation
Best	$O(n)$	Already sorted — the WHILE condition is false immediately, so no shifts
Worst	$O(n^2)$	Reverse-sorted — every element shifts past all those before it
Average	$O(n^2)$
Space	$O(1)$	In-place

Key Advantage

Insertion sort is adaptive and very efficient on nearly sorted data: if only a few elements are out of place, it approaches $O(n)$ . It is also stable — equal elements keep their original relative order, because an element is only shifted past those strictly greater than it (the condition is arr[j] > key, not >=).

Selection Sort

How It Works

Selection sort divides the list into a sorted prefix (left) and an unsorted suffix (right). On each pass it scans the entire unsorted suffix to find the minimum, then swaps that minimum into the first unsorted position, extending the sorted prefix by one. After $k$ passes the first $k$ positions hold the $k$ smallest elements, in order, and they are never touched again.

The defining feature is that the scan to find each minimum always covers the whole remaining suffix, regardless of how the data is arranged. This is why selection sort cannot exploit a nearly sorted input and has no $O(n)$ best case: even if the list is already sorted, it still scans every remaining element on every pass just to confirm the minimum is where it expects. What it economises on is movement — exactly one swap per pass — which is the trade-off that defines its niche.

Pseudocode

PROCEDURE selectionSort(arr, n)
    FOR i = 0 TO n - 2
        minIndex = i
        FOR j = i + 1 TO n - 1
            IF arr[j] < arr[minIndex] THEN
                minIndex = j
            END IF
        END FOR
        SWAP arr[i], arr[minIndex]
    END FOR
END PROCEDURE

Trace Table

Sorting [5, 3, 8, 1, 2]:

Pass (i)	Unsorted region	Minimum found	Swap	Array after
0	[5, 3, 8, 1, 2]	1 at index 3	arr[0] ↔ arr[3]	[1, 3, 8, 5, 2]
1	[3, 8, 5, 2]	2 at index 4	arr[1] ↔ arr[4]	[1, 2, 8, 5, 3]
2	[8, 5, 3]	3 at index 4	arr[2] ↔ arr[4]	[1, 2, 3, 5, 8]
3	[5, 8]	5 at index 3	no move needed	[1, 2, 3, 5, 8]

Complexity

Case	Time	Explanation
Best	$O(n^2)$	The inner scan always covers the whole unsorted region
Worst	$O(n^2)$	Identical to best — input order is irrelevant
Average	$O(n^2)$
Space	$O(1)$	In-place

Selection sort always performs $\tfrac{n(n-1)}{2}$ comparisons regardless of input order — it cannot exploit a partially sorted list. Its one redeeming feature is that it makes at most $n-1$ swaps (one per pass), which matters when writing data is far more expensive than comparing it (for example writing to flash memory with limited write cycles).

Stability, Demonstrated

Stability is easiest to grasp through a concrete example. Suppose we sort a list of (grade, name) pairs by grade only, starting from this input order:

Position	Grade	Name
0	B	Alice
1	A	Bob
2	B	Carol
3	A	Dan

Bubble Sort, Insertion Sort, and Selection Sort

Bubble Sort, Insertion Sort, and Selection Sort

Spec Mapping

Bubble Sort

How It Works

Pseudocode

Trace Table

Complexity

Insertion Sort

How It Works

Pseudocode

Trace Table

Complexity

Key Advantage

Selection Sort

How It Works

Pseudocode

Trace Table

Complexity

Stability, Demonstrated

More in Computer Science