Linear Search and Binary Search

Searching is one of the most fundamental operations in computing. Given a collection of data, we routinely need to determine whether a particular value exists and, if so, where it lies. At A-Level you must understand two searching algorithms — linear search and binary search — be able to compare their time complexity, trace their execution step by step, and justify when each is appropriate. The central lesson is that a single precondition, sortedness, transforms a linear-time search into a logarithmic-time one.

Spec Mapping

This lesson addresses AQA A-Level Computer Science (7517), Section 4.3 Fundamentals of algorithms, specifically 4.3.4 Searching algorithms. The specification requires you to understand and be able to trace linear search and binary search, and to know their time complexities. The work depends directly on the complexity vocabulary of 4.4.4 ( $O(n)$ vs $O(\log n)$ ) and connects to binary search trees (4.2.7), which generalise the halving idea into a data structure rather than an algorithm over an array.

Linear Search

How It Works

Linear search (also called sequential search) inspects each element in turn, starting from the first, until the target is found or the end of the list is reached. It makes no assumptions about the data — the elements may be in any order — which is its defining strength. There is no cleverness to it: it is the brute-force approach of checking everything, and its great virtue is that it always works and never requires the data to be prepared in advance. Every other searching technique can be seen as an attempt to avoid linear search's central weakness, namely that it may have to touch every element.

Pseudocode

FUNCTION linearSearch(arr, n, target)
    FOR i = 0 TO n - 1
        IF arr[i] = target THEN
            RETURN i
        END IF
    END FOR
    RETURN -1
END FUNCTION

A Python implementation makes the same logic concrete:

def linear_search(arr, target):
    for i in range(len(arr)):
        if arr[i] == target:
            return i          # found: return the index
    return -1                 # exhausted the list: not present

Trace Table Example

Searching for 42 in the list [17, 42, 8, 35, 91]:

Step	i	arr[i]	arr[i] = 42?	Action
1	0	17	No	Continue
2	1	42	Yes	Return 1

The algorithm finds 42 at index 1 after 2 comparisons. Now consider the worst case — searching for 99 in the same list:

Step	i	arr[i]	arr[i] = 99?	Action
1	0	17	No	Continue
2	1	42	No	Continue
3	2	8	No	Continue
4	3	35	No	Continue
5	4	91	No	End of list — return -1

Every one of the 5 elements is examined, illustrating that an absent target forces $n$ comparisons.

Complexity Analysis

Case	Comparisons	Big-O	Scenario
Best case	1	$O(1)$	Target is the first element
Worst case	$n$	$O(n)$	Target is last or not present
Average case	$\approx n/2$	$O(n)$	Target equally likely at any position

Advantages and Disadvantages

Advantages	Disadvantages
Works on unsorted data	Slow for large datasets
Trivial to implement correctly	Examines every element in the worst case
Works on any sequential structure (arrays, linked lists)	Unsuitable when fast repeated lookup is critical

Why Linear Search Still Matters

It is tempting to dismiss linear search as merely the "slow" option, but it is the only viable choice in several common situations, and understanding when it is appropriate is itself examinable. It is the right tool when the data is unsorted and will be searched only once or twice (sorting first would cost more than the search saves), when the dataset is small (the simplicity and low overhead outweigh any asymptotic advantage), when the structure does not support random access — a singly linked list cannot jump to its middle element, so binary search is impossible on it and linear search is the natural fit — and when you need to find every matching element rather than just one, since you must examine the whole list regardless. Linear search also degrades gracefully: it never requires the data to satisfy a precondition and never gives a wrong answer, whereas binary search silently misbehaves on unsorted data.

Counting Operations Precisely

To see why linear search is $O(n)$ rigorously, count the comparisons. If the target is at position $k$ (0-indexed), the loop performs $k + 1$ comparisons. Averaging over all $n$ equally likely positions, and over the case where the target is absent (which costs $n$ comparisons), the expected number of comparisons is approximately $\tfrac{n+1}{2}$ when the element is present. Dropping the constant $\tfrac{1}{2}$ and the $+\tfrac{1}{2}$ leaves $O(n)$ . The worst case, $n$ comparisons, is reached whenever the target is the last element or is absent altogether.

Binary Search

How It Works

Binary search is a divide-and-conquer algorithm that works only on sorted data. It repeatedly halves the search space by comparing the target with the middle element and discarding the half that cannot contain it. The sorted precondition is not optional — applied to unsorted data the algorithm gives meaningless results.

The intuition is the way you would look up a word in a physical dictionary. You do not start at page one and read every word (that would be linear search); you open near the middle, see whether your word comes before or after the words on that page, and immediately ignore the half of the dictionary that cannot contain it. Repeating this — open the middle of the remaining half, discard again — converges on the target in a handful of steps even for a dictionary of a hundred thousand words. The whole method works only because the dictionary is in alphabetical order; on a shuffled dictionary the "before or after" deduction would be worthless. That dependence on ordering is the single most important property of the algorithm.

Algorithm Steps

Set low = 0 and high = $n - 1$ .
Calculate mid = (low + high) DIV 2.
If arr[mid] = target, return mid.
If target < arr[mid], set high = mid - 1 (discard the right half).
If target > arr[mid], set low = mid + 1 (discard the left half).
Repeat from step 2 until low > high (target not found).

Pseudocode

FUNCTION binarySearch(arr, n, target)
    low = 0
    high = n - 1
    WHILE low <= high
        mid = (low + high) DIV 2
        IF arr[mid] = target THEN
            RETURN mid
        ELSE IF target < arr[mid] THEN
            high = mid - 1
        ELSE
            low = mid + 1
        END IF
    END WHILE
    RETURN -1
END FUNCTION

The same algorithm in Python, with comments highlighting the three outcomes of each comparison:

def binary_search(arr, target):
    low, high = 0, len(arr) - 1
    while low <= high:
        mid = (low + high) // 2     # integer division
        if arr[mid] == target:
            return mid               # found
        elif target < arr[mid]:
            high = mid - 1           # discard the right half
        else:
            low = mid + 1            # discard the left half
    return -1                        # window collapsed: not present

Three details are worth dwelling on. First, mid = (low + high) // 2 uses integer division, so mid is always a valid index. Second, the boundary updates use mid - 1 and mid + 1, not mid: since arr[mid] has already been compared and rejected, re-including it would risk an infinite loop where the window never shrinks. Third, the loop condition is low <= high (with $\le$ , not $<$ ); the case low == high represents a window of exactly one element, which must still be checked before concluding the target is absent.

Trace Table Example

Searching for 56 in the sorted list [8, 17, 23, 35, 42, 56, 71, 89] (indices 0–7):

Step	low	high	mid	arr[mid]	Comparison	Action
1	0	7	3	35	56 > 35	low = 4
2	4	7	5	56	56 = 56	Found! Return 5

Now trace a value that is absent — searching for 50:

Step	low	high	mid	arr[mid]	Comparison	Action
1	0	7	3	35	50 > 35	low = 4
2	4	7	5	56	50 < 56	high = 4
3	4	4	4	42	50 > 42	low = 5
4	—	—	—	—	low (5) > high (4)	Return -1

After 3 comparisons the window has collapsed (low overtakes high) and the algorithm correctly reports the value is absent. This worst-case trace on 8 elements takes 3 comparisons, and $\log_2 8 = 3$ — a perfect illustration of the logarithmic bound.

Complexity Analysis

Case	Comparisons	Big-O
Best case	1 (target is the first middle element)	$O(1)$
Worst case	$\lfloor \log_2 n \rfloor + 1$	$O(\log n)$
Average case	$\approx \log_2 n$	$O(\log n)$

Each comparison eliminates half of the remaining elements. For a list of 1,000,000 items, binary search needs at most about 20 comparisons (since $\log_2 1{,}000{,}000 \approx 20$ ), compared with 1,000,000 for linear search in the worst case.

Why O(log n)?

After each comparison the search space is halved:

$n \to \frac{n}{2} \to \frac{n}{4} \to \frac{n}{8} \to \dots \to 1.$

The number of halvings $k$ needed to reduce $n$ to 1 satisfies $\frac{n}{2^k} = 1$ , i.e. $2^k = n$ , i.e. $k = \log_2 n$ . Because the count of comparisons equals the count of halvings, binary search is $O(\log n)$ .

To make the staggering efficiency concrete: doubling the size of the dataset adds only one extra comparison to the worst case. A list of 1,000 needs about 10 comparisons; a list of 2,000 needs about 11; a list of 1,000,000 needs about 20; and a list of one billion needs only about 30. Linear search on those same lists would need 1,000, 2,000, 1,000,000 and one billion comparisons respectively. This is the single most dramatic illustration on the specification of why algorithmic order, not hardware, governs scalability.

Why the Sorted Precondition Is Non-Negotiable

Binary search's correctness depends entirely on one invariant: if target < arr[mid], then the target — if present at all — must lie in the left half. That deduction is only valid when the array is sorted. On unsorted data the deduction is false, so binary search can discard the very half containing the target and wrongly report it absent. For example, searching the unsorted list [9, 2, 7, 1, 5] for 7: with low = 0, high = 4, mid = 2, arr[mid] = 7 — here it happens to find it, but searching for 9 gives mid = 2, arr[mid] = 7, and since $9 > 7$ it discards indices 0–2 and never examines the 9 sitting at index 0. The algorithm returns "not found" for an element that is plainly present. The lesson is stark: applying binary search to unsorted data does not merely make it slow, it makes it wrong. Whenever you propose binary search in an exam, you must state the sorted precondition explicitly.

Comparison Table

Feature	Linear Search	Binary Search
Requires sorted data?	No	Yes
Time complexity (worst)	$O(n)$	$O(\log n)$
Time complexity (best)	$O(1)$	$O(1)$
Space complexity	$O(1)$	$O(1)$ iterative, $O(\log n)$ recursive
Suitable for	Small or unsorted datasets	Large sorted datasets, repeated lookups
Approach	Sequential	Divide and conquer

Note in the table that the two algorithms tie on best case ( $O(1)$ ) and on iterative space ( $O(1)$ ); the decisive difference is the worst case, $O(\log n)$ versus $O(n)$ , and the precondition that binary search demands sorted data. When you are asked to compare them, lead with the worst-case orders, then state the precondition as the price binary search pays for its speed.

Exam Tip: A very common question is: "Explain why binary search is more efficient than linear search for large sorted datasets." You must reference the time complexities — $O(\log n)$ vs $O(n)$ — and explain the mechanism: binary search eliminates half the remaining data at each comparison, whereas linear search eliminates only one element. A bare statement of the orders with no mechanism rarely earns full marks, and forgetting to mention the sorted precondition is a frequent way that otherwise correct answers lose a mark.

Linear Search and Binary Search

Linear Search and Binary Search

Spec Mapping

Linear Search

How It Works

Pseudocode

Trace Table Example

Complexity Analysis

Advantages and Disadvantages

Why Linear Search Still Matters

Counting Operations Precisely

Binary Search

How It Works

Algorithm Steps

Pseudocode

Trace Table Example

Complexity Analysis

Why O(log n)?

Why the Sorted Precondition Is Non-Negotiable

Comparison Table

A Larger Trace

More in Computer Science