You are viewing a free preview of this lesson.
Subscribe to unlock all 10 lessons in this course and every other course on LearningBro.
Searching is one of the most fundamental operations in computing. Given a collection of data, we routinely need to determine whether a particular value exists and, if so, where it lies. At A-Level you must understand two searching algorithms — linear search and binary search — be able to compare their time complexity, trace their execution step by step, and justify when each is appropriate. The central lesson is that a single precondition, sortedness, transforms a linear-time search into a logarithmic-time one.
This lesson addresses AQA A-Level Computer Science (7517), Section 4.3 Fundamentals of algorithms, specifically 4.3.4 Searching algorithms. The specification requires you to understand and be able to trace linear search and binary search, and to know their time complexities. The work depends directly on the complexity vocabulary of 4.4.4 (O(n) vs O(logn)) and connects to binary search trees (4.2.7), which generalise the halving idea into a data structure rather than an algorithm over an array.
Linear search (also called sequential search) inspects each element in turn, starting from the first, until the target is found or the end of the list is reached. It makes no assumptions about the data — the elements may be in any order — which is its defining strength. There is no cleverness to it: it is the brute-force approach of checking everything, and its great virtue is that it always works and never requires the data to be prepared in advance. Every other searching technique can be seen as an attempt to avoid linear search's central weakness, namely that it may have to touch every element.
FUNCTION linearSearch(arr, n, target)
FOR i = 0 TO n - 1
IF arr[i] = target THEN
RETURN i
END IF
END FOR
RETURN -1
END FUNCTION
A Python implementation makes the same logic concrete:
def linear_search(arr, target):
for i in range(len(arr)):
if arr[i] == target:
return i # found: return the index
return -1 # exhausted the list: not present
Searching for 42 in the list [17, 42, 8, 35, 91]:
| Step | i | arr[i] | arr[i] = 42? | Action |
|---|---|---|---|---|
| 1 | 0 | 17 | No | Continue |
| 2 | 1 | 42 | Yes | Return 1 |
The algorithm finds 42 at index 1 after 2 comparisons. Now consider the worst case — searching for 99 in the same list:
| Step | i | arr[i] | arr[i] = 99? | Action |
|---|---|---|---|---|
| 1 | 0 | 17 | No | Continue |
| 2 | 1 | 42 | No | Continue |
| 3 | 2 | 8 | No | Continue |
| 4 | 3 | 35 | No | Continue |
| 5 | 4 | 91 | No | End of list — return -1 |
Every one of the 5 elements is examined, illustrating that an absent target forces n comparisons.
| Case | Comparisons | Big-O | Scenario |
|---|---|---|---|
| Best case | 1 | O(1) | Target is the first element |
| Worst case | n | O(n) | Target is last or not present |
| Average case | ≈n/2 | O(n) | Target equally likely at any position |
| Advantages | Disadvantages |
|---|---|
| Works on unsorted data | Slow for large datasets |
| Trivial to implement correctly | Examines every element in the worst case |
| Works on any sequential structure (arrays, linked lists) | Unsuitable when fast repeated lookup is critical |
It is tempting to dismiss linear search as merely the "slow" option, but it is the only viable choice in several common situations, and understanding when it is appropriate is itself examinable. It is the right tool when the data is unsorted and will be searched only once or twice (sorting first would cost more than the search saves), when the dataset is small (the simplicity and low overhead outweigh any asymptotic advantage), when the structure does not support random access — a singly linked list cannot jump to its middle element, so binary search is impossible on it and linear search is the natural fit — and when you need to find every matching element rather than just one, since you must examine the whole list regardless. Linear search also degrades gracefully: it never requires the data to satisfy a precondition and never gives a wrong answer, whereas binary search silently misbehaves on unsorted data.
To see why linear search is O(n) rigorously, count the comparisons. If the target is at position k (0-indexed), the loop performs k+1 comparisons. Averaging over all n equally likely positions, and over the case where the target is absent (which costs n comparisons), the expected number of comparisons is approximately 2n+1 when the element is present. Dropping the constant 21 and the +21 leaves O(n). The worst case, n comparisons, is reached whenever the target is the last element or is absent altogether.
Binary search is a divide-and-conquer algorithm that works only on sorted data. It repeatedly halves the search space by comparing the target with the middle element and discarding the half that cannot contain it. The sorted precondition is not optional — applied to unsorted data the algorithm gives meaningless results.
The intuition is the way you would look up a word in a physical dictionary. You do not start at page one and read every word (that would be linear search); you open near the middle, see whether your word comes before or after the words on that page, and immediately ignore the half of the dictionary that cannot contain it. Repeating this — open the middle of the remaining half, discard again — converges on the target in a handful of steps even for a dictionary of a hundred thousand words. The whole method works only because the dictionary is in alphabetical order; on a shuffled dictionary the "before or after" deduction would be worthless. That dependence on ordering is the single most important property of the algorithm.
FUNCTION binarySearch(arr, n, target)
low = 0
high = n - 1
WHILE low <= high
mid = (low + high) DIV 2
IF arr[mid] = target THEN
RETURN mid
ELSE IF target < arr[mid] THEN
high = mid - 1
ELSE
low = mid + 1
END IF
END WHILE
RETURN -1
END FUNCTION
The same algorithm in Python, with comments highlighting the three outcomes of each comparison:
def binary_search(arr, target):
low, high = 0, len(arr) - 1
while low <= high:
mid = (low + high) // 2 # integer division
if arr[mid] == target:
return mid # found
elif target < arr[mid]:
high = mid - 1 # discard the right half
else:
low = mid + 1 # discard the left half
return -1 # window collapsed: not present
Three details are worth dwelling on. First, mid = (low + high) // 2 uses integer division, so mid is always a valid index. Second, the boundary updates use mid - 1 and mid + 1, not mid: since arr[mid] has already been compared and rejected, re-including it would risk an infinite loop where the window never shrinks. Third, the loop condition is low <= high (with ≤, not <); the case low == high represents a window of exactly one element, which must still be checked before concluding the target is absent.
Searching for 56 in the sorted list [8, 17, 23, 35, 42, 56, 71, 89] (indices 0–7):
| Step | low | high | mid | arr[mid] | Comparison | Action |
|---|---|---|---|---|---|---|
| 1 | 0 | 7 | 3 | 35 | 56 > 35 | low = 4 |
| 2 | 4 | 7 | 5 | 56 | 56 = 56 | Found! Return 5 |
Now trace a value that is absent — searching for 50:
| Step | low | high | mid | arr[mid] | Comparison | Action |
|---|---|---|---|---|---|---|
| 1 | 0 | 7 | 3 | 35 | 50 > 35 | low = 4 |
| 2 | 4 | 7 | 5 | 56 | 50 < 56 | high = 4 |
| 3 | 4 | 4 | 4 | 42 | 50 > 42 | low = 5 |
| 4 | — | — | — | — | low (5) > high (4) | Return -1 |
After 3 comparisons the window has collapsed (low overtakes high) and the algorithm correctly reports the value is absent. This worst-case trace on 8 elements takes 3 comparisons, and log28=3 — a perfect illustration of the logarithmic bound.
| Case | Comparisons | Big-O |
|---|---|---|
| Best case | 1 (target is the first middle element) | O(1) |
| Worst case | ⌊log2n⌋+1 | O(logn) |
| Average case | ≈log2n | O(logn) |
Each comparison eliminates half of the remaining elements. For a list of 1,000,000 items, binary search needs at most about 20 comparisons (since log21,000,000≈20), compared with 1,000,000 for linear search in the worst case.
After each comparison the search space is halved:
n→2n→4n→8n→⋯→1.
The number of halvings k needed to reduce n to 1 satisfies 2kn=1, i.e. 2k=n, i.e. k=log2n. Because the count of comparisons equals the count of halvings, binary search is O(logn).
To make the staggering efficiency concrete: doubling the size of the dataset adds only one extra comparison to the worst case. A list of 1,000 needs about 10 comparisons; a list of 2,000 needs about 11; a list of 1,000,000 needs about 20; and a list of one billion needs only about 30. Linear search on those same lists would need 1,000, 2,000, 1,000,000 and one billion comparisons respectively. This is the single most dramatic illustration on the specification of why algorithmic order, not hardware, governs scalability.
Binary search's correctness depends entirely on one invariant: if target < arr[mid], then the target — if present at all — must lie in the left half. That deduction is only valid when the array is sorted. On unsorted data the deduction is false, so binary search can discard the very half containing the target and wrongly report it absent. For example, searching the unsorted list [9, 2, 7, 1, 5] for 7: with low = 0, high = 4, mid = 2, arr[mid] = 7 — here it happens to find it, but searching for 9 gives mid = 2, arr[mid] = 7, and since 9>7 it discards indices 0–2 and never examines the 9 sitting at index 0. The algorithm returns "not found" for an element that is plainly present. The lesson is stark: applying binary search to unsorted data does not merely make it slow, it makes it wrong. Whenever you propose binary search in an exam, you must state the sorted precondition explicitly.
| Feature | Linear Search | Binary Search |
|---|---|---|
| Requires sorted data? | No | Yes |
| Time complexity (worst) | O(n) | O(logn) |
| Time complexity (best) | O(1) | O(1) |
| Space complexity | O(1) | O(1) iterative, O(logn) recursive |
| Suitable for | Small or unsorted datasets | Large sorted datasets, repeated lookups |
| Approach | Sequential | Divide and conquer |
Note in the table that the two algorithms tie on best case (O(1)) and on iterative space (O(1)); the decisive difference is the worst case, O(logn) versus O(n), and the precondition that binary search demands sorted data. When you are asked to compare them, lead with the worst-case orders, then state the precondition as the price binary search pays for its speed.
Exam Tip: A very common question is: "Explain why binary search is more efficient than linear search for large sorted datasets." You must reference the time complexities — O(logn) vs O(n) — and explain the mechanism: binary search eliminates half the remaining data at each comparison, whereas linear search eliminates only one element. A bare statement of the orders with no mechanism rarely earns full marks, and forgetting to mention the sorted precondition is a frequent way that otherwise correct answers lose a mark.
Subscribe to continue reading
Get full access to this lesson and all 10 lessons in this course.