You are viewing a free preview of this lesson.
Subscribe to unlock all 10 lessons in this course and every other course on LearningBro.
Understanding how algorithms perform as input sizes grow is one of the most important skills in A-Level Computer Science. Algorithm complexity analysis lets us predict how an algorithm will behave for large datasets and compare competing solutions objectively, without ever running them on a specific machine. This matters precisely because hardware speed is irrelevant to the question examiners care about: how does the work the algorithm performs grow as the input gets bigger?
This lesson addresses AQA A-Level Computer Science (7517), Section 4.4 Theory of computation, specifically 4.4.4 Classification of algorithms. The specification requires you to compare the complexity of algorithms, derive the time complexity of a problem or algorithm, and use Big-O notation to express constant, linear, polynomial, exponential and logarithmic orders of time complexity. It links forward to ideas of tractability and the limits of computation (also 4.4.4). The distinction between time complexity and space complexity rests on the model of an algorithm as a finite sequence of operations introduced in 4.4.1.
Imagine two sorting algorithms. On a list of 10 items, both finish in under a millisecond. On a list of 10 million items, one finishes in 10 seconds while the other takes 3 hours. Buying a faster processor will not close that gap — doubling the clock speed merely halves both times, leaving the ratio between them unchanged. The difference is algorithmic complexity: the mathematical relationship between the size of the input, conventionally written n, and the number of basic operations the algorithm performs.
There are two reasons we measure complexity mathematically rather than just timing the code with a stopwatch. The first is machine independence. A stopwatch measurement on your laptop tells you nothing reliable about how the algorithm will behave on a server, a phone, or a machine built five years from now; it conflates the algorithm with the hardware, the programming language, the compiler and even how busy the machine was at the time. A complexity class such as O(n2), by contrast, is a property of the algorithm itself and holds on every machine. The second reason is predictiveness. Timing tells you what happened for the input you happened to test; complexity analysis tells you what will happen as the input grows without bound, which is exactly the regime where the choice of algorithm matters most. A program that is comfortably fast on the developer's test data of a few hundred records can become unusable once it is deployed against millions of real records — and complexity analysis is what lets you predict that disaster before it happens.
A useful mental model is to identify the basic operation that dominates the work — typically a comparison, a swap, or an arithmetic step inside the innermost loop — and ask how many times it executes as a function of n. We deliberately ignore the exact time each basic operation takes, because that varies by machine; we care only about how the count of operations grows.
In industry and in exams, you must be able to:
| Type | What it measures | Example |
|---|---|---|
| Time complexity | The number of basic operations as a function of input size n | How many comparisons a sorting algorithm makes |
| Space complexity | The amount of extra memory required as a function of n | The temporary arrays merge sort allocates while merging |
At A-Level the focus is primarily on time complexity, but you should recognise that algorithms often trade time for space. Merge sort, for example, buys its guaranteed O(nlogn) time at the cost of O(n) extra memory, whereas an in-place sort uses O(1) extra memory but may be slower. Space complexity counts only the additional storage an algorithm needs beyond the input itself — the input array is not counted, because every algorithm must store its input, so it tells us nothing about the algorithm's own appetite for memory.
The time–space trade-off is a recurring theme. A lookup table that pre-computes answers turns an expensive calculation into a fast O(1) array read, spending memory to save time. Conversely, an algorithm that recomputes intermediate results rather than storing them saves memory at the cost of extra time. There is rarely a single "best" algorithm: the right choice depends on whether the system is constrained by its processor or by its memory. An embedded controller with a few kilobytes of RAM may prefer a slow but in-place algorithm, while a data-centre server with abundant memory will happily trade space for speed. Recognising which resource is the bottleneck is a genuine engineering skill, and exam questions increasingly ask you to justify a choice in a given context rather than to name an abstractly "fastest" algorithm.
Big-O notation describes an upper bound on an algorithm's growth rate: it tells us the largest the running time can become as a function of n, ignoring constant factors. Formally, we say a function T(n) is O(f(n)) if there exist positive constants c and n0 such that T(n)≤c⋅f(n) for all n≥n0. In plain English: beyond some input size, T(n) never grows faster than a constant multiple of f(n).
We write the order as O(f(n)), where f(n) is a function of n such as n, n2 or logn.
These rules follow directly from the formal definition: a constant multiple is absorbed by the constant c, and a lower-order term is eventually overtaken by the dominant one once n≥n0.
The following table summarises the orders of complexity named in the specification:
| Big-O | Name | Example Algorithm | Growth behaviour |
|---|---|---|---|
| O(1) | Constant | Accessing an array element by index | Independent of n |
| O(logn) | Logarithmic | Binary search | Grows very slowly |
| O(n) | Linear | Linear search | Grows in proportion to n |
| O(nlogn) | Linearithmic | Merge sort, quick sort (average) | Just above linear |
| O(n2) | Polynomial (quadratic) | Bubble, insertion, selection sort | Grows rapidly |
| O(2n) | Exponential | Brute-force subset generation | Doubles per extra element |
| O(n!) | Factorial | Brute-force travelling salesman | Intractable almost immediately |
Note that "polynomial" covers any O(nk) for a fixed power k — O(n2) and O(n3) are both polynomial orders. The specification draws a sharp dividing line between polynomial orders (regarded as tractable) and exponential/factorial orders (regarded as intractable): a problem with no known polynomial-time solution becomes impossible to solve for large inputs no matter how fast the hardware.
Watch how dramatically the classes diverge as n increases:
| n | O(1) | O(logn) | O(n) | O(nlogn) | O(n2) | O(2n) |
|---|---|---|---|---|---|---|
| 1 | 1 | 0 | 1 | 0 | 1 | 2 |
| 10 | 1 | 3.3 | 10 | 33 | 100 | 1,024 |
| 100 | 1 | 6.6 | 100 | 664 | 10,000 | ≈1.27×1030 |
| 1,000 | 1 | 10 | 1,000 | 10,000 | 1,000,000 | astronomically large |
| 1,000,000 | 1 | 20 | 1,000,000 | 2×107 | 1012 | astronomically large |
Notice that at n=1,000,000 a logarithmic algorithm needs about 20 operations while a quadratic one needs a trillion — a factor of fifty billion. This single comparison is why complexity analysis dominates serious software design.
It helps to attach a concrete mental image to each order:
A practical heuristic for exam questions: a single loop is usually O(n), a loop inside a loop is usually O(n2), a halving loop is O(logn), and "try every combination/ordering" is exponential or factorial. Verify the heuristic against the loop update steps, as the hidden-logarithm example later in this lesson shows.
Exam Tip: You must be able to state the Big-O complexity of every algorithm on the specification. Examiners frequently ask you to "state the time complexity" or to "explain why algorithm X is more efficient than algorithm Y for large datasets" — the answer is always rooted in the orders above, never in clock speed.
The reliable technique is to count how many times the innermost operation executes as a function of n, then apply the three simplification rules.
FUNCTION getFirst(arr)
RETURN arr[0]
END FUNCTION
However large the array, this performs exactly one indexing operation. No loop depends on n, so the order is O(1).
FUNCTION sumAll(arr, n)
total = 0
FOR i = 0 TO n - 1
total = total + arr[i]
END FOR
RETURN total
END FUNCTION
The loop body runs n times. The constant work outside the loop (initialising total, the final RETURN) is absorbed, giving O(n).
FUNCTION printPairs(arr, n)
FOR i = 0 TO n - 1
FOR j = 0 TO n - 1
OUTPUT arr[i], arr[j]
END FOR
END FOR
END FUNCTION
The outer loop runs n times; for each of those iterations the inner loop runs n times. Total executions of OUTPUT =n×n=n2, so the order is O(n2).
FUNCTION countHalvings(n)
count = 0
WHILE n > 1
n = n DIV 2
count = count + 1
END WHILE
RETURN count
END FUNCTION
Here n is halved each iteration rather than decremented. The number of halvings needed to reduce n to 1 is log2n, so the order is O(logn). This same halving behaviour is exactly what makes binary search logarithmic.
But beware: not every nested loop is quadratic. If an inner loop halves its range each pass you may get O(nlogn) rather than O(n2), and if two loops simply run one after another they add rather than multiply (O(n)+O(n)=O(n)). Always trace what each loop bound actually depends on.
| Case | Meaning | Example (Linear Search) |
|---|---|---|
| Best case | The fewest operations over any input of size n | Target is the first element: O(1) |
| Worst case | The most operations over any input of size n | Target is last or absent: O(n) |
| Average case | Expected operations over all equally likely inputs | About n/2 checks: O(n) |
Big-O conventionally refers to the worst case unless stated otherwise, because it gives a guarantee: the algorithm will never perform worse than the stated bound. The worst and average cases for linear search share the same order (O(n)) because dropping the constant turns n/2 into n — a reminder that Big-O classifies growth, not exact counts.
Exam Tip: When a question says "state the time complexity", give the worst-case Big-O unless told otherwise. If a question asks specifically for the best case, you must also describe the input that produces it — marks are awarded for the justification, not just the symbol.
Question: State the time complexity of the following pseudocode and justify your answer.
FOR i = 0 TO n - 1
FOR j = i + 1 TO n - 1
IF arr[j] < arr[i] THEN
SWAP arr[i], arr[j]
END IF
END FOR
END FOR
Answer: The inner loop does not run the full n times on every outer pass — its lower bound is i + 1. When i=0 it runs n−1 times, when i=1 it runs n−2 times, and so on down to 1. The total number of inner iterations is therefore
(n−1)+(n−2)+⋯+1=2n(n−1)=21n2−21n.
Dropping the constant factor 21 and the lower-order −21n term leaves O(n2). The key insight for full marks is that even though the inner loop shrinks each pass, the sum is still a quadratic in n.
Question: Determine the time complexity of the following and justify it.
i = 1
WHILE i < n
FOR j = 0 TO n - 1
OUTPUT i, j
END FOR
i = i * 2
END WHILE
Answer: The two loops are nested, but they do not both run n times. The inner FOR loop runs n times on each pass, costing O(n) per pass. The outer WHILE loop, however, doubles i each iteration (1,2,4,8,…), so it executes only log2n times before i reaches n. Multiplying the per-pass cost by the number of passes gives
O(n)×O(logn)=O(nlogn).
This is the trap the "two nested loops means O(n2)" rule of thumb falls into. The order depends not on how many loops there are but on how each loop variable changes: a counter that is multiplied contributes a logarithmic factor, while a counter that is incremented contributes a linear one. Always inspect the update step, not just the nesting depth.
You may meet Big-O written as a set membership, T(n)∈O(f(n)), or with an equals sign, T(n)=O(f(n)). Both are read "T(n) is order f(n)"; the set notation is technically more honest because O(f(n)) is really the set of all functions that grow no faster than f(n). At A-Level either form is accepted. What examiners penalise is sloppiness in the other direction — for example writing O(2n) or O(n2+3) in a final answer. Once you have determined the order you must present it in simplest form: O(n) and O(n2) respectively. Leaving constants or lower-order terms in a stated complexity suggests you have not understood that Big-O describes asymptotic growth, and it can cost the mark.
The most common higher-mark application of complexity is comparing two candidate algorithms for the same task. A disciplined comparison answers three questions in turn.
1. What is each algorithm's order? Suppose Algorithm A runs in TA(n)=100n steps and Algorithm B in TB(n)=2n2 steps. Their orders are O(n) and O(n2) respectively, so A is asymptotically superior.
2. Where do they cross over? Asymptotic superiority does not mean A is faster for every input. We can find the crossover by solving 100n=2n2, which gives n=50. For inputs smaller than 50 elements, B (the O(n2) algorithm) is actually faster despite its worse order, because its smaller constant factor wins while n is tiny. This is why real libraries sometimes switch to a "worse" algorithm for small inputs — a point revisited in the sorting lessons, where insertion sort beats merge sort below a few dozen elements.
3. What happens as n grows large? Beyond the crossover the order dominates utterly. At n=1,000, A needs 100,000 steps while B needs 2,000,000 — twenty times more. At n=1,000,000, A needs 108 steps while B needs 2×1012 — twenty thousand times more. The gap widens without limit, which is precisely what "asymptotically superior" means.
The exam-ready conclusion: for sufficiently large n, a lower-order algorithm always wins, regardless of constant factors. When a question asks which algorithm is "more efficient for large datasets," it is the order you must compare, and you should explicitly note that constant factors are irrelevant in that regime.
The specification cares deeply about one particular boundary in the table of complexity classes: the line between polynomial orders (O(1),O(logn),O(n),O(nlogn),O(n2),O(n3),…) and exponential or factorial orders (O(2n),O(n!),…). A problem for which a polynomial-time algorithm is known is called tractable; a problem for which only exponential-time algorithms are known is, for practical purposes, intractable.
The reason this boundary matters so much is that polynomial growth, however high the power, can always be "outrun" by faster hardware to some useful degree, whereas exponential growth cannot. Consider what happens when you buy a machine that is a thousand times faster:
| Order | Input solvable in 1 minute (old) | Input solvable in 1 minute (1000× faster) | Gain |
|---|---|---|---|
| O(n) | N | 1000N | 1000× larger |
| O(n2) | N | ≈31.6N | 1000× larger |
| O(2n) | N | N+log21000≈N+10 | only 10 more elements |
A thousandfold speed-up lets a linear algorithm handle a thousand times more data, and a quadratic one about thirty times more — meaningful gains. But for an exponential algorithm the same thousandfold speed-up buys you the ability to handle just ten extra elements. This is the formal sense in which exponential problems are hopeless at scale: no foreseeable hardware improvement rescues them, so the only way forward is a better algorithm (a lower order) or an acceptance of approximate answers via heuristics. This theme — tractability, and the use of heuristics when a problem is intractable — recurs throughout the theory-of-computation strand of the course, and the vocabulary you have built here is the foundation for all of it.
A program stores n customer records in an unsorted array.
(a) State, using Big-O notation, the worst-case time complexity of searching the array for a particular record. (1 mark)
(b) The programmer claims that doubling the processor speed will allow the program to handle datasets of any size. Evaluate this claim with reference to time complexity. (6 marks)
AO breakdown: (a) is AO1 (knowledge of the linear-search order). (b) is primarily AO3 (evaluation and reasoned judgement) drawing on AO2 (applying the meaning of Big-O to a real scenario).
(a) O(n).
(b) Doubling the processor speed makes the program run twice as fast, so it can handle bigger datasets in the same time. But if the dataset is very large it will still be slow, because searching takes longer the more records there are. So the claim is partly true but not completely right.
(a) O(n).
(b) The search is O(n), meaning the time grows in proportion to the number of records. Doubling the processor speed halves the constant time per operation, so a dataset that took 10 seconds now takes 5. However, this is only a constant-factor improvement: it does not change the order of growth. If the dataset itself doubles, the work doubles too, cancelling out the hardware gain. The claim that "any size" can be handled is therefore false, because no fixed speed-up can keep pace with unbounded growth in n.
(a) O(n).
(b) The linear search is O(n): worst-case running time T(n)≈cn for some constant c representing the time per comparison. Doubling processor speed replaces c with c/2, so T(n)≈(c/2)n — every input is handled in half the time, a purely constant-factor effect that leaves the order O(n) unchanged. The flaw in the claim is that the input size n is unbounded while the speed-up is a one-off constant. If the firm's customer base grows from one million to one billion records, the work increases a thousandfold; the doubled processor recovers only a factor of two of that, so the program still slows down by roughly 500 times. To handle arbitrarily large datasets the programmer must improve the algorithm's order — for example by keeping the array sorted and using binary search (O(logn)), reducing a billion-record search from about 109 comparisons to roughly 30. Hardware scales the constant; only a better order scales with the problem. The claim is therefore mistaken.
Examiner-style commentary: The mid-band response correctly states the order in (a) and grasps intuitively that speed-up "helps but isn't enough", yet it never distinguishes a constant factor from the order of growth, so it stays descriptive. The stronger response makes that distinction explicitly and reaches the correct verdict. The top-band response earns full marks by quantifying the argument (c→c/2), connecting it to the unbounded nature of n, and proposing an algorithmic remedy with its own order — demonstrating the AO3 evaluation the question targets.
A final technique worth practising is deriving complexity by writing down the exact operation count line by line, then simplifying. Consider this fragment that checks whether an array contains any duplicate value:
FUNCTION hasDuplicate(arr, n)
FOR i = 0 TO n - 1
FOR j = i + 1 TO n - 1
IF arr[i] = arr[j] THEN
RETURN TRUE
END IF
END FOR
END FOR
RETURN FALSE
END FUNCTION
In the worst case (no duplicates, so the function never returns early) the comparison arr[i] = arr[j] is the basic operation. For each value of i the inner loop runs n−1−i times, so the total number of comparisons is
∑i=0n−1(n−1−i)=(n−1)+(n−2)+⋯+1+0=2n(n−1).
This is 21n2−21n, which simplifies to O(n2). The best case, by contrast, occurs when the first two elements are equal: the function returns after a single comparison, giving O(1). Note how the same algorithm has wildly different best- and worst-case orders — a reminder always to state which case you are describing. The average case for random data is still O(n2), because on average the duplicate (if any) is found only after a constant fraction of the quadratic number of comparisons, and a constant fraction of n2 is still O(n2).
This counting approach — write the sum, evaluate it, simplify to the dominant term — is the most reliable method when a "rule of thumb" is ambiguous, and it is exactly what examiners reward in a "derive and justify the time complexity" question.
This content is aligned with the AQA A-Level Computer Science (7517) specification.