You are viewing a free preview of this lesson.
Subscribe to unlock all 10 lessons in this course and every other course on LearningBro.
A great many problems in computing are optimisation problems: they ask not merely for a solution but for the best one — the shortest route, the lowest cost, the maximum profit, the tightest packing. For small inputs we can simply examine every possibility and pick the best. But for many important optimisation problems the number of possibilities explodes so violently with input size that exhaustive search becomes physically impossible — not slow, but impossible, on any conceivable hardware, for inputs of quite modest size. Such problems are intractable. When faced with one, we abandon the demand for a provably optimal answer and instead use a heuristic: a strategy that finds a good enough solution in a reasonable time. This lesson explains the dividing line between tractable and intractable problems, why brute force fails on the wrong side of that line, and the main families of heuristic methods used to cope — with the travelling salesman problem as the motivating example throughout.
This lesson addresses AQA A-Level Computer Science (7517), Section 4.4 Theory of computation, specifically 4.4.4 Classification of algorithms (tractable vs intractable problems, and the use of heuristic methods when exact solutions are infeasible). It also connects to 4.3.6 Optimisation algorithms, since Dijkstra's and A* are optimisation algorithms and A*'s heuristic is a worked instance of the heuristic idea. It depends on the Big-O complexity-class vocabulary of 4.4.4 (polynomial vs exponential vs factorial) and links to the limits-of-computation material (computable vs non-computable problems) in the same section.
An optimisation problem is defined by three ingredients:
The TSP is the textbook optimisation problem and recurs throughout this lesson:
The trouble is the size of that search space. For n=20 cities there are roughly 20!≈2.4×1018 orderings. A computer checking one billion (109) tours per second would need
1092.4×1018=2.4×109 seconds≈76 years
to examine them all — for just twenty cities. Add a handful more and the time exceeds the age of the universe. This is the face of intractability.
The specification draws careful distinctions between four classifications. Keeping them separate is a frequent source of marks.
| Classification | Definition | Example |
|---|---|---|
| Tractable | A problem for which an algorithm exists that runs in polynomial time — O(nk) for some constant k | Sorting, O(nlogn); searching, O(logn) |
| Intractable | A problem that is solvable in principle, but for which no polynomial-time algorithm is known — the best known are exponential or factorial | TSP, O(n!) by brute force |
| Computable | A problem that can be solved by an algorithm at all (it may take impractically long, but it terminates with the right answer) | TSP is computable — brute force does terminate, eventually |
| Non-computable | A problem that no algorithm can ever solve, however much time is allowed | The Halting Problem |
Two relationships are worth stressing. First, intractable problems are still computable — the TSP can be solved exactly, just not quickly; intractability is about time, not possibility. Non-computability is the stronger barrier: the Halting Problem cannot be solved by any algorithm, no matter how slow. Second, tractability is about the existence of a polynomial-time algorithm, not about ease: a problem is formally tractable even if its best algorithm is O(n100), which would be useless in practice. The classification is a theoretical line, and you should be ready to say so.
Exam Tip: "Tractable" does not mean "easy" or "fast" — it means "solvable in polynomial time". O(n100) is technically tractable yet hopelessly impractical, while O(2n) is intractable yet fine for n=10. In practice we treat O(n2) or O(n3) as "reasonable" and exponential/factorial as "infeasible", but the definition is the polynomial boundary.
The reason brute force collapses is the chasm between polynomial and exponential/factorial growth. The table below shows the number of basic operations each order demands as n grows:
| n | n2 (polynomial) | 2n (exponential) | n! (factorial) |
|---|---|---|---|
| 10 | 100 | 1,024 | 3,628,800 |
| 20 | 400 | ≈1.05×106 | ≈2.4×1018 |
| 50 | 2,500 | ≈1.1×1015 | ≈3×1064 |
| 100 | 10,000 | ≈1.3×1030 | ≈9.3×10157 |
At n=100 the polynomial column is a comfortable ten thousand, while the factorial column (≈10157) exceeds the estimated number of atoms in the observable universe (≈1080) by seventy-seven orders of magnitude. No amount of faster hardware rescues this: recall from the complexity lesson that a thousand-fold speed-up lets a linear algorithm handle a thousand times more data but lets an exponential algorithm handle only about ten more elements. When a problem's only known algorithms are exponential or factorial and n is more than tiny, brute force is not slow — it is impossible. This impossibility is the entire motivation for heuristics.
A heuristic is a problem-solving strategy that deliberately trades the guarantee of optimality for speed: it produces a good-enough solution in a reasonable (polynomial) time, accepting that the answer may not be the absolute best.
| Property | Detail |
|---|---|
| Not guaranteed optimal | The solution is usually close to the best, but there is no proof it is the best |
| Much faster | Typically runs in polynomial time (O(n2), say) instead of exponential/factorial |
| Problem-specific | A heuristic encodes domain knowledge; a good TSP heuristic is useless for, say, timetabling |
| A genuine trade-off | Solution quality is exchanged for computation time, and the balance can often be tuned |
A heuristic is, in essence, an educated "rule of thumb". The art lies in choosing a rule that is cheap to apply yet usually lands close to optimal — and in being honest that "usually" and "close" are doing real work, because a heuristic can occasionally produce a poor answer.
A greedy algorithm builds a solution by repeatedly making the choice that looks best right now (the locally optimal choice), never reconsidering. It is the simplest heuristic strategy and often surprisingly effective, though it can be led astray because a locally optimal choice need not lead to a globally optimal result.
Nearest-neighbour heuristic for the TSP:
This runs in O(n2) — for each of n steps it scans the remaining cities to find the closest — which is dramatically faster than the O(n!) brute force, but it gives no guarantee of the shortest tour.
Worked example — where greedy goes wrong. Consider four cities with these symmetric distances:
graph LR
A["A"] ---|1| B["B"]
B ---|1| C["C"]
C ---|1| D["D"]
A ---|2| C["C"]
B ---|2| D["D"]
A ---|9| D["D"]
Distances: A–B = 1, B–C = 1, C–D = 1, A–C = 2, B–D = 2, A–D = 9.
Greedy (nearest neighbour) from A: the nearest city to A is B (1). From B, the unvisited nearest is C (1). From C, the only unvisited city is D (1). Then return D→A (9). Tour A→B→C→D→A has length 1+1+1+9=12.
A better tour: A→C→D→B→A has length 2+1+2+1=6 — half the greedy tour's length. The greedy choice of B→C→D marched the route into the corner at D, from which the only way home was the expensive 9-length edge. This concretely demonstrates the central weakness of greedy methods: an irrevocable locally optimal choice (always step to the nearest city) can force a globally poor outcome. The heuristic is fast and often good, but here it is twice the optimal — exactly the kind of failure that motivates the more sophisticated methods below.
Inspired by the metallurgical process of slowly cooling a metal so its atoms settle into a low-energy crystal:
The crucial idea is that occasionally accepting a worse solution — especially early, when the temperature is high — lets the search climb out of a local optimum (a solution better than all its near neighbours but not the global best). A purely greedy search gets permanently stuck in the first local optimum it reaches; simulated annealing's controlled randomness gives it a route out, gradually settling as the temperature falls.
Inspired by biological evolution and natural selection:
Over successive generations the population tends to evolve towards better solutions, because fitter candidates are more likely to pass their features on. Genetic algorithms suit very large, rugged search spaces and parallelise naturally (many candidates can be evaluated at once).
You have already met a heuristic at work: the function h(n) in A* search estimates the remaining cost to the goal. A* is a useful bridge between this lesson and the optimisation algorithms, because here the heuristic accelerates an exact search rather than approximating an answer — and its quality directly governs performance:
| Heuristic quality | Effect on A* |
|---|---|
| h(n)=0 | A* degenerates to Dijkstra's — no guidance, explores in all directions |
| h(n) = exact remaining cost | A* walks straight to the goal — perfect guidance, no wasted exploration |
| h(n) slightly underestimates (admissible) | A* is fast and still finds the optimal path |
| h(n) overestimates (inadmissible) | A* is faster still but may miss the optimal path |
The pattern mirrors the general heuristic trade-off: a more aggressive (tighter) estimate speeds the search, but pushing it past the point of admissibility sacrifices the guarantee of an optimal answer — quality traded for speed.
| Approach | Guarantees optimal? | Time | When to use |
|---|---|---|---|
| Brute force | Yes | O(n!) or O(2n) | Only when n is very small |
| Greedy heuristic | No | O(n2) or similar | A quick approximation is needed and occasional poor answers are tolerable |
| Simulated annealing | No (but often near-optimal) | Depends on the cooling schedule | Rugged landscapes with many local optima |
| Genetic algorithm | No (but often near-optimal) | Depends on population size and generations | Very large search spaces; parallelisable |
The single thread running through the table is the quality-versus-time trade-off: the only method guaranteeing optimality (brute force) is the only one that is infeasible at scale, and every practical method buys its speed by relaxing the optimality guarantee.
A courier firm must plan a daily round that visits all of its drop-off points and returns to the depot, minimising total distance.
(a) The firm has 15 drop-off points. Explain why finding the guaranteed shortest round by brute force is impractical. (3 marks)
Subscribe to continue reading
Get full access to this lesson and all 10 lessons in this course.