Travelling Salesman Problem

The Travelling Salesman Problem (TSP) asks for the cheapest round trip: starting from a base, visit every vertex exactly once and return to the start, minimising the total weight. A delivery driver covering every depot, a robot drill visiting every hole on a circuit board, a tourist planning a single loop through every city — all reduce to this one abstract question. It is the partner of the route inspection problem: route inspection traverses every edge (and is easy), whereas the TSP visits every vertex (and is famously hard). This lesson develops the exam machinery — the nearest-neighbour upper bound and the delete-a-vertex lower bound — and shows how squeezing the two bounds together can pin down, and sometimes prove, the optimal tour.

1. Where this sits in AQA 7367

This is a headline optimisation topic of the Paper 3 Discrete option (7367/3D) (Paper 3: 2 h, 100 marks, 33⅓%; AO1 40% / AO2 25% / AO3 35%). Computing a nearest-neighbour tour or an MST-based lower bound is AO1 (carry out a routine); choosing a good starting vertex, interpreting the bound gap, and deciding whether optimality is proven are AO2/AO3. Because Paper 3 is the most problem-solving-weighted paper, TSP questions reward candidates who state bounds precisely and reason about what they do and do not establish.

2. Core theory: the classical and practical problems

A tour (Hamiltonian cycle) of a network on $n$ vertices is a closed walk that visits every vertex exactly once. Its weight is the sum of the $n$ edge weights traversed:

$w(\text{tour}) = \sum_{i=1}^{n} w(v_i\,v_{i+1}), \qquad v_{n+1} = v_1.$

We distinguish two versions:

Classical TSP — the network is complete (every pair joined) and weights satisfy the triangle inequality $w(XZ) \le w(XY) + w(YZ)$ . Every Hamiltonian cycle is then a legal tour.
Practical TSP — the network may be incomplete or violate the triangle inequality (a direct road may be longer than a detour). We first replace each weight by the shortest-path distance (a complete table of least distances, e.g. via Dijkstra), turning a practical problem into a classical one, then quote the route in the original network.

A complete network on $n$ vertices has

$\frac{(n-1)!}{2}$

distinct tours (fix the start; $(n-1)!$ orderings of the rest; halve for the two directions). This count explodes:

$n$	Number of tours $\tfrac{(n-1)!}{2}$
4	$3$
5	$12$
6	$60$
10	$181{,}440$
20	$\approx 6 \times 10^{16}$

No known algorithm solves the general TSP in polynomial time — it is NP-hard — so for all but the smallest networks we abandon exhaustive search and bracket the optimum between a computable upper and lower bound.

From practical to classical: the table of least distances

The reason we insist on the triangle inequality is subtle but important. In a real road network a direct link X–Z may be longer than a detour X–Y–Z (a motorway loop versus a country shortcut). If we ran nearest neighbour on the raw weights, a "move to the nearest vertex" might cost more than reaching it via a third vertex, and the resulting figure would not be a meaningful tour length. The fix is to first build the complete table of least distances: replace every entry $w(XZ)$ by the shortest-path distance $d(XZ)$ computed (for instance) by Dijkstra. By construction these satisfy $d(XZ) \le d(XY) + d(YZ)$ , so the triangle inequality holds and the classical methods apply. After solving, each leg of the optimal tour is then "expanded" back into its underlying shortest route in the original network — so a reported tour such as A–C may secretly travel A–B–C on the ground. Examiners flag this with the phrase "complete network of shortest distances"; treat any such table as already pre-processed.

Two consequences are worth stating cleanly:

$d(XX) = 0, \qquad d(XZ) = d(ZX) \ (\text{symmetric}), \qquad d(XZ) \le d(XY) + d(YZ).$

These three properties make $d$ a metric, which is exactly what the approximation guarantees of §12 rely on.

3. The nearest-neighbour upper bound (with M1/A1 mark scheme)

Any actual tour is, by definition, at least as long as the optimum, so any tour gives an upper bound. The nearest-neighbour (NN) algorithm builds a cheap-ish tour greedily:

Start at a chosen vertex; mark it visited.
Go to the nearest unvisited vertex; mark it visited.
Repeat until all vertices are visited.
Return directly to the start.

Worked Example 3.1 — nearest neighbour from several starts

Distance matrix for towns A, B, C, D:

	A	B	C	D
A	–	5	8	12
B	5	–	6	9
C	8	6	–	3
D	12	9	3	–

Find nearest-neighbour tours from A and from B, and hence the best upper bound.

From A: A $\to$ B (nearest, $5$ ) $\to$ C (nearest unvisited, $6$ ) $\to$ D ( $3$ ) $\to$ A ( $12$ ). $w = 5 + 6 + 3 + 12 = 26.$ (M1 for a correct greedy choice at each step; A1 for the total $26$ .)

From B: B $\to$ A ( $5$ ) $\to$ C ( $8$ , since AC $=8 <$ AD $=12$ ) $\to$ D ( $3$ ) $\to$ B ( $9$ ). $w = 5 + 8 + 3 + 9 = 25.$ (A1 for $25$ .)

The smaller of the two is the best upper bound so far: $25$ (tour B–A–C–D–B, equivalently A–C–D–B–A). The optimum satisfies $L^{*} \le 25$ .

graph LR
  A((A)) ---|5| B((B))
  B ---|6| C((C))
  C ---|3| D((D))
  D ---|12| A

Exam tip. NN is greedy and short-sighted — its last edge back to the start is often expensive, as the $12$ above shows. Always try every allowed starting vertex and quote the smallest total. Quote it as "upper bound $=25$ from nearest neighbour starting at B".

A nearest-neighbour tour can often be improved by a simple local search called 2-opt: pick two non-adjacent edges of the current tour, delete them, and reconnect the two resulting paths the other way round. If the swap shortens the tour, keep it; repeat until no swap helps. For example, a tour containing two long "crossing" edges can usually be un-crossed for a saving — geometrically, an optimal tour in the plane never crosses itself. While 2-opt is not part of the routine examined procedure, knowing it explains why the nearest-neighbour figure is only an upper bound and how the gap to the optimum is closed in practice. The key exam discipline remains unchanged: report the best tour you have found as the upper bound, and never assume it is optimal until a lower bound confirms it.

4. The delete-a-vertex lower bound (with M1/A1 mark scheme)

A lower bound is a value the optimum cannot drop below. The standard construction removes one vertex and rebuilds a minimum connection:

Delete a vertex $V$ and all its incident edges.
Find a minimum spanning tree (MST) of the remaining $n-1$ vertices (weight $w_{\text{MST}}$ ).
Add back the two shortest edges from $V$ to the rest.
Lower bound $= w_{\text{MST}} + (\text{two shortest edges at } V)$ .

Why it is a valid lower bound. Any tour, with $V$ deleted, becomes a Hamiltonian path on the remaining vertices — a spanning tree of them — so it costs at least $w_{\text{MST}}$ ; and the tour uses exactly two edges at $V$ , each at least the smallest available, so at least the two shortest. Summing gives a quantity no tour can beat.

Worked Example 4.1 — lower bound by deleting a vertex

Using the same A, B, C, D matrix, compute the lower bound from deleting A, and from deleting C; hence give the tightest bound.

Delete A. Remaining B, C, D with edges BC $=6$ , BD $=9$ , CD $=3$ . MST: CD $(3)$ , BC $(6)$ $\Rightarrow w_{\text{MST}} = 9$ . Two shortest at A: AB $(5)$ , AC $(8)$ . $\text{LB}_A = 9 + 5 + 8 = 22.$ (M1 for MST of the reduced graph; A1 for $22$ .)

Delete C. Remaining A, B, D with edges AB $=5$ , AD $=12$ , BD $=9$ . MST: AB $(5)$ , BD $(9)$ $\Rightarrow w_{\text{MST}} = 14$ . Two shortest at C: CD $(3)$ , CB $(6)$ . $\text{LB}_C = 14 + 3 + 6 = 23.$ (A1 for $23$ .)

Deleting C gives the larger — hence better — lower bound, $23$ . Combining with §3, $23 \le L^{*} \le 25.$

Exam tip. A larger lower bound is a better lower bound, so try deleting each vertex and keep the maximum. Marks are lost when candidates report the first bound they compute rather than the best.

5. Squeezing the bounds: a full five-town walkthrough

When the upper and lower bounds coincide, the optimum is proven. Here is a complete network where they meet.

Five towns with distances (all distinct):

	A	B	C	D	E
A	–	9	12	8	11
B	9	–	7	10	6
C	12	7	–	13	5
D	8	10	13	–	4
E	11	6	5	4	–

Upper bound — nearest neighbour from A. A $\to$ D ( $8$ ) $\to$ E ( $4$ ) $\to$ C ( $5$ ) $\to$ B ( $7$ ) $\to$ A ( $9$ ): $w = 8 + 4 + 5 + 7 + 9 = 33. \quad (\text{UB} = 33,\ \text{tour A–D–E–C–B–A}.)$

Lower bound — delete E. Remaining A, B, C, D with edges AB $=9$ , AC $=12$ , AD $=8$ , BC $=7$ , BD $=10$ , CD $=13$ . Sort: BC $(7)$ , AD $(8)$ , AB $(9)$ , BD $(10)$ , AC $(12)$ , CD $(13)$ . Kruskal adds BC, AD, AB (the next two would close cycles): $w_{\text{MST}} = 7 + 8 + 9 = 24.$ Two shortest at E: ED $(4)$ , EC $(5)$ , sum $9$ . Hence $\text{LB}_E = 24 + 9 = 33.$

The bounds meet at $33$ , so the optimum is exactly $L^{*} = 33$ , achieved by the tour A–D–E–C–B–A (equivalently A–B–C–E–D–A). (For contrast, deleting A gives only $\text{LB}_A = w_{\text{MST}} + 8 + 9 = 15 + 17 = 32$ , one short — which is exactly why you try each deletion. The remainder-MST there is DE $(4)$ , CE $(5)$ , BE $(6)$ $= 15$ .)

Because this network has only $12$ distinct tours, we can also verify the optimum directly — and it is reassuring to see the bounds were honest:

Tour from A	Weight
A–B–C–E–D–A	$9+7+5+4+8 = 33$ (optimal)
A–C–B–E–D–A	$12+7+6+4+8 = 37$
A–B–E–C–D–A	$9+6+5+13+8 = 41$
A–C–E–B–D–A	$12+5+6+10+8 = 41$
A–D–B–C–E–A	$8+10+7+5+11 = 41$
A–B–D–E–C–A	$9+10+4+5+12 = 40$

The smallest is $33$ , confirming the bound-squeeze. In an exam you would not be expected to enumerate — the meeting of the bounds is the proof — but the table shows why the method is trustworthy.

graph LR
  A((A)) ---|8| D((D))
  D ---|4| E((E))
  E ---|5| C((C))
  C ---|7| B((B))
  B ---|9| A

Travelling Salesman Problem

Travelling Salesman Problem

1. Where this sits in AQA 7367

2. Core theory: the classical and practical problems

From practical to classical: the table of least distances

3. The nearest-neighbour upper bound (with M1/A1 mark scheme)

Worked Example 3.1 — nearest neighbour from several starts

4. The delete-a-vertex lower bound (with M1/A1 mark scheme)

Worked Example 4.1 — lower bound by deleting a vertex

5. Squeezing the bounds: a full five-town walkthrough

6. Specimen-style exam question

More in Mathematics