You are viewing a free preview of this lesson.
Subscribe to unlock all 10 lessons in this course and every other course on LearningBro.
A graph is a data structure consisting of vertices (nodes) connected by edges. Graphs model an enormous range of real-world systems: road networks, social networks, the hyperlink structure of the web, and computer networks. Graph traversal means visiting every reachable vertex in a systematic way. At A-Level you must understand two traversal algorithms — Breadth-First Search (BFS) and Depth-First Search (DFS). This lesson covers BFS, which fans out level by level using a queue, and as a direct consequence finds the shortest path in unweighted graphs.
This lesson addresses AQA A-Level Computer Science (7517), Section 4.3 Fundamentals of algorithms, specifically 4.3.1 Graph-traversal algorithms, where you must understand and be able to trace breadth-first search. The graph itself is the data structure from 4.2.5 (graphs), and BFS's defining mechanism — a first-in-first-out queue — is the abstract data type from 4.2.3 (queues). The complexity analysis O(V+E) draws on 4.4.4 (classification of algorithms).
A little vocabulary makes the rest of the lesson precise. A vertex (plural vertices), also called a node, is a single entity in the graph; an edge is a connection between two vertices. In an undirected graph an edge has no direction (a friendship: if A knows B, B knows A), whereas in a directed graph (or digraph) an edge points one way (a one-way street, or "A follows B" on social media). The degree of a vertex is the number of edges incident to it. A path is a sequence of vertices each connected to the next by an edge, and a graph is connected if there is a path between every pair of vertices. A weighted graph attaches a number (a cost, distance or time) to each edge; an unweighted graph does not, which is equivalent to every edge having weight 1. BFS, the subject of this lesson, operates on unweighted graphs and treats every edge as a single unit step — which is exactly why it counts number of edges rather than total distance.
Before traversing a graph you must represent it in memory. Two standard representations are used, and the choice affects both the memory used and the cost of the traversal.
A 2D array where entry [i][j]=1 if there is an edge from vertex i to vertex j, and 0 otherwise. For the example graph with edges A–B, A–C, B–D and C–D:
| A | B | C | D | |
|---|---|---|---|---|
| A | 0 | 1 | 1 | 0 |
| B | 1 | 0 | 0 | 1 |
| C | 1 | 0 | 0 | 1 |
| D | 0 | 1 | 1 | 0 |
The matrix is symmetric for an undirected graph, because edge A–B is recorded as both [A][B] and [B][A].
Each vertex stores a list of its neighbours:
| Vertex | Adjacent vertices |
|---|---|
| A | B, C |
| B | A, D |
| C | A, D |
| D | B, C |
| Feature | Adjacency Matrix | Adjacency List |
|---|---|---|
| Space | O(V2) | O(V+E) |
| Check if a specific edge exists | O(1) | O(degree) |
| List all neighbours of a vertex | O(V) | O(degree) |
| Best for | Dense graphs (many edges) | Sparse graphs (few edges) |
Most real graphs (road and social networks) are sparse — far fewer edges than the V2 a matrix reserves — so adjacency lists usually win on memory, which is why BFS is normally analysed over an adjacency-list representation.
The representation also affects BFS's running time directly. With an adjacency list, finding all neighbours of a vertex takes time proportional to that vertex's degree, and summing degrees over all vertices gives 2E (each edge is counted at both its endpoints), so the neighbour-scanning across the whole traversal totals O(E) and BFS runs in O(V+E). With an adjacency matrix, finding a vertex's neighbours requires scanning an entire row of V entries regardless of how many neighbours actually exist, so the traversal costs O(V2) — much worse on a sparse graph where E≪V2. This is a concrete instance of how a data-structure choice changes an algorithm's complexity, and it is why "O(V+E)" implicitly assumes an adjacency-list representation. If a question specifies an adjacency matrix, the honest answer for the traversal cost is O(V2).
BFS explores a graph level by level. Starting from a source vertex it first visits every vertex at distance 1, then every vertex at distance 2, and so on, never moving to the next level until the current one is exhausted. The structure that enforces this order is a queue (FIFO): vertices are processed in exactly the order they were discovered.
A helpful image is a ripple spreading outward from a stone dropped in a pond. The source is where the stone lands; the ripple reaches all the points one unit away simultaneously, then all the points two units away, and so on. BFS is that ripple made discrete: it forms a "frontier" of vertices at the current distance, processes the whole frontier, and in doing so discovers the next frontier one step further out. The queue is what keeps the frontiers in order — because vertices are added to the back and removed from the front, a vertex two steps away can never be processed before a vertex one step away that was discovered earlier. This is the entire secret of the algorithm, and it is the reason BFS, and not DFS, finds shortest paths.
The single most important rule is to mark a vertex visited the moment it is enqueued, not when it is later dequeued. Marking on dequeue allows the same vertex to be enqueued several times by different neighbours before it is ever processed.
PROCEDURE BFS(graph, start)
CREATE queue
CREATE visited set
ENQUEUE start
ADD start TO visited
WHILE queue IS NOT EMPTY
v = DEQUEUE
OUTPUT v
FOR EACH neighbour OF v IN graph
IF neighbour NOT IN visited THEN
ADD neighbour TO visited
ENQUEUE neighbour
END IF
END FOR
END WHILE
END PROCEDURE
A Python implementation using a deque as the queue:
from collections import deque
def bfs(graph, start):
visited = {start}
queue = deque([start])
order = []
while queue:
v = queue.popleft() # dequeue from the front (FIFO)
order.append(v)
for neighbour in graph[v]:
if neighbour not in visited:
visited.add(neighbour) # mark on enqueue
queue.append(neighbour) # enqueue at the back
return order
Consider this graph:
graph LR
A["A"] --- B["B"]
B --- E["E"]
A --- C["C"]
C --- D["D"]
D --- E
Adjacency list (neighbours considered in the listed order):
| Vertex | Adjacent vertices |
|---|---|
| A | B, C |
| B | A, E, D |
| C | A, D |
| D | B, C, E |
| E | B, D |
BFS starting from A, showing the queue contents (front on the left) and the visited set after each step:
| Step | Action | Dequeued | Newly enqueued | Queue after | Visited set |
|---|---|---|---|---|---|
| 0 | Enqueue start A | — | A | [A] | {A} |
| 1 | Process A; neighbours B, C unvisited | A | B, C | [B, C] | {A, B, C} |
| 2 | Process B; neighbours E, D unvisited (A seen) | B | E, D | [C, E, D] | {A, B, C, E, D} |
| 3 | Process C; neighbours A, D already visited | C | — | [E, D] | {A, B, C, E, D} |
| 4 | Process E; neighbours B, D already visited | E | — | [D] | {A, B, C, E, D} |
| 5 | Process D; all neighbours visited | D | — | [] | {A, B, C, E, D} |
BFS visit order: A, B, C, E, D.
Notice how the order tracks distance from the start: A is at distance 0; B and C (enqueued by A) are at distance 1; E and D (enqueued by B) are at distance 2. The FIFO queue guarantees all of level 1 is processed before any of level 2 — which is precisely why BFS finds shortest paths.
To consolidate the method, run BFS from vertex 1 on this larger graph:
graph LR
N1["1"] --- N2["2"]
N1 --- N3["3"]
N2 --- N4["4"]
N3 --- N4
N3 --- N5["5"]
N4 --- N6["6"]
N5 --- N6
Adjacency list:
| Vertex | Adjacent vertices |
|---|---|
| 1 | 2, 3 |
| 2 | 1, 4 |
| 3 | 1, 4, 5 |
| 4 | 2, 3, 6 |
| 5 | 3, 6 |
| 6 | 4, 5 |
BFS from 1:
| Step | Dequeued | Newly enqueued | Queue after | Visited set |
|---|---|---|---|---|
| 0 | — | 1 | [1] | {1} |
| 1 | 1 | 2, 3 | [2, 3] | {1, 2, 3} |
| 2 | 2 | 4 | [3, 4] | {1, 2, 3, 4} |
| 3 | 3 | 5 (4 already queued) | [4, 5] | {1, 2, 3, 4, 5} |
| 4 | 4 | 6 | [5, 6] | {1, 2, 3, 4, 5, 6} |
| 5 | 5 | — (3, 6 seen) | [6] | {1, 2, 3, 4, 5, 6} |
| 6 | 6 | — | [] | {1, 2, 3, 4, 5, 6} |
Visit order: 1, 2, 3, 4, 5, 6. The distances from vertex 1 are: vertex 1 at distance 0; vertices 2 and 3 at distance 1; vertices 4 and 5 at distance 2; vertex 6 at distance 3 (via 1→2→4→6 or 1→3→5→6, both length 3). Step 3 illustrates the importance of marking on enqueue: when vertex 3 is processed it sees neighbour 4, but 4 was already enqueued by vertex 2 in step 2, so it is correctly skipped rather than added a second time.
| Property | Detail |
|---|---|
| Data structure | Queue (FIFO) |
| Time complexity | O(V+E) |
| Space complexity | O(V) |
| Finds shortest path? | Yes — in unweighted graphs |
| Complete? | Yes — finds a solution if one exists |
Each vertex is enqueued and dequeued exactly once (the visited set prevents re-enqueueing), contributing O(V). Whenever a vertex is processed, all of its incident edges are examined; summed over the whole run, every edge is examined once in a directed graph (twice in an undirected one, once from each endpoint), contributing O(E). The total is therefore O(V+E). The space is O(V) because the queue and visited set together hold at most all V vertices.
We write the complexity as O(V+E) rather than collapsing it to a single variable because V and E can vary independently. In a sparse graph where E is roughly proportional to V (such as a road network, where each junction connects to only a handful of others), the cost is effectively O(V) — linear in the number of vertices. In a dense graph where E approaches V2 (every vertex connected to every other), the E term dominates and the cost is closer to O(V2). Keeping both terms in the expression makes BFS's behaviour honest across the whole range of graph densities, which is why examiners expect the two-variable form.
| Application | Why BFS fits |
|---|---|
| Shortest path (unweighted) | The first time BFS reaches a vertex is via a path of minimum edge count |
| Web crawlers | Crawl pages outward level by level from a seed URL |
| Social network analysis | Find everyone within k "hops" (e.g. friends-of-friends) of a user |
| Connected components | A single BFS reveals every vertex reachable from the start |
| Peer-to-peer / broadcast networks | Flood a message to nearest nodes first |
These applications all exploit the same underlying property — that BFS discovers vertices in order of distance — but it is worth seeing how:
Subscribe to continue reading
Get full access to this lesson and all 10 lessons in this course.