You are viewing a free preview of this lesson.
Subscribe to unlock all 10 lessons in this course and every other course on LearningBro.
This lesson covers the graph — the most general data structure in the course, modelling arbitrary relationships between objects with no hierarchy and no restriction on connections. Where a tree forces one parent per node and no cycles, a graph allows any vertex to connect to any other, making it the natural model for road networks, social networks, the web, dependency systems, and state spaces. The examinable core is the terminology, the two ways to represent a graph in memory (adjacency matrix versus adjacency list) with their space/time trade-offs, and the link to the traversal and shortest-path algorithms of §4.3.
This lesson addresses AQA A-Level Computer Science (7517), section 4.2.8 (Graphs), and connects forward to §4.3.2 (graph traversal — BFS and DFS) and §4.3 (Dijkstra's shortest-path algorithm). It covers the formal definition G=(V,E), the directed/undirected and weighted/unweighted distinctions, the full vocabulary (vertex, edge, adjacent, degree, path, cycle, connected), the adjacency matrix and adjacency list representations with worked conversions, and an introduction to the breadth-first and depth-first traversals that the algorithms unit develops. It generalises §4.2.7 (a tree is a restricted graph) and reuses §4.2.3/§4.2.4 (the queue and stack that drive the traversals). All wording below is original; AQA's published specification text is not reproduced.
A graph is defined as a pair of sets:
G=(V,E)
where V is a set of vertices (nodes) and E is a set of edges, each edge joining a pair of vertices. That is the entire definition — and its generality is the point. Unlike a tree, a graph imposes no root, no parent–child hierarchy, and crucially permits cycles (paths that return to their start). A tree is in fact just a special graph: connected, acyclic, with a single distinguished root. Removing those restrictions gives the graph its power to model mutual, many-to-many relationships.
| Term | Definition |
|---|---|
| Vertex (node) | A single object in the graph (e.g. a city, a person, a web page). |
| Edge (arc) | A connection between two vertices (e.g. a road, a friendship, a hyperlink). |
| Adjacent | Two vertices are adjacent (neighbours) if an edge directly joins them. |
| Degree | The number of edges incident to a vertex. (In a directed graph: in-degree and out-degree.) |
| Path | A sequence of vertices connected by edges. |
| Cycle | A path that starts and ends at the same vertex. |
| Connected | A graph in which there is a path between every pair of vertices. |
| Weight | A numeric cost attached to an edge (distance, time, price). |
| Loop | An edge from a vertex to itself. |
| Type | Description | Real-world example |
|---|---|---|
| Undirected | Edges have no direction; the relationship is mutual. If A–B exists, you can travel both ways. | Facebook friendships — if A is friends with B, B is friends with A. |
| Directed (digraph) | Edges have a direction; the relationship is one-way. A→B does not imply B→A. | Twitter/X "follows" — A following B does not mean B follows A. |
flowchart LR
subgraph Undirected
UA((A)) --- UB((B))
UA --- UC((C))
UB --- UD((D))
UC --- UD
end
subgraph Directed
DA((A)) --> DB((B))
DA --> DC((C))
DC --> DD((D))
DD --> DB
end
| Type | Description | Real-world example |
|---|---|---|
| Unweighted | All edges are equivalent; only connectivity matters. | "Are these two computers on the same network?" |
| Weighted | Each edge carries a numeric cost. | Road networks (distance), flight routes (price/time). |
flowchart LR
A((A)) -- "5" --- B((B))
A -- "3" --- C((C))
B -- "2" --- D((D))
C -- "7" --- D
These two axes are independent, giving four combinations; a road map, for instance, is typically undirected and weighted, while a task-dependency chart is directed and unweighted. Identifying which kind a scenario calls for is a common opening to an exam question.
A graph diagram is for humans; a program needs a concrete representation. The two standard choices are the adjacency matrix and the adjacency list, and AQA expects fluency in both, including converting a diagram to each.
An adjacency matrix is a V×V 2D array (§4.2.1) in which cell [i][j] records the edge from vertex i to vertex j: 1 (or the weight) if an edge exists, 0 if it does not.
Vertices: A(0), B(1), C(2), D(3) Edges: A–B, A–C, B–D, C–D (undirected, unweighted):
| A | B | C | D | |
|---|---|---|---|---|
| A | 0 | 1 | 1 | 0 |
| B | 1 | 0 | 0 | 1 |
| C | 1 | 0 | 0 | 1 |
| D | 0 | 1 | 1 | 0 |
Two structural facts to state in exams: for an undirected graph the matrix is symmetric about the main diagonal (because edge A–B implies B–A, so [i][j]=[j][i]); for a weighted graph the cells hold the weights instead of 1:
| A | B | C | D | |
|---|---|---|---|---|
| A | 0 | 5 | 3 | 0 |
| B | 5 | 0 | 0 | 2 |
| C | 3 | 0 | 0 | 7 |
| D | 0 | 2 | 7 | 0 |
For a directed graph the matrix is generally not symmetric — a row represents out-edges and a column in-edges.
# Adjacency matrix for the undirected weighted graph above
vertices = ['A', 'B', 'C', 'D']
matrix = [
[0, 5, 3, 0], # A's row: edges A-B (5), A-C (3)
[5, 0, 0, 2], # B
[3, 0, 0, 7], # C
[0, 2, 7, 0], # D
]
# Edge existence / weight between A (0) and B (1) is an O(1) lookup
if matrix[0][1] != 0:
print(f"Edge A-B with weight {matrix[0][1]}") # Edge A-B with weight 5
An adjacency list stores, for each vertex, a list of just its neighbours (with weights if any). Vertices with few edges store little; absent edges occupy no space at all.
| Vertex | Neighbours (vertex, weight) |
|---|---|
| A | (B, 5), (C, 3) |
| B | (A, 5), (D, 2) |
| C | (A, 3), (D, 7) |
| D | (B, 2), (C, 7) |
# Adjacency list for the same undirected weighted graph
graph = {
'A': [('B', 5), ('C', 3)],
'B': [('A', 5), ('D', 2)],
'C': [('A', 3), ('D', 7)],
'D': [('B', 2), ('C', 7)],
}
for neighbour, weight in graph['A']: # list A's neighbours
print(f"A -> {neighbour} (weight {weight})")
| Feature | Adjacency matrix | Adjacency list |
|---|---|---|
| Space | O(V2) — every cell, edge or not | O(V+E) — only real edges |
| Edge exists u–v? | O(1) — direct cell lookup | O(degree(u)) — scan u's list |
| Find all neighbours of u | O(V) — scan the whole row | O(degree(u)) — read the list |
| Add an edge | O(1) | O(1) |
| Best for | Dense graphs (edges ≈V2) | Sparse graphs (edges ≪V2) |
The decision rests on density. A dense graph (most possible edges present) wastes little in a matrix and gains O(1) edge tests, so the matrix wins. A sparse graph — the common real-world case, e.g. a social network where each person knows only hundreds of millions — would waste vast memory as a matrix (O(V2) for a near-empty grid), so the list's O(V+E) is dramatically better. Most large practical graphs are sparse, which is why adjacency lists dominate in practice.
Worked space comparison. A social network of V=10,000 users where each has on average 50 friends has E≈250,000 undirected edges. A matrix needs 10,0002=108 cells; a list needs ≈10,000+500,000=510,000 entries — roughly 200 times less memory. This concrete contrast is exactly what a "justify your choice of representation" question rewards.
Exam Tip: You must be able to draw both representations for a given graph and justify a choice. The headline trade-off: matrices give O(1) edge look-up and suit dense graphs; lists give O(V+E) space and suit sparse graphs. Note the matrix of an undirected graph is symmetric.
Visiting every reachable vertex requires a traversal, and there are two — but with a twist not present in trees. Because a graph can contain cycles, a traversal must record which vertices it has already seen in a visited set; otherwise it would loop forever. This visited-set is the single most important difference between graph traversal and the tree traversals of §4.2.7.
BFS explores outward in rings: all vertices one edge away, then two edges away, and so on. It uses a queue (FIFO), so the earliest-discovered vertices are expanded first.
from collections import deque
def bfs(graph, start):
visited = {start} # mark start as seen (avoids revisiting in cycles)
queue = deque([start])
result = []
while queue:
vertex = queue.popleft() # FIFO: explore in discovery order
result.append(vertex)
for neighbour, _ in graph[vertex]:
if neighbour not in visited:
visited.add(neighbour)
queue.append(neighbour)
return result
Tracing BFS from A on the example graph: visit A, enqueue B, C; visit B, enqueue D; visit C (D already seen); visit D. Order: A, B, C, D. Because BFS expands by distance, it finds the shortest path in an unweighted graph (fewest edges) — a key examinable property.
DFS plunges as deep as possible along one branch before backtracking. It uses a stack (LIFO), or equivalently recursion via the call stack.
def dfs(graph, start):
visited = set()
stack = [start]
result = []
while stack:
vertex = stack.pop() # LIFO: dive into the most recent vertex
if vertex not in visited:
visited.add(vertex)
result.append(vertex)
for neighbour, _ in graph[vertex]:
if neighbour not in visited:
stack.append(neighbour)
return result
The two traversals are clearest when you write out the queue/stack at each step. Use this small graph with adjacency list A:[B,C], B:[A,D,E], C:[A,F], D:[B], E:[B,F], F:[C,E], starting from A (neighbours processed in listed order):
flowchart LR
A((A)) --- B((B))
A --- C((C))
B --- D((D))
B --- E((E))
C --- F((F))
E --- F
BFS (queue, front on the left):
| Step | Dequeue & visit | Enqueue (unseen neighbours) | Queue after |
|---|---|---|---|
| 1 | A | B, C | [B, C] |
| 2 | B | D, E | [C, D, E] |
| 3 | C | F | [D, E, F] |
| 4 | D | — | [E, F] |
| 5 | E | (F already seen) | [F] |
| 6 | F | — | [] |
BFS visit order: A, B, C, D, E, F — strictly by distance from A (A at 0; B, C at 1; D, E, F at 2).
DFS (stack, top on the right):
| Step | Pop & visit | Push (unseen neighbours) | Stack after |
|---|---|---|---|
| 1 | A | B, C | [B, C] |
| 2 | C | F | [B, F] |
| 3 | F | E | [B, E] |
| 4 | E | B already pushed | [B, B] |
| 5 | B | D | [B, D] |
| 6 | D | — | [B] |
DFS visit order: A, C, F, E, B, D — diving deep down one branch (A→C→F→E) before unwinding. The two orders differ entirely despite the same graph, purely because BFS's queue serves the oldest frontier vertex while DFS's stack serves the newest. Both still require the visited set to avoid looping on the cycle B–E–F–C–A.
A common exam task is converting between the two representations. Given the adjacency matrix on the left (vertices A,B,C,D), read each row's non-zero columns to build the list on the right:
| Matrix row | Non-zero columns | Adjacency-list entry |
|---|---|---|
| A: 0 1 1 0 | B, C | A → B, C |
| B: 1 0 0 1 | A, D | B → A, D |
| C: 1 0 0 1 | A, D | C → A, D |
| D: 0 1 1 0 | B, C | D → B, C |
Going the other way, build a V×V grid of zeros and set [i][j]=1 for every neighbour j in vertex i's list. Note the matrix's symmetry confirms the graph is undirected, and counting the 1s and halving (8/2 = 4) gives the number of undirected edges — both quick checks worth stating.
| Feature | BFS | DFS |
|---|---|---|
| Structure used | Queue (FIFO) | Stack / recursion (LIFO) |
| Exploration order | Ring by ring (by distance) | Branch by branch (deep first) |
| Shortest path (unweighted)? | Yes — fewest edges | No |
| Memory | Can be high (stores a whole frontier) | Lower (stores the current path) |
| Typical uses | Shortest unweighted path, "degrees of separation" | Cycle detection, topological sort, maze/path finding |
The queue-versus-stack distinction is the crux: swapping a BFS's queue for a stack turns it into a DFS. Both run in O(V+E) on an adjacency list.
BFS finds the path with the fewest edges, but on a weighted graph the cheapest route may take more edges (a short motorway detour beating a direct slow road). Finding the minimum-weight path is the job of Dijkstra's algorithm, studied in §4.3: it generalises BFS by always expanding the closest-so-far unvisited vertex (using a priority queue) rather than the earliest-discovered. You are not expected to run Dijkstra here, but you should recognise that it is the weighted counterpart of BFS and that the graph representations above are exactly what it operates on.
Subscribe to continue reading
Get full access to this lesson and all 10 lessons in this course.