Graphs

This lesson covers the graph — the most general data structure in the course, modelling arbitrary relationships between objects with no hierarchy and no restriction on connections. Where a tree forces one parent per node and no cycles, a graph allows any vertex to connect to any other, making it the natural model for road networks, social networks, the web, dependency systems, and state spaces. The examinable core is the terminology, the two ways to represent a graph in memory (adjacency matrix versus adjacency list) with their space/time trade-offs, and the link to the traversal and shortest-path algorithms of §4.3.

Spec Mapping

This lesson addresses AQA A-Level Computer Science (7517), section 4.2.8 (Graphs), and connects forward to §4.3.2 (graph traversal — BFS and DFS) and §4.3 (Dijkstra's shortest-path algorithm). It covers the formal definition $G = (V, E)$ , the directed/undirected and weighted/unweighted distinctions, the full vocabulary (vertex, edge, adjacent, degree, path, cycle, connected), the adjacency matrix and adjacency list representations with worked conversions, and an introduction to the breadth-first and depth-first traversals that the algorithms unit develops. It generalises §4.2.7 (a tree is a restricted graph) and reuses §4.2.3/§4.2.4 (the queue and stack that drive the traversals). All wording below is original; AQA's published specification text is not reproduced.

What is a Graph?

A graph is defined as a pair of sets:

$G = (V, E)$

where $V$ is a set of vertices (nodes) and $E$ is a set of edges, each edge joining a pair of vertices. That is the entire definition — and its generality is the point. Unlike a tree, a graph imposes no root, no parent–child hierarchy, and crucially permits cycles (paths that return to their start). A tree is in fact just a special graph: connected, acyclic, with a single distinguished root. Removing those restrictions gives the graph its power to model mutual, many-to-many relationships.

Key Terminology

Term	Definition
Vertex (node)	A single object in the graph (e.g. a city, a person, a web page).
Edge (arc)	A connection between two vertices (e.g. a road, a friendship, a hyperlink).
Adjacent	Two vertices are adjacent (neighbours) if an edge directly joins them.
Degree	The number of edges incident to a vertex. (In a directed graph: in-degree and out-degree.)
Path	A sequence of vertices connected by edges.
Cycle	A path that starts and ends at the same vertex.
Connected	A graph in which there is a path between every pair of vertices.
Weight	A numeric cost attached to an edge (distance, time, price).
Loop	An edge from a vertex to itself.

Types of Graph

Directed vs Undirected

Type	Description	Real-world example
Undirected	Edges have no direction; the relationship is mutual. If A–B exists, you can travel both ways.	Facebook friendships — if A is friends with B, B is friends with A.
Directed (digraph)	Edges have a direction; the relationship is one-way. A→B does not imply B→A.	Twitter/X "follows" — A following B does not mean B follows A.

flowchart LR
    subgraph Undirected
      UA((A)) --- UB((B))
      UA --- UC((C))
      UB --- UD((D))
      UC --- UD
    end
    subgraph Directed
      DA((A)) --> DB((B))
      DA --> DC((C))
      DC --> DD((D))
      DD --> DB
    end

Weighted vs Unweighted

Type	Description	Real-world example
Unweighted	All edges are equivalent; only connectivity matters.	"Are these two computers on the same network?"
Weighted	Each edge carries a numeric cost.	Road networks (distance), flight routes (price/time).

flowchart LR
    A((A)) -- "5" --- B((B))
    A -- "3" --- C((C))
    B -- "2" --- D((D))
    C -- "7" --- D

These two axes are independent, giving four combinations; a road map, for instance, is typically undirected and weighted, while a task-dependency chart is directed and unweighted. Identifying which kind a scenario calls for is a common opening to an exam question.

Representing a Graph in Memory

A graph diagram is for humans; a program needs a concrete representation. The two standard choices are the adjacency matrix and the adjacency list, and AQA expects fluency in both, including converting a diagram to each.

1. Adjacency Matrix

An adjacency matrix is a $V \times V$ 2D array (§4.2.1) in which cell $[i][j]$ records the edge from vertex $i$ to vertex $j$ : 1 (or the weight) if an edge exists, 0 if it does not.

Vertices: A(0), B(1), C(2), D(3) Edges: A–B, A–C, B–D, C–D (undirected, unweighted):

	A	B	C	D
A	0	1	1	0
B	1	0	0	1
C	1	0	0	1
D	0	1	1	0

Two structural facts to state in exams: for an undirected graph the matrix is symmetric about the main diagonal (because edge A–B implies B–A, so $[i][j] = [j][i]$ ); for a weighted graph the cells hold the weights instead of 1:

	A	B	C	D
A	0	5	3	0
B	5	0	0	2
C	3	0	0	7
D	0	2	7	0

For a directed graph the matrix is generally not symmetric — a row represents out-edges and a column in-edges.

# Adjacency matrix for the undirected weighted graph above
vertices = ['A', 'B', 'C', 'D']
matrix = [
    [0, 5, 3, 0],   # A's row: edges A-B (5), A-C (3)
    [5, 0, 0, 2],   # B
    [3, 0, 0, 7],   # C
    [0, 2, 7, 0],   # D
]

# Edge existence / weight between A (0) and B (1) is an O(1) lookup
if matrix[0][1] != 0:
    print(f"Edge A-B with weight {matrix[0][1]}")   # Edge A-B with weight 5

2. Adjacency List

An adjacency list stores, for each vertex, a list of just its neighbours (with weights if any). Vertices with few edges store little; absent edges occupy no space at all.

Vertex	Neighbours (vertex, weight)
A	(B, 5), (C, 3)
B	(A, 5), (D, 2)
C	(A, 3), (D, 7)
D	(B, 2), (C, 7)

# Adjacency list for the same undirected weighted graph
graph = {
    'A': [('B', 5), ('C', 3)],
    'B': [('A', 5), ('D', 2)],
    'C': [('A', 3), ('D', 7)],
    'D': [('B', 2), ('C', 7)],
}

for neighbour, weight in graph['A']:        # list A's neighbours
    print(f"A -> {neighbour} (weight {weight})")

Matrix vs List: The Central Trade-off

Feature	Adjacency matrix	Adjacency list
Space	$O(V^2)$ — every cell, edge or not	$O(V + E)$ — only real edges
Edge exists $u$ – $v$ ?	$O(1)$ — direct cell lookup	$O(\text{degree}(u))$ — scan $u$ 's list
Find all neighbours of $u$	$O(V)$ — scan the whole row	$O(\text{degree}(u))$ — read the list
Add an edge	$O(1)$	$O(1)$
Best for	Dense graphs (edges $\approx V^2$ )	Sparse graphs (edges $\ll V^2$ )

The decision rests on density. A dense graph (most possible edges present) wastes little in a matrix and gains $O(1)$ edge tests, so the matrix wins. A sparse graph — the common real-world case, e.g. a social network where each person knows only hundreds of millions — would waste vast memory as a matrix ( $O(V^2)$ for a near-empty grid), so the list's $O(V + E)$ is dramatically better. Most large practical graphs are sparse, which is why adjacency lists dominate in practice.

Worked space comparison. A social network of $V = 10{,}000$ users where each has on average 50 friends has $E \approx 250{,}000$ undirected edges. A matrix needs $10{,}000^2 = 10^8$ cells; a list needs $\approx 10{,}000 + 500{,}000 = 510{,}000$ entries — roughly 200 times less memory. This concrete contrast is exactly what a "justify your choice of representation" question rewards.

Exam Tip: You must be able to draw both representations for a given graph and justify a choice. The headline trade-off: matrices give $O(1)$ edge look-up and suit dense graphs; lists give $O(V+E)$ space and suit sparse graphs. Note the matrix of an undirected graph is symmetric.

Graph Traversals (link to §4.3.2)

Visiting every reachable vertex requires a traversal, and there are two — but with a twist not present in trees. Because a graph can contain cycles, a traversal must record which vertices it has already seen in a visited set; otherwise it would loop forever. This visited-set is the single most important difference between graph traversal and the tree traversals of §4.2.7.

Breadth-First Search (BFS)

BFS explores outward in rings: all vertices one edge away, then two edges away, and so on. It uses a queue (FIFO), so the earliest-discovered vertices are expanded first.

from collections import deque

def bfs(graph, start):
    visited = {start}                 # mark start as seen (avoids revisiting in cycles)
    queue = deque([start])
    result = []
    while queue:
        vertex = queue.popleft()      # FIFO: explore in discovery order
        result.append(vertex)
        for neighbour, _ in graph[vertex]:
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(neighbour)
    return result

Tracing BFS from A on the example graph: visit A, enqueue B, C; visit B, enqueue D; visit C (D already seen); visit D. Order: A, B, C, D. Because BFS expands by distance, it finds the shortest path in an unweighted graph (fewest edges) — a key examinable property.

Depth-First Search (DFS)

DFS plunges as deep as possible along one branch before backtracking. It uses a stack (LIFO), or equivalently recursion via the call stack.

def dfs(graph, start):
    visited = set()
    stack = [start]
    result = []
    while stack:
        vertex = stack.pop()          # LIFO: dive into the most recent vertex
        if vertex not in visited:
            visited.add(vertex)
            result.append(vertex)
            for neighbour, _ in graph[vertex]:
                if neighbour not in visited:
                    stack.append(neighbour)
    return result

Traced BFS and DFS, Showing the Frontier

The two traversals are clearest when you write out the queue/stack at each step. Use this small graph with adjacency list A:[B,C], B:[A,D,E], C:[A,F], D:[B], E:[B,F], F:[C,E], starting from A (neighbours processed in listed order):

flowchart LR
    A((A)) --- B((B))
    A --- C((C))
    B --- D((D))
    B --- E((E))
    C --- F((F))
    E --- F

BFS (queue, front on the left):

Step	Dequeue & visit	Enqueue (unseen neighbours)	Queue after
1	A	B, C	[B, C]
2	B	D, E	[C, D, E]
3	C	F	[D, E, F]
4	D	—	[E, F]
5	E	(F already seen)	[F]
6	F	—	[]

BFS visit order: A, B, C, D, E, F — strictly by distance from A (A at 0; B, C at 1; D, E, F at 2).

DFS (stack, top on the right):

Step	Pop & visit	Push (unseen neighbours)	Stack after
1	A	B, C	[B, C]
2	C	F	[B, F]
3	F	E	[B, E]
4	E	B already pushed	[B, B]
5	B	D	[B, D]
6	D	—	[B]

DFS visit order: A, C, F, E, B, D — diving deep down one branch (A→C→F→E) before unwinding. The two orders differ entirely despite the same graph, purely because BFS's queue serves the oldest frontier vertex while DFS's stack serves the newest. Both still require the visited set to avoid looping on the cycle B–E–F–C–A.

Worked Conversion: Matrix ↔ List

A common exam task is converting between the two representations. Given the adjacency matrix on the left (vertices A,B,C,D), read each row's non-zero columns to build the list on the right:

Matrix row	Non-zero columns	Adjacency-list entry
A: 0 1 1 0	B, C	A → B, C
B: 1 0 0 1	A, D	B → A, D
C: 1 0 0 1	A, D	C → A, D
D: 0 1 1 0	B, C	D → B, C

Going the other way, build a $V \times V$ grid of zeros and set $[i][j]=1$ for every neighbour $j$ in vertex $i$ 's list. Note the matrix's symmetry confirms the graph is undirected, and counting the 1s and halving (8/2 = 4) gives the number of undirected edges — both quick checks worth stating.

BFS vs DFS

Feature	BFS	DFS
Structure used	Queue (FIFO)	Stack / recursion (LIFO)
Exploration order	Ring by ring (by distance)	Branch by branch (deep first)
Shortest path (unweighted)?	Yes — fewest edges	No
Memory	Can be high (stores a whole frontier)	Lower (stores the current path)
Typical uses	Shortest unweighted path, "degrees of separation"	Cycle detection, topological sort, maze/path finding

The queue-versus-stack distinction is the crux: swapping a BFS's queue for a stack turns it into a DFS. Both run in $O(V + E)$ on an adjacency list.

Beyond Unweighted: Dijkstra (link to §4.3)

BFS finds the path with the fewest edges, but on a weighted graph the cheapest route may take more edges (a short motorway detour beating a direct slow road). Finding the minimum-weight path is the job of Dijkstra's algorithm, studied in §4.3: it generalises BFS by always expanding the closest-so-far unvisited vertex (using a priority queue) rather than the earliest-discovered. You are not expected to run Dijkstra here, but you should recognise that it is the weighted counterpart of BFS and that the graph representations above are exactly what it operates on.

Graphs

Graphs

Spec Mapping

What is a Graph?

Key Terminology

Types of Graph

Directed vs Undirected

Weighted vs Unweighted

Representing a Graph in Memory

1. Adjacency Matrix

2. Adjacency List

Matrix vs List: The Central Trade-off

Graph Traversals (link to §4.3.2)

Breadth-First Search (BFS)

Depth-First Search (DFS)

Traced BFS and DFS, Showing the Frontier

Worked Conversion: Matrix ↔ List

BFS vs DFS

Beyond Unweighted: Dijkstra (link to §4.3)

Applications of Graphs

More in Computer Science