You are viewing a free preview of this lesson.
Subscribe to unlock all 10 lessons in this course and every other course on LearningBro.
A graph is the most general structure in this course. A tree, met earlier, is a restricted graph — connected, hierarchical, with no cycles and exactly one path between any two nodes. Remove those restrictions and you have a graph: any vertex may connect to any other, connections may be one-way or two-way, they may carry costs, and paths may loop. This generality is exactly why graphs model so much of the real world — friendships, road maps, web links, dependencies between tasks. This lesson is about the structure: the terminology that describes a graph, and the two standard ways to represent one in memory — the adjacency matrix and the adjacency list — with a careful comparison of their space and time trade-offs. The famous traversal and shortest-path algorithms (breadth-first, depth-first, Dijkstra's, A*) live in the algorithms course; this lesson builds the structure they all walk over and cross-references them rather than re-deriving them.
This lesson addresses the H446 1.4.2 Data Structures content on graphs:
(Phrasing here paraphrases the specification content; it is not a verbatim quote.)
Formally, a graph G is a pair G=(V,E), where V is a set of vertices (also called nodes) and E is a set of edges (also called arcs), each edge joining a pair of vertices. That is the whole definition — there is no root, no required hierarchy, and no ban on cycles. The contrast with a tree is the clearest way to fix the idea:
| Property | Tree | Graph |
|---|---|---|
| Root | exactly one | none |
| Cycles | never | allowed |
| Paths between two nodes | exactly one | zero, one or many |
| Edge direction | parent → child only | directed or undirected |
| Term | Definition |
|---|---|
| Vertex (node) | A data point in the graph. |
| Edge (arc) | A connection between two vertices. |
| Adjacent | Two vertices joined directly by an edge are adjacent (neighbours). |
| Degree | The number of edges incident on a vertex. |
| Path | A sequence of vertices each joined to the next by an edge. |
| Cycle | A path that starts and ends at the same vertex. |
| Connected graph | A path exists between every pair of vertices. |
| Complete graph | Every vertex is joined to every other vertex. |
| Sparse graph | Few edges relative to the number possible ($ |
| Dense graph | Many edges, approaching the maximum ($ |
The sparse/dense distinction is not just terminology — it is the single fact that decides which representation to choose, so it is worth fixing now. A complete graph on n vertices has 2n(n−1) undirected edges, which grows like n2; a sparse graph (a social network, a road map) has far fewer, typically growing only linearly with n.
Most of these terms can be read directly from a diagram, and exam questions routinely ask you to do so. Consider this small undirected graph:
flowchart LR
P((P)) --- Q((Q))
P --- R((R))
Q --- R((R))
R --- S((S))
Being able to extract degree, paths, cycles and connectivity from a diagram — and to spot that a single cycle disqualifies a structure from being a tree — is assumed by almost every graph question.
In an undirected graph an edge joins two vertices symmetrically — if A is joined to B then B is joined to A (friendship: if A is friends with B, B is friends with A). In a directed graph (a digraph) each edge has a direction, a start and an end — A → B does not imply B → A (web links: page A may link to B without B linking back). In a digraph each vertex has an in-degree (edges arriving) and an out-degree (edges leaving).
flowchart LR
subgraph Undirected
UA((A)) --- UB((B))
UA --- UC((C))
UB --- UD((D))
UC --- UD
end
subgraph Directed
DA((A)) --> DB((B))
DA --> DC((C))
DC --> DD((D))
DD --> DB
end
An unweighted graph records only whether two vertices are connected; a weighted graph attaches a numeric weight (cost, distance, time, capacity) to each edge. Weights are what turn a graph into a model you can optimise over — the shortest route on a road map, the cheapest set of flights — and they are the reason Dijkstra's and A* (algorithms course) exist.
flowchart LR
A((A)) -- "5" --- B((B))
A -- "3" --- C((C))
B -- "2" --- D((D))
C -- "7" --- D
Choosing the right kind of graph for a scenario is itself examinable: friendships are undirected and unweighted; "who follows whom" on social media is directed and unweighted; a road network is undirected and weighted (distance); flight routes are directed and weighted (cost or time).
A graph diagram is for humans; a program needs the same information as data. The two standard representations are the adjacency matrix and the adjacency list, and almost every graph exam question turns on being able to produce both and compare them.
An adjacency matrix is a ∣V∣×∣V∣ two-dimensional array where cell [i][j] records the edge from vertex i to vertex j:
Vertices A(0), B(1), C(2), D(3); edges A–B, A–C, B–D, C–D:
| A | B | C | D | |
|---|---|---|---|---|
| A | 0 | 1 | 1 | 0 |
| B | 1 | 0 | 0 | 1 |
| C | 1 | 0 | 0 | 1 |
| D | 0 | 1 | 1 | 0 |
For an undirected graph the matrix is always symmetric — [i][j]=[j][i] — because each edge is recorded twice (A–B appears at both [A][B] and [B][A]). The leading diagonal is all zeros unless the graph permits self-loops.
The same four vertices with edge weights A–B = 5, A–C = 3, B–D = 2, C–D = 7:
| A | B | C | D | |
|---|---|---|---|---|
| A | 0 | 5 | 3 | 0 |
| B | 5 | 0 | 0 | 2 |
| C | 3 | 0 | 0 | 7 |
| D | 0 | 2 | 7 | 0 |
For a directed graph the matrix is not generally symmetric: row i holds the out-edges of vertex i. Edges A → B, A → C, C → D, D → B:
| A | B | C | D | |
|---|---|---|---|---|
| A | 0 | 1 | 1 | 0 |
| B | 0 | 0 | 0 | 0 |
| C | 0 | 0 | 0 | 1 |
| D | 0 | 1 | 0 | 0 |
Reading the matrix: row B is all zeros (B has no out-edges), and the asymmetry between [A][B]=1 and [B][A]=0 records that A → B is one-way.
# Adjacency matrix for the weighted undirected graph above
vertices = ['A', 'B', 'C', 'D']
matrix = [
[0, 5, 3, 0], # A
[5, 0, 0, 2], # B
[3, 0, 0, 7], # C
[0, 2, 7, 0], # D
]
# Edge lookup is O(1) — a single array access
if matrix[0][1] != 0:
print("Edge A-B exists, weight", matrix[0][1])
# Finding A's neighbours scans the whole row — O(V)
for j in range(len(vertices)):
if matrix[0][j] != 0:
print("A ->", vertices[j], "weight", matrix[0][j])
An adjacency list stores, for each vertex, a list of just its neighbours (with weights if weighted). Empty rows of the matrix simply do not appear, which is where the space saving on sparse graphs comes from.
| Vertex | Neighbours |
|---|---|
| A | B, C |
| B | A, D |
| C | A, D |
| D | B, C |
| Vertex | (neighbour, weight) |
|---|---|
| A | (B, 5), (C, 3) |
| B | (A, 5), (D, 2) |
| C | (A, 3), (D, 7) |
| D | (B, 2), (C, 7) |
# Adjacency list as a dictionary of lists
graph = {
'A': [('B', 5), ('C', 3)],
'B': [('A', 5), ('D', 2)],
'C': [('A', 3), ('D', 7)],
'D': [('B', 2), ('C', 7)],
}
# Iterating A's neighbours is O(degree) — touches only real edges
for neighbour, weight in graph['A']:
print("A ->", neighbour, "weight", weight)
# Edge lookup must scan A's list — O(degree), not O(1)
def has_edge(g, u, v):
return any(n == v for n, _ in g[u])
For a directed graph the adjacency list stores only each vertex's out-edges, which makes the asymmetry explicit. Taking the earlier digraph (A → B, A → C, C → D, D → B):
| Vertex | Out-neighbours |
|---|---|
| A | B, C |
| B | (none) |
| C | D |
| D | B |
Reading off the degrees here is instructive: B's list is empty, so its out-degree is 0, yet B appears as a target in two other lists (A's and D's), so its in-degree is 2. Unlike the undirected case, a directed edge is stored exactly once — in the start vertex's list only — which is precisely why the corresponding matrix is not symmetric. This single-versus-double storage of an edge is the structural fact behind every directed/undirected difference in both representations, and it is a favourite discriminator in exam mark schemes.
This comparison is the heart of the topic. Let V=∣V∣ be the number of vertices and E=∣E∣ the number of edges.
| Operation / property | Adjacency matrix | Adjacency list |
|---|---|---|
| Space | O(V2) | O(V+E) |
| Check if a specific edge exists | O(1) — one array access | O(degv) — scan the list |
| Find all neighbours of a vertex | O(V) — scan the whole row | O(degv) — scan only real edges |
| Add an edge | O(1) | O(1) |
| Add a vertex | O(V2) — rebuild a bigger matrix | O(1) — add an empty list |
| Best for | dense graphs | sparse graphs |
The whole trade-off rests on one number: the matrix always uses O(V2) space regardless of how many edges exist, because every cell exists whether or not it holds an edge. For a dense graph that is unavoidable anyway and the O(1) edge lookup is a real win. For a sparse graph it is wasteful: a social network of a million users where each has a few hundred friends would need a 1012-cell matrix that is almost entirely zeros, whereas an adjacency list stores only the few hundred million real edges, O(V+E).
Suppose V=1000. A matrix needs V2=106 cells whatever the edge count. An adjacency list needs space proportional to V+E:
The crossover is exactly the sparse/dense boundary: lists win on memory until the graph is dense enough that E approaches V2, at which point the matrix's constant-time lookup wins.
Subscribe to continue reading
Get full access to this lesson and all 10 lessons in this course.