You are viewing a free preview of this lesson.
Subscribe to unlock all 10 lessons in this course and every other course on LearningBro.
This lesson covers the binary search tree (BST) — a binary tree that adds a single, powerful ordering rule, turning the unordered O(n) search of a plain binary tree into an O(logn) search while keeping the data permanently sorted. The BST is the structure behind ordered maps, sorted sets, and range queries, and it is examined through three skills: building a tree from a sequence, tracing a search, and deleting a node across its three cases. Its great weakness — degeneration into a linked list when fed sorted data — is equally examinable.
This lesson addresses AQA A-Level Computer Science (7517), section 4.2.7 (Trees / binary search trees) together with the traversal content of §4.3.2. It covers the BST ordering property, recursive search and insertion, the four traversals applied to a BST (the three depth-first orders plus breadth-first), the three-case deletion algorithm using the in-order successor, the O(logn)-versus-O(n) performance split driven by balance, and a comparison against sorted arrays and hash tables. It builds directly on §4.2.7 (binary trees) and §4.1.1 (recursion), and contrasts with §4.2.5 (hash tables) on the ordered-versus-unordered axis. All wording below is original; AQA's published specification text is not reproduced.
A binary search tree is a binary tree in which, for every node N:
This rule is recursive: it must hold at every node, not merely at the root. It is exactly that invariant which lets a search discard half the remaining tree at each step, because comparing the target with a node tells you which entire subtree it could possibly lie in.
flowchart TB
N8((8)) --> N3((3))
N8 --> N10((10))
N3 --> N1((1))
N3 --> N6((6))
N10 --> N14((14))
N6 --> N4((4))
N6 --> N7((7))
N14 --> N13((13))
Checking the property at the root (8): every left-subtree value (1, 3, 4, 6, 7) is less than 8, and every right-subtree value (10, 13, 14) is greater than 8 — and the same holds recursively at 3, 6, 10, and 14. (We assume distinct keys; duplicates are usually disallowed or sent consistently to one side.)
Search exploits the ordering directly:
FUNCTION search(node, target) RETURNS BOOLEAN
IF node = NULL THEN
RETURN FALSE
ELSE IF target = node.data THEN
RETURN TRUE
ELSE IF target < node.data THEN
RETURN search(node.left, target)
ELSE
RETURN search(node.right, target)
END IF
END FUNCTION
def search(node, target) -> bool:
if node is None:
return False
if target == node.data:
return True
elif target < node.data:
return search(node.left, target) # target can only be in the left subtree
else:
return search(node.right, target) # target can only be in the right subtree
Searching for 6 in the tree above:
| Step | At node | Comparison | Move |
|---|---|---|---|
| 1 | 8 | 6 < 8 | go left to 3 |
| 2 | 3 | 6 > 3 | go right to 6 |
| 3 | 6 | 6 = 6 | found |
Three comparisons locate the value in an eight-node tree — the path length equals the node's depth. Searching for an absent value, say 5, follows 8 → 3 → 6 → 4 → null and returns false at the null pointer. Each comparison halves the search space when the tree is balanced, which is the source of the O(logn) behaviour.
Insertion follows the same path a search would take, and places the new node at the null pointer where the search "falls off" the tree:
FUNCTION insert(node, value) RETURNS NODE
IF node = NULL THEN
RETURN NEW TreeNode(value) // found the empty slot
END IF
IF value < node.data THEN
node.left = insert(node.left, value)
ELSE IF value > node.data THEN
node.right = insert(node.right, value)
END IF
RETURN node
END FUNCTION
def insert(node, value):
if node is None:
return TreeNode(value)
if value < node.data:
node.left = insert(node.left, value)
elif value > node.data:
node.right = insert(node.right, value)
return node
Insert the sequence [8, 3, 10, 1, 6, 14, 4, 7, 13] into an empty tree. Each row gives the decision path taken from the root:
| Step | Value | Path from root | Placed as |
|---|---|---|---|
| 1 | 8 | — | root |
| 2 | 3 | 3 < 8 → left | left child of 8 |
| 3 | 10 | 10 > 8 → right | right child of 8 |
| 4 | 1 | 1 < 8 → left, 1 < 3 → left | left child of 3 |
| 5 | 6 | 6 < 8 → left, 6 > 3 → right | right child of 3 |
| 6 | 14 | 14 > 8 → right, 14 > 10 → right | right child of 10 |
| 7 | 4 | 4 < 8, 4 > 3, 4 < 6 | left child of 6 |
| 8 | 7 | 7 < 8, 7 > 3, 7 > 6 | right child of 6 |
| 9 | 13 | 13 > 8, 13 > 10, 13 < 14 | left child of 14 |
The completed tree is exactly the one drawn earlier. Insertion order determines tree shape: this sequence happened to produce a reasonably balanced tree, but inserting the same values in sorted order would produce a degenerate chain (see below). Being able to draw the tree after a given insertion sequence is a guaranteed exam skill.
All four traversals from the previous lesson apply, but on a BST the in-order traversal acquires special meaning. For the tree above:
| Traversal | Order rule | Output |
|---|---|---|
| Pre-order | Root, Left, Right | 8, 3, 1, 6, 4, 7, 10, 14, 13 |
| In-order | Left, Root, Right | 1, 3, 4, 6, 7, 8, 10, 13, 14 |
| Post-order | Left, Right, Root | 1, 4, 7, 6, 3, 13, 14, 10, 8 |
| Breadth-first | Level by level (queue) | 8, 3, 10, 1, 6, 14, 4, 7, 13 |
The in-order output is fully sorted. This is a direct consequence of the ordering property: visiting the left subtree (all smaller) before the node, and the right subtree (all larger) after it, yields ascending order at every level. This gives a sorting method ("tree sort": insert all values, then read them in order) and makes range queries efficient — to list every key between 5 and 12 you traverse in-order and emit only nodes in range, pruning subtrees that cannot contain matches.
def in_order(node):
if node is not None:
in_order(node.left) # all smaller values first
print(node.data, end=" ") # then this node
in_order(node.right) # then all larger values
Deletion is the most intricate operation because the tree must remain a valid BST afterwards. There are three cases, in increasing difficulty.
The node has no children: simply set its parent's pointer to null. Deleting 7 (a leaf) just removes it.
Replace the node with its single child (the subtree "promotes" upward). Deleting 14, whose only child is 13, makes 13 the right child of 10 — the ordering still holds because the whole promoted subtree was already on the correct side.
You cannot simply remove it without orphaning a subtree. Instead, replace its value with its in-order successor — the smallest value in its right subtree (equivalently, the in-order predecessor, the largest in the left subtree) — then delete that successor (which, being the leftmost node, has at most one child, reducing to Case 1 or 2). The successor is chosen because it is the next value in sorted order, so substituting it preserves the BST property.
Worked example — delete the root, 8 (two children). Its right subtree is rooted at 10; the smallest value there is the leftmost node, which is 10 itself (it has no left child). Copy 10 into the root, then delete the original 10, whose right child 14 promotes. The tree's in-order output afterwards is 1, 3, 4, 6, 7, 10, 13, 14 — still sorted, confirming the operation was correct.
def find_min(node):
current = node
while current.left is not None: # leftmost node = smallest value
current = current.left
return current
def delete(node, value):
if node is None:
return None
if value < node.data:
node.left = delete(node.left, value)
elif value > node.data:
node.right = delete(node.right, value)
else: # found the node to delete
if node.left is None: # Case 1/2: no left child
return node.right
elif node.right is None: # Case 2: no right child
return node.left
successor = find_min(node.right) # Case 3: in-order successor
node.data = successor.data # copy successor value up
node.right = delete(node.right, successor.data) # remove successor
return node
Exam Tip: Deletion is commonly examined. State which of the three cases applies, and for the two-child case always spell out the steps: find the in-order successor (smallest in the right subtree), copy its value into the node being deleted, then delete that successor node.
Take the BST below and delete 3, which has two children (1 and 6):
flowchart TB
subgraph Before["Before: delete 3 (two children)"]
A8((8)) --> A3((3))
A8 --> A10((10))
A3 --> A1((1))
A3 --> A6((6))
A6 --> A4((4))
A6 --> A7((7))
end
subgraph After["After: 4 (in-order successor) replaces 3"]
B8((8)) --> B4((4))
B8 --> B10((10))
B4 --> B1((1))
B4 --> B6((6))
B6 --> B7((7))
end
Step by step: 3 has two children, so find its in-order successor — the smallest value in its right subtree. The right subtree is rooted at 6; its leftmost node is 4 (4 has no left child). Copy 4 up into the node currently holding 3, then delete the original 4, which was a leaf (Case 1) and is simply removed. The result keeps the BST property: an in-order traversal afterwards gives 1, 4, 6, 7, 8, 10 — still ascending, confirming correctness. The successor 4 is chosen precisely because it is the next value after 3 in sorted order, so it is guaranteed to be larger than everything remaining on the left (just 1) and smaller than everything on the right (6, 7), making it a valid replacement.
Contrast the simpler cases: deleting the leaf 7 just nulls its parent's pointer; deleting a one-child node such as a hypothetical node whose only child is a subtree promotes that child directly. Being able to state the case and draw the before/after is exactly what the higher-mark deletion questions require.
| Operation | Average case (balanced) | Worst case (degenerate) |
|---|---|---|
| Search | O(logn) | O(n) |
| Insert | O(logn) | O(n) |
| Delete | O(logn) | O(n) |
| In-order traversal | O(n) | O(n) |
Every operation's cost equals the height of the tree. A balanced tree has height ≈log2n, so each comparison eliminates roughly half the remaining nodes — the tree-based equivalent of binary search. A degenerate (skewed) tree has height n−1, so operations degrade to a linear scan.
If values are inserted in already-sorted order, every new value is larger than all before it and is placed to the right, producing a one-sided chain:
flowchart TB
subgraph Balanced["Balanced — height ~ log n, O(log n)"]
B8((8)) --> B3((3))
B8 --> B10((10))
B3 --> B1((1))
B3 --> B6((6))
B10 --> B14((14))
end
subgraph Degenerate["Degenerate — height n-1, O(n)"]
D1((1)) --> D3((3))
D3 --> D6((6))
D6 --> D8((8))
end
This is the central caveat of the plain BST: its attractive O(logn) is only an average that assumes random-ish insertion order; adversarial or sorted input destroys it. The fix is a self-balancing tree.
Exam Tip: Whenever you quote BST performance, give both the average O(logn) and the worst O(n), and state the trigger for the worst case: inserting values in sorted (or reverse-sorted) order, which creates a degenerate tree.
| Type | Height | Effect on operations |
|---|---|---|
| Balanced | O(logn) | search/insert/delete all O(logn) |
| Degenerate (skewed) | O(n) | all operations degrade to O(n) |
Self-balancing BSTs — notably AVL trees and Red-Black trees — restore balance automatically after each insertion or deletion by performing rotations (local re-arrangements that reduce height without breaking the ordering). They guarantee O(logn) regardless of insertion order, at the cost of extra bookkeeping (a balance factor or colour bit per node). This guarantee is exactly why production ordered maps (such as C++'s std::map or Java's TreeMap) are built on balanced trees rather than plain BSTs.
| Feature | BST (balanced) | Sorted array | Hash table |
|---|---|---|---|
| Search | O(logn) | O(logn) (binary search) | O(1) average |
| Insert | O(logn) | O(n) (shift to keep sorted) | O(1) average |
| Delete | O(logn) | O(n) (shift to close gap) | O(1) average |
| Sorted traversal | O(n) in-order | O(n) iterate | O(nlogn) must sort first |
| Min / max / range / successor | Yes, efficient | Yes | No |
Subscribe to continue reading
Get full access to this lesson and all 10 lessons in this course.