You are viewing a free preview of this lesson.
Subscribe to unlock all 10 lessons in this course and every other course on LearningBro.
A binary tree is the first non-linear structure in this course: instead of a single line of elements, data branches out hierarchically, each node leading to up to two children. This lesson concentrates on what a binary tree is, the vocabulary that describes its shape, and how a binary search tree organises data — the structure. The algorithms that search, insert into, delete from and traverse a tree are developed in the next lesson (tree-operations) and in the algorithms course; here we build the foundation they all rest on, and cross-reference them rather than duplicating their derivations.
This lesson addresses the H446 1.4.2 Data Structures content on trees:
(Phrasing here paraphrases the specification content; it is not a verbatim quote.)
A tree is a non-linear, hierarchical data structure of nodes joined by edges. Unlike the linear structures so far — arrays and linked lists lay data in a single sequence — a tree branches, so one node can lead to several others. This branching is what lets trees model naturally hierarchical information (a file system, an organisation chart, the grammar of a sentence) and what gives the best tree algorithms their characteristic O(logn) efficiency, because following one branch discards all the others.
Formally a tree is a set of nodes with exactly one root and no cycles: every non-root node has precisely one parent, and there is exactly one path from the root to any node. That "one parent, no cycles" rule is what distinguishes a tree from the more general graph (covered later), where a node may have many predecessors and paths may loop.
| Term | Definition |
|---|---|
| Node | An element of the tree holding data (and links to children). |
| Root | The single topmost node; it has no parent. |
| Edge | A link joining a parent to one of its children. |
| Parent | A node with one or more children. |
| Child | A node directly below another (its parent). |
| Leaf | A node with no children (also called a terminal/external node). |
| Internal node | A node with at least one child (non-leaf). |
| Subtree | Any node together with all of its descendants. |
| Sibling | Nodes sharing the same parent. |
| Depth (of a node) | Number of edges from the root down to that node (root depth 0). |
| Height (of the tree) | Number of edges on the longest root-to-leaf path. |
| Level | All nodes at the same depth (root is level 0). |
The pair that causes the most confusion is depth and height. Depth is measured downwards from the root to a particular node; height is the longest such path in the whole tree (or from a given node down to its deepest leaf). The root's depth is always 0; the tree's height equals the depth of its deepest leaf. Being precise about edges-versus-nodes here matters — this course counts edges, so a single-node tree has height 0.
The vocabulary is deliberately borrowed from family trees and botany, and the metaphors are worth leaning on because they make the relationships intuitive. A parent sits one level above its children; siblings share a parent; leaf nodes are the "tips" of the tree with nothing below them; and a subtree is any node viewed as a root in its own right, together with everything hanging beneath it. This last idea — that every node is the root of its own subtree — is the single most useful way to think about trees, because it is what makes them naturally recursive: a binary tree is either empty, or a root node with a left subtree and a right subtree, each of which is itself a binary tree. That recursive definition is exactly why the operations on trees are so cleanly expressed with recursion, and why the call stack (from the stacks lesson) is the structure quietly powering them. Notice, too, that one node can be described several ways at once: in the example BST below, the node 3 is simultaneously a child of 8, the parent of 1 and 6, a sibling of 10, and the root of the subtree {3, 1, 6, 4, 7}. Fluency with this vocabulary is assumed by every later tree question, so it is worth over-learning.
A binary tree is a tree in which every node has at most two children, distinguished as the left child and the right child. "At most two" means a node may have zero, one or two children, and crucially the left/right distinction is significant — a node with only a right child is a different tree from one with only a left child, even though both have one child.
flowchart TD
Root --> Left
Root --> Right
Left --> LL["left-left"]
Left --> LR["left-right"]
Right --> RR["right-right"]
A tree node is simply a record with two child pointers — the same record-plus-pointer idea from the linked-lists lesson, but with two links instead of one:
record TreeNode
data : integer
left : TreeNode // pointer to the left child, or null
right : TreeNode // pointer to the right child, or null
endrecord
class TreeNode:
def __init__(self, data):
self.data = data
self.left = None # null until a left child is attached
self.right = None # null until a right child is attached
Seeing the node as "a linked-list node with a second pointer" is the key conceptual bridge: a linked list is a degenerate tree where every node has only a right child, and a tree is a linked list that is allowed to branch. Both are dynamic structures built from heap-allocated nodes joined by pointers, so a tree grows and shrinks at runtime exactly as a linked list does. The same warnings about pointers therefore apply: an empty tree is represented by a null root, every operation must handle that empty case, and the leaves are exactly the nodes whose left and right pointers are both null. Keeping the null-pointer discipline from the linked-lists lesson in mind avoids the most common run-time errors when working with trees.
The same set of values can form very differently shaped binary trees, and the shape determines efficiency. A few standard descriptions:
| Shape | Meaning |
|---|---|
| Balanced | The left and right subtrees of every node differ in height by at most a small constant, so the tree is "bushy" and short. |
| Complete | Every level is full except possibly the last, which is filled left-to-right. |
| Full | Every node has either 0 or 2 children (never exactly 1). |
| Degenerate (skewed) | Each node has only one child, so the tree collapses into a single line — effectively a linked list. |
This matters enormously: a balanced binary tree of n nodes has height about log2n, whereas a degenerate one has height n−1. Since the cost of searching, inserting and deleting in a tree is proportional to its height, a balanced tree gives O(logn) operations while a degenerate one gives O(n) — no better than a linked list. The shape, not just the contents, decides performance.
It is worth seeing why a balanced tree's height grows so slowly. Each level of a binary tree can hold up to twice as many nodes as the level above: level 0 has 1 node (the root), level 1 up to 2, level 2 up to 4, and in general level k up to 2k nodes. A perfectly balanced tree of height h therefore holds up to 2h+1−1 nodes — so n nodes fit in a height of only about log2n. Doubling the number of items adds just one extra level. This exponential fan-out is the source of the tree's power: a balanced tree of a million nodes is only about twenty levels deep, so a search touches only around twenty nodes rather than a million. The contrast with the degenerate tree could not be sharper — there, every "level" holds a single node, the fan-out is wasted, and the structure is a linked list wearing a tree's clothes. Recognising that the whole benefit of a tree depends on staying bushy is the central structural insight of the topic, and it is what motivates the self-balancing trees met later.
A general binary tree imposes no rule on where values go. A binary search tree (BST) adds an ordering property that turns the structure into a searchable index:
That single invariant is what makes a BST useful: at each node you can discard an entire subtree with one comparison — if the value you want is smaller than the current node, it can only be on the left, so the whole right subtree is eliminated. In a balanced BST this halving at every step is what delivers O(logn) search, exactly mirroring binary search on a sorted array (a link explored in the algorithms course). The BST is, in effect, "binary search made into a data structure".
Inserting 8, 3, 10, 1, 6, 14, 4, 7, 13 in that order produces:
flowchart TB
N8((8)) --> N3((3))
N8 --> N10((10))
N3 --> N1((1))
N3 --> N6((6))
N10 --> N14((14))
N6 --> N4((4))
N6 --> N7((7))
N14 --> N13((13))
Each new value starts at the root and moves left if smaller, right if larger, until it reaches an empty (null) position, where it is inserted as a new leaf.
| Step | Value | Path taken | Inserted as |
|---|---|---|---|
| 1 | 8 | (tree empty) | root |
| 2 | 3 | 3 < 8 → left | left child of 8 |
| 3 | 10 | 10 > 8 → right | right child of 8 |
| 4 | 1 | 1 < 8 → left (3); 1 < 3 → left | left child of 3 |
| 5 | 6 | 6 < 8 → left (3); 6 > 3 → right | right child of 3 |
| 6 | 14 | 14 > 8 → right (10); 14 > 10 → right | right child of 10 |
| 7 | 4 | left (3); right (6); 4 < 6 → left | left child of 6 |
| 8 | 7 | left (3); right (6); 7 > 6 → right | right child of 6 |
| 9 | 13 | right (10); right (14); 13 < 14 → left | left child of 14 |
The order of insertion determines the shape: the same nine values inserted in sorted order (1, 3, 4, 6, …) would produce a degenerate right-leaning chain instead of the bushy tree above. This is why insertion order, not just the value set, is central to BST behaviour.
The ordering property is not just decoration — it is what lets the structure itself guide a search, and seeing this makes the O(logn) claim concrete. Suppose we look for 7 in the tree above. We start at the root, 8: since 7<8, the value can only be in the left subtree, so we move to 3 and discard the entire right half of the tree in one comparison. At 3, 7>3, so we go right to 6, discarding 3's left subtree. At 6, 7>6, so we go right — and find 7. Three comparisons located the value, and at each step roughly half the remaining nodes were eliminated. That halving-per-comparison is exactly the behaviour of binary search on a sorted array, which is why a balanced BST searches in O(logn). The crucial structural point — independent of any particular algorithm — is that the BST property turns the shape of the tree into a decision procedure: each node's value is a signpost telling the searcher which single subtree could possibly hold the target. (The formal search algorithm that walks these signposts is in the tree-operations lesson.)
Exam Tip: Building a BST from a value list is a near-certain exam task. Insert one value at a time, each time starting at the root and applying "smaller → left, larger → right" until you hit an empty slot. Re-draw or annotate the tree after each insertion so your working is visible.
A traversal is a systematic visit of every node. You should recognise the four standard orders and what each is for; the detailed recursive/iterative algorithms that produce them are developed in the tree-operations lesson and the algorithms course, so they are summarised — not derived — here.
Subscribe to continue reading
Get full access to this lesson and all 10 lessons in this course.