You are viewing a free preview of this lesson.
Subscribe to unlock all 10 lessons in this course and every other course on LearningBro.
Recursion is one of the most elegant — and most exam-tested — ideas in programming: a subroutine that solves a problem by calling itself on a smaller version of the same problem, until the problem becomes so small it can be answered outright. It feels almost circular the first time you meet it, yet it underpins a huge swathe of computer science: the divide-and-conquer sorts (merge sort, quicksort), every tree and graph traversal, the parsing of nested structures, and the natural definition of mathematical sequences. Wherever a problem contains a smaller copy of itself, recursion is the most direct way to express the solution.
This lesson teaches you to read, write, and — crucially — trace recursive subroutines. You will pin down the two non-negotiable ingredients (a base case and a recursive case), watch the call stack grow and unwind through a mermaid diagram of stack frames, work through the classic examples (factorial, Fibonacci, and a recursive binary search), and weigh recursion honestly against iteration. The call stack is itself a stack data structure — last-in-first-out — taught in the Stacks lesson of the OCR Data Structures course; this lesson explains how recursion uses that stack, and cross-references it rather than re-teaching the structure.
This lesson supports the OCR H446 content on recursion within programming techniques (2.2) and its application in standard algorithms (2.3). Paraphrased, the assessable ideas are:
This is a paraphrase; always cross-check against the current OCR H446 specification.
Recursion is a technique in which a subroutine calls itself to solve a smaller instance of the same problem. Every correct recursive routine must contain two parts:
| Component | Role |
|---|---|
| Base case | A condition handled directly, with no further self-call. It stops the recursion. |
| Recursive (general) case | The routine calls itself on a smaller or simpler input, making progress towards a base case. |
Miss the base case and the routine never stops; fail to make the recursive case smaller and it never reaches the base case. Either way you get infinite recursion and, eventually, a crash. A useful mental test for any recursive routine is to ask two questions: "What is the simplest input, and is it handled without recursing?" (base case) and "Does every recursive call move strictly closer to that simplest input?" (progress).
A helpful way to trust a recursive routine without tracing every call is the "leap of faith" (formally, proof by induction): assume the routine already works correctly for all smaller inputs, then check just two things — that the base case is right, and that the recursive case correctly builds the answer for n out of the (assumed-correct) answer for the smaller input. If both hold, the routine is correct for every input by induction. For factorial: the base case 0!=1 is right, and if factorial(n-1) correctly returns (n−1)! then n * factorial(n-1) correctly returns n! — done. This is not just mathematical decoration; thinking inductively is how experienced programmers write correct recursion in the first place, rather than mentally simulating a deep stack of calls.
Think of a set of Russian nesting dolls (matryoshka). To find the solid innermost doll you open the outer doll (a recursive step), then the next, and the next — each step the same action on a smaller doll — until you reach a doll that will not open (the base case). The whole task is defined in terms of a smaller copy of itself, which is the essence of recursion. Two other everyday recursions are worth keeping in mind: holding a mirror up to another mirror produces an image-within-an-image-within-an-image (which would recurse forever — it has no base case, just as a recursive routine without one never stops), and looking up an unfamiliar word in a dictionary, only to find its definition uses another unfamiliar word that you must look up in turn, until you finally reach words you already know (the base case). These pictures make the abstract idea concrete before any code is involved.
# OCR-style pseudocode
function solve(problem)
if problem is simple enough then
return simple answer // BASE CASE
else
simplify the problem
return solve(simpler problem) // RECURSIVE CASE
endif
endfunction
The factorial of n (written n!) is the product of all integers from 1 to n. It has a naturally recursive definition:
0!=1(base case),n!=n×(n−1)!for n>0(recursive case)
# OCR-style pseudocode
function factorial(n)
if n == 0 then
return 1
else
return n * factorial(n - 1)
endif
endfunction
def factorial(n):
if n == 0:
return 1
else:
return n * factorial(n - 1)
print(factorial(5)) # Output: 120
The substitution below shows the winding phase (calls stacking up) followed by the unwinding phase (results multiplying back together):
factorial(5)=5×factorial(4)=5×4×factorial(3)=5×4×3×factorial(2)=5×4×3×2×factorial(1)=5×4×3×2×1×factorial(0)=5×4×3×2×1×1=120The key insight is that nothing is multiplied until the base case is reached: each call is suspended, waiting for the result of the call beneath it. Only when factorial(0) returns 1 does the chain of multiplications resolve, from the bottom of the stack upward.
When any subroutine calls another (recursion is just the special case of calling itself), the computer pushes a stack frame onto the call stack. Each frame stores:
Because the call stack is last-in-first-out (LIFO), the most recent call is always the next to complete — exactly the behaviour we need for recursion to unwind in the right order. (The stack as a data structure, with its push/pop operations and overflow behaviour, is taught in the Stacks lesson of the Data Structures course; here we focus on how recursion drives it.)
The diagram below shows the stack frames for factorial(3) at their deepest point — the moment the base case is about to return — with the most recent call at the top:
flowchart TB
F0["factorial(0) — base case, returns 1 (TOP, newest)"]
F1["factorial(1) — waiting on factorial(0)"]
F2["factorial(2) — waiting on factorial(1)"]
F3["factorial(3) — waiting on factorial(2) (BOTTOM, oldest)"]
F0 --> F1 --> F2 --> F3
Tracing the push/pop sequence for factorial(3):
| Step | Event | Call stack (top → bottom) |
|---|---|---|
| 1 | factorial(3) called | factorial(3) |
| 2 | factorial(3) calls factorial(2) | factorial(2), factorial(3) |
| 3 | factorial(2) calls factorial(1) | factorial(1), factorial(2), factorial(3) |
| 4 | factorial(1) calls factorial(0) | factorial(0), factorial(1), factorial(2), factorial(3) |
| 5 | factorial(0) returns 1 (pop) | factorial(1), factorial(2), factorial(3) |
| 6 | factorial(1) returns 1 × 1 = 1 (pop) | factorial(2), factorial(3) |
| 7 | factorial(2) returns 2 × 1 = 2 (pop) | factorial(3) |
| 8 | factorial(3) returns 3 × 2 = 6 (pop) | (empty) |
Steps 1–4 are the winding phase (frames pushed as calls descend); steps 5–8 are the unwinding phase (frames popped as results return). The deepest the stack ever gets — here four frames — is the recursion depth, and it is precisely what determines a recursive routine's space cost.
It is worth being explicit about why the return address matters, because it is the mechanism that makes recursion work at all. When factorial(2) calls factorial(1), the processor must remember that, once factorial(1) produces its result, execution should resume inside factorial(2) at the point of the multiplication 2 * (…). That "resume here" location is the return address, stored in factorial(2)'s frame. Each frame also keeps its own copy of the parameter n — factorial(2)'s frame holds n = 2 while factorial(1)'s holds n = 1 — which is why the calls do not clobber one another even though they share a name. Without per-call frames each storing their own locals and return address, the multiplications could not be reassembled in the correct order on the way back up. This is the deep reason recursion needs a stack: it is the only structure whose last-in-first-out discipline returns control to the suspended calls in exactly the reverse of the order they were made.
Each frame consumes memory, and the call stack has a finite size. If recursion never reaches a base case — or the input is so large that the depth is enormous — the stack fills up and the program crashes with a stack overflow. The simplest way to trigger it is to omit the base case entirely:
def infinite(n):
return infinite(n + 1) # no base case — depth grows without bound
infinite(1) # StackOverflow / RecursionError
| Avoiding stack overflow | How |
|---|---|
| Provide a base case | Every recursive routine needs at least one stopping condition. |
| Guarantee progress | Each recursive call must move strictly towards a base case. |
| Bound the depth | Many languages cap recursion depth (Python's default is around 1000). |
| Rewrite iteratively | For very deep recursion, an explicit loop avoids the stack entirely. |
The Fibonacci sequence (0, 1, 1, 2, 3, 5, 8, 13, …) is defined recursively with two base cases:
fib(0)=0,fib(1)=1,fib(n)=fib(n−1)+fib(n−2) for n>1
def fibonacci(n):
if n == 0:
return 0
elif n == 1:
return 1
else:
return fibonacci(n - 1) + fibonacci(n - 2)
print(fibonacci(7)) # Output: 13
Unlike factorial, which makes a single chain of calls, Fibonacci makes two recursive calls per step, so the calls form a branching tree. Tracing fibonacci(4):
flowchart TB
A["fib(4)"] --> B["fib(3)"]
A --> C["fib(2)"]
B --> D["fib(2)"]
B --> E["fib(1) = 1"]
C --> F["fib(1) = 1"]
C --> G["fib(0) = 0"]
D --> H["fib(1) = 1"]
D --> I["fib(0) = 0"]
Notice that fib(2) is computed twice and fib(1) three times. This duplicated work explodes as n grows: naive recursive Fibonacci performs roughly O(2n) calls, recomputing the same subproblems over and over. The substitution trace confirms the value:
Why this matters. Fibonacci is the standard cautionary tale: a recursive solution can be correct and beautiful yet hopelessly inefficient. The cure — memoisation (caching results) or an iterative loop — brings it down to O(n). Recognising when recursion causes repeated work is exactly the kind of analytical judgement examiners reward.
The contrast between Fibonacci and factorial is instructive and a favourite exam discriminator. Factorial recurses in a single chain: each call makes exactly one further call, so the recursion tree is a straight line of depth n, giving O(n) time and O(n) stack space. Fibonacci recurses in a branching tree: each call makes two, so the tree roughly doubles in width at each level and contains about 2n nodes — hence the exponential time. Yet Fibonacci's stack depth is still only O(n), because at any instant only one root-to-leaf path is "in progress" on the stack; the rest of the tree has either already returned or not yet been started. Being able to separate "how many calls in total" (time) from "how deep does the stack get" (space) is precisely the analytical skill the Big-O lesson formalises, and Fibonacci is the cleanest example where the two answers differ dramatically.
Binary search (met in the searching lesson) is naturally recursive: each step either finds the target or recurses on half the array, shrinking the problem by half every time. The base cases are "found it" and "search range empty (not present)".
def binary_search(data, target, low, high):
if low > high:
return -1 # BASE CASE: range empty, not found
mid = (low + high) // 2
if data[mid] == target:
return mid # BASE CASE: found
elif data[mid] < target:
return binary_search(data, target, mid + 1, high) # recurse right
else:
return binary_search(data, target, low, mid - 1) # recurse left
Tracing a search for 23 in the sorted array [3, 9, 14, 19, 23, 27, 31] (indices 0–6):
| Call | low | high | mid | data[mid] | Decision |
|---|---|---|---|---|---|
| 1 | 0 | 6 | 3 | 19 | 23 > 19 → recurse right |
| 2 | 4 | 6 | 5 | 27 | 23 < 27 → recurse left |
| 3 | 4 | 4 | 4 | 23 | found at index 4 |
This recursive form is worth comparing with the iterative binary search from the searching lesson, which uses a while low <= high loop and updates low/high in place. The two are logically identical and have the same O(logn) time, but the iterative version uses O(1) stack space whereas the recursive one uses O(logn) stack frames. Because logn grows so slowly — only about 30 frames for a billion-element array — the recursive version's stack cost is negligible in practice, which makes binary search a rare case where recursion's usual memory penalty essentially does not bite. That is a useful nuance: "recursion costs stack space" is true, but how much depends entirely on the recursion depth, and a logarithmic depth is nothing to fear.
Subscribe to continue reading
Get full access to this lesson and all 10 lessons in this course.