You are viewing a free preview of this lesson.
Subscribe to unlock all 10 lessons in this course and every other course on LearningBro.
An algorithm is the single most important idea in the whole of computer science: it is the precise recipe that turns a vague problem ("sort these names", "find the shortest route") into a sequence of steps a machine can follow without thinking. Every other topic in this course — searching, sorting, graph traversal, recursion, complexity analysis — is ultimately the study of particular algorithms and how to compare them. This lesson lays the foundations: what an algorithm actually is, the disciplined ways we represent and trace one, how we break a hard problem into manageable pieces (decomposition), and the two yardsticks — correctness and efficiency — by which every algorithm in the rest of the course is judged.
By the end you should be able to read and write algorithms in OCR-style pseudocode, draw and interpret a flowchart, dry-run an algorithm with a trace table, and explain in plain language why one algorithm might be "better" than another even before you meet the formal Big-O machinery in a later lesson. These are not abstract skills hoarded for the exam: every professional programmer reaches for exactly this toolkit — sketch the method, trace it on paper, estimate its cost — before committing a single line to a real codebase.
This lesson supports the parts of the OCR H446 specification dealing with the nature of algorithms and algorithm design and analysis (the early part of section 2.3, with links back to 2.2 problem-solving and 1.4 data types/structures). In your own words, the assessable ideas are:
Throughout this course we paraphrase the specification rather than quoting it. You should always cross-check the exact wording against the current OCR H446 specification document.
An algorithm is a finite sequence of well-defined instructions designed to perform a specific task or solve a particular problem. The word predates computing — it derives from the 9th-century mathematician al-Khwārizmī — and the concept is older still: long division is an algorithm you learned at primary school. Computers did not invent algorithms; they simply execute them faster and more reliably than humans.
For a procedure to count as an algorithm it must satisfy five classical properties:
| Property | Description | Why it matters |
|---|---|---|
| Input | Zero or more inputs are supplied from a well-defined set. | Without defined inputs the procedure is ambiguous. |
| Output | At least one output is produced. | An algorithm that produces nothing observable solves nothing. |
| Definiteness | Each step is precisely and unambiguously defined. | "Add a pinch of salt" is fine for a cook but not for a CPU. |
| Finiteness | It must terminate after a finite number of steps. | A procedure that never stops is not an algorithm (see the halting problem, taught later). |
| Effectiveness | Each step must be basic enough to actually be carried out. | "Compute the largest prime" is not effective; "test each number up to n" is. |
These three words are often blurred and the exam rewards precision:
The same algorithm can be implemented as many different programs; the same problem can be solved by many different algorithms. A central skill in this course is choosing the right algorithm for a problem and then justifying that choice.
Real problems are rarely small enough to solve in one leap. Two design techniques tame them.
Decomposition means breaking a problem into smaller sub-problems, each of which is easier to reason about and can be solved (and tested) independently. "Build a school timetable" decomposes into "read the constraints", "assign teachers to classes", "detect clashes", "output the grid". Each sub-problem may decompose further. Decomposition is what makes large software tractable and is the conceptual root of writing subroutines (functions and procedures).
Abstraction means hiding unnecessary detail so you can focus on what matters. When you write sort(list) you are abstracting away how the sort happens. The London Underground map is the classic abstraction: it discards true geography to make the network navigable. Abstraction and decomposition work together — you decompose into sub-problems, then abstract each into a named operation you can use without re-reading its internals.
# OCR-style pseudocode: decomposing "report the average mark"
function getAverage(marks, size)
total = sumArray(marks, size) // sub-problem 1 (abstracted away)
return total / size // sub-problem 2
endfunction
function sumArray(data, size)
total = 0
for i = 0 to size - 1
total = total + data[i]
next i
return total
endfunction
Exam Tip: If a question says "describe how you would decompose this problem", list named sub-problems and say briefly what each does — do not just rewrite the whole thing as one block of code.
Before you can analyse or implement an algorithm you must express it. There are four standard representations; OCR examines the first three most heavily.
Pseudocode is a structured, human-readable description of an algorithm that resembles a programming language but is not tied to any specific syntax. It lets you reason about logic without worrying about a real compiler. OCR uses its own reference conventions, and you should write in that style in the exam:
| Construct | OCR Pseudocode |
|---|---|
| Assignment | variable = value |
| Output | print(value) |
| Input | variable = input("prompt") |
| Selection | if condition then ... elseif ... else ... endif |
| Count-controlled loop | for i = 0 to 9 ... next i |
| Condition-controlled loop | while condition ... endwhile |
| Post-condition loop | do ... until condition |
| Function (returns a value) | function name(params) ... return value ... endfunction |
| Procedure (no return) | procedure name(params) ... endprocedure |
| Array declaration / access | array name[size] / name[index] |
| String length / substring | name.length / name.substring(start, length) |
| Integer division / remainder | a DIV b / a MOD b |
| Boolean operators | AND, OR, NOT |
A worked pseudocode example — finding the maximum of an array:
# OCR-style pseudocode
function findMax(numbers, size)
max = numbers[0]
for i = 1 to size - 1
if numbers[i] > max then
max = numbers[i]
endif
next i
return max
endfunction
A flowchart is a visual representation using standard symbols connected by directed arrows. It is excellent for showing control flow (branching and looping) at a glance.
| Symbol | Shape | Purpose |
|---|---|---|
| Terminator | Rounded rectangle / oval | Start or end of the algorithm |
| Process | Rectangle | An instruction or action |
| Decision | Diamond | A yes/no (Boolean) question — branching |
| Input/Output | Parallelogram | Reading input or displaying output |
| Flow line | Arrow | Direction of flow between steps |
The flowchart below classifies a number as positive, negative or zero. Notice every path eventually reaches a terminator and each decision diamond has exactly two exits:
flowchart TD
Start([START]) --> Input[/Input: number/]
Input --> D1{number > 0?}
D1 -- Yes --> Pos[/Output: 'Positive'/]
D1 -- No --> D2{number < 0?}
D2 -- Yes --> Neg[/Output: 'Negative'/]
D2 -- No --> Zero[/Output: 'Zero'/]
Pos --> End([END])
Neg --> End
Zero --> End
A loop is drawn as a flow line that travels back to an earlier decision. The flowchart below sums the integers 1 to n, showing the characteristic backward edge of a count-controlled loop:
flowchart TD
S([START]) --> Init[total = 0; i = 1]
Init --> Cond{i <= n?}
Cond -- Yes --> Body[total = total + i; i = i + 1]
Body --> Cond
Cond -- No --> Out[/Output total/]
Out --> E([END])
Exam Tip: You must be able to read a flowchart and create one from a description. Common mark-losers: a decision box with one exit, a flow line that goes nowhere, or a missing terminator. Decisions must be Boolean (answerable yes/no).
Structured English uses ordinary English restricted to imperative, numbered steps ("1. Read the list. 2. While items remain, …"). It is the least precise representation but useful for an initial sketch. Program code (e.g. Python) is the most precise but ties you to one language. In the exam you usually move from a structured-English idea, to pseudocode, to (sometimes) code.
Dry-running (also called desk-checking) means manually stepping through an algorithm without a computer to work out what it does, verify correctness, and locate logic errors. The standard tool for this is a trace table: one column per variable (plus columns for any conditions evaluated and for output), and one row per step of execution.
# OCR-style pseudocode
x = 3
y = 5
while x < y
x = x + 1
y = y - 1
endwhile
print(x)
print(y)
| Step | x | y | x < y | Output |
|---|---|---|---|---|
| Initialise x | 3 | - | - | |
| Initialise y | 3 | 5 | - | |
| Test condition | 3 | 5 | true | |
| Loop body | 4 | 4 | - | |
| Test condition | 4 | 4 | false | |
| 4 | 4 | - | 4, 4 |
Output: 4, 4 — the loop runs only once because after the single iteration the condition is already false.
# OCR-style pseudocode
array data[5] = [7, 2, 9, 4, 6]
max = data[0]
for i = 1 to 4
if data[i] > max then
max = data[i]
endif
next i
print(max)
| Step (i) | data[i] | data[i] > max | max | Output |
|---|---|---|---|---|
| Init | - | - | 7 | |
| 1 | 2 | false | 7 | |
| 2 | 9 | true | 9 | |
| 3 | 4 | false | 9 | |
| 4 | 6 | false | 9 | 9 |
Output: 9. Notice how the table makes the single update (at i = 2) visible — exactly the kind of detail trace-table marks reward.
Nested loops are the classic trap in trace questions because the inner loop runs in full for each pass of the outer loop. The algorithm below prints a small multiplication grid:
# OCR-style pseudocode
for row = 1 to 2
for col = 1 to 3
print(row * col)
next col
next row
| Outer (row) | Inner (col) | row * col | Output so far |
|---|---|---|---|
| 1 | 1 | 1 | 1 |
| 1 | 2 | 2 | 1, 2 |
| 1 | 3 | 3 | 1, 2, 3 |
| 2 | 1 | 2 | 1, 2, 3, 2 |
| 2 | 2 | 4 | 1, 2, 3, 2, 4 |
| 2 | 3 | 6 | 1, 2, 3, 2, 4, 6 |
The inner loop executes its three iterations twice — once per outer pass — giving 2 × 3 = 6 print statements. This multiplicative behaviour is exactly why two nested loops over the same data give O(n2) running time, an idea you will formalise in the complexity lesson and meet in practice in bubble sort.
Exam Tip: Trace-table questions carry several marks and are often where careless candidates leak the easiest marks. Show every change, not just the final answer; an error in an early row cascades into every row below it. For nested loops, complete the inner loop fully before incrementing the outer counter.
Every algorithm — no matter how large — is built from just three control-flow constructs. Recognising them is the bridge between flowcharts, pseudocode and real code.
| Construct | Meaning | Pseudocode form | Flowchart shape |
|---|---|---|---|
| Sequence | Steps executed one after another, top to bottom. | Successive lines | Boxes joined by arrows |
| Selection | A choice between alternative paths. | if ... then ... else ... endif | A decision diamond |
| Iteration | Repetition of a block. | for, while, do ... until | A flow line that loops back |
The structured-programming theorem states that any computable algorithm can be written using only these three constructs — no jumps or gotos required. That is why a flowchart and the equivalent pseudocode always map onto one another: a diamond is an if or a loop test; a backward arrow is the loop's repetition; the straight-through boxes are sequence. When you convert one representation to another in the exam, identify which of the three constructs each fragment represents and translate construct by construct.
A small number of patterns recur across almost every algorithm you will meet. Recognising them lets you read unfamiliar code quickly.
Counting (how many items satisfy a condition):
# OCR-style pseudocode
count = 0
for i = 0 to n - 1
if condition(data[i]) then
count = count + 1
endif
next i
Accumulation / running total (sum, product, concatenation):
# OCR-style pseudocode
total = 0
for i = 0 to n - 1
total = total + data[i]
next i
average = total / n
Finding maximum / minimum (running "best so far"):
# OCR-style pseudocode
max = data[0]
for i = 1 to n - 1
if data[i] > max then
max = data[i]
endif
next i
Linear search (the simplest search, expanded in the next lesson):
# OCR-style pseudocode
function linearSearch(data, target, size)
for i = 0 to size - 1
if data[i] == target then
return i
endif
next i
return -1
endfunction
A key idea — and one examiners love to probe — is that the same problem can be solved by different algorithms with very different characteristics. Consider the problem "is the value target present in the array data?". Two correct algorithms solve it:
# OCR-style pseudocode — Algorithm A: linear scan
function containsLinear(data, target, size)
for i = 0 to size - 1
if data[i] == target then
return true
endif
next i
return false
endfunction
# OCR-style pseudocode — Algorithm B: binary search (data must be sorted)
function containsBinary(data, target, size)
low = 0
high = size - 1
while low <= high
mid = (low + high) DIV 2
if data[mid] == target then
return true
elseif target < data[mid] then
high = mid - 1
else
low = mid + 1
endif
endwhile
return false
endfunction
Both return the right answer, so both are correct. But they differ on every other axis: Algorithm A works on any data and does up to n comparisons; Algorithm B requires the data to be sorted in advance and does only about log2n comparisons. Which is "better" depends entirely on context — on whether the data is already sorted, how big it is, and how often you will search it. This is the recurring lesson of the whole course: correctness is the entry ticket; choosing well among correct algorithms is the real skill. Each algorithm you meet from here on should prompt you to ask "what does this assume about its input, and what does that buy me?"
Once you can express an algorithm you can ask the two questions that drive the whole course.
Correctness — does the algorithm produce the right output for all valid inputs (including awkward edge cases such as an empty list, a single element, or a target that is absent)? A useful informal check is to identify the invariant: something that stays true on every loop iteration. In the findMax example the invariant is "max holds the largest value seen so far", which guarantees correctness when the loop ends.
Efficiency — how do the algorithm's time (number of basic operations) and space (extra memory) requirements grow as the input size n grows? We measure growth, not a stopwatch reading, because hardware varies; an algorithm that does roughly n operations will always overtake one that does roughly n² once n is large enough.
| Yardstick | Question it answers | Example contrast |
|---|---|---|
| Correctness | Right answer for every valid input? | A sort that mishandles duplicates is incorrect. |
| Time efficiency | How fast as n grows? | Linear search (O(n)) vs binary search (O(logn)). |
| Space efficiency | How much extra memory? | In-place bubble sort vs merge sort's extra arrays. |
| Termination | Does it always stop? | A while loop with a bad condition may loop forever. |
The intuition is captured by Big-O notation, which you will meet formally in a later lesson. For now, hold this picture: doubling the input might leave a O(logn) algorithm almost unchanged, double the work of an O(n) algorithm, and quadruple the work of an O(n2) one. An algorithm can be correct but unusably slow, or fast but wrong — a good algorithm is both correct and efficient.
To make the difference concrete, the table below shows roughly how many basic operations each growth rate implies as the input size n climbs. The gap is not academic: at n = 1,000,000 a O(logn) algorithm finishes in about 20 steps while a O(n2) algorithm needs a trillion.
| n | O(logn) | O(n) | O(nlogn) | O(n2) |
|---|---|---|---|---|
| 10 | ~3 | 10 | ~33 | 100 |
| 1,000 | ~10 | 1,000 | ~10,000 | 1,000,000 |
| 1,000,000 | ~20 | 1,000,000 | ~20,000,000 | 1,000,000,000,000 |
This is why algorithm choice matters more than raw hardware: no amount of faster silicon rescues an O(n2) method on large data, whereas switching to an O(nlogn) algorithm transforms a problem from "runs overnight" to "runs in a blink". You will return to this table mentally every time you compare two algorithms in this course.
A second, subtler point: efficiency often involves a trade-off between time and space. Binary search is fast but needs the data sorted and stored in an indexable array; a hash table can search in roughly constant time but spends extra memory on empty slots. There is rarely a single "best" algorithm — only the best one for this problem, this data size, and these constraints. Identifying that trade-off explicitly is a hallmark of top-band exam answers.
(a) State two properties that a sequence of instructions must have in order to be classed as an algorithm. (2 marks)
(b) A programmer is writing software to manage a sports club. Explain how decomposition could be used in the design of this software. (3 marks)
(c) Study the algorithm below.
# OCR-style pseudocode
total = 0
count = 0
for i = 0 to 4
if data[i] MOD 2 == 0 then
total = total + data[i]
count = count + 1
endif
next i
if count > 0 then
print(total / count)
else
print("none")
endif
The array is data = [3, 8, 5, 4, 6]. Complete a trace table for the algorithm and state the output. (5 marks)
(a) It must be finite (it stops) and each step must be clearly defined.
(b) Decomposition means breaking the problem into smaller parts. The programmer could split the club software into a part for members, a part for fixtures and a part for payments. Each part is easier to write on its own.
(c)
| i | data[i] | even? | total | count |
|---|---|---|---|---|
| 0 | 3 | no | 0 | 0 |
| 1 | 8 | yes | 8 | 1 |
| 2 | 5 | no | 8 | 1 |
| 3 | 4 | yes | 12 | 2 |
| 4 | 6 | yes | 18 | 3 |
Output: 18 / 3 = 6.
(a) Finiteness — the algorithm must terminate after a finite number of steps rather than running forever. Definiteness — every step must be precisely and unambiguously defined so it can be carried out without interpretation. (Either of input, output or effectiveness would also be creditworthy.)
(b) Decomposition is the design technique of breaking a large problem into smaller, self-contained sub-problems that can each be solved and tested independently and then combined. For the sports-club software the overall task "manage the club" could be decomposed into discrete modules: a membership module (add, edit and search members), a fixtures module (schedule matches and detect clashes), a payments module (record subscriptions and produce reminders) and a reporting module. Each module can be developed by a different person, tested in isolation, and reused; for example the membership search could itself be decomposed further into "validate the search term" and "perform the search". This reduces complexity, improves maintainability, and lets the team work in parallel — the practical pay-off of decomposition.
(c) Tracing line by line, the loop accumulates only even values:
| i | data[i] | data[i] MOD 2 == 0 | total | count |
|---|---|---|---|---|
| (init) | - | - | 0 | 0 |
| 0 | 3 | false | 0 | 0 |
| 1 | 8 | true | 8 | 1 |
| 2 | 5 | false | 8 | 1 |
| 3 | 4 | true | 12 | 2 |
| 4 | 6 | true | 18 | 3 |
After the loop count = 3 > 0, so the algorithm prints total / count = 18 / 3. Output: 6. (The algorithm computes the mean of the even numbers in the array; the else branch guards against division by zero when no even numbers are present.)
The mid-band answer is correct throughout and would score well, but it leaves marks on the table: in (b) it names modules without explaining why decomposition helps (independent testing, parallel working, reuse), and in (c) it omits the initialisation row and the condition column, so a single arithmetic slip would be hard to award method marks for. The top-band answer earns full credit by being explicit about each property in (a), by justifying the benefits of decomposition in (b), and by showing a complete, well-headed trace in (c) — including the initialisation row, the Boolean condition column, and a closing sentence that interprets what the algorithm actually computes and why the else guard exists. Stating the purpose of an algorithm after tracing it is a reliable way to demonstrate the analytical understanding AO3 rewards.
Exam Tip: Almost every H446 paper contains a trace-table question and at least one "describe/explain the algorithm" question. Drill both: trace meticulously with full columns, and when describing an algorithm always state what problem it solves before how it works.