Writing code that looks right and writing code that is right are different achievements, and the gap between them is closed by testing. This lesson takes the programmer's view of testing — the activities you carry out as you build, especially in your own OCR coursework: unit testing individual subroutines in isolation, integration testing how they fit together, test-driven development (TDD) where the tests come first, the disciplined selection of test data (normal, boundary, erroneous), and assertions as a lightweight way to bake checks into code. We deliberately keep the focus on what a programmer does at the keyboard; the broader development-process testing — alpha/beta release testing, and the black-box/white-box framing used when a whole system is verified against its specification — is owned by the Software Systems course, and is cross-linked here rather than re-taught.
The reason this matters so much at A-Level is the NEA programming project: a substantial share of the coursework marks comes from evidencing a sound testing process, meaning a test plan with well-chosen data, tests run at the right granularity, and results that demonstrate the solution meets its requirements. So treat this lesson as practical advice for your own project as much as exam theory. A recurring principle: a test is only as good as its data, and the data that catches the most bugs lives at the boundaries and in the erroneous cases, not in the comfortable middle. Examples are in Python, with assertions used throughout.
Within H446 2.2.1 this lesson covers programmer-level testing. You should be able to:
These points paraphrase the specification; nothing is quoted verbatim.
Testing is not a phase tacked on at the end; competent programmers test continuously. The reasons:
A useful framing: verification asks "are we building the product right?" (does the code meet the spec) and validation asks "are we building the right product?" (does it meet the user's real need). Programmer-level testing is mostly verification; acceptance testing with the user — the Software Systems topic — is validation.
Unit testing tests the smallest testable parts — usually individual subroutines (functions/methods) — in isolation from the rest of the program. You feed a unit known inputs and check it returns the expected output, with nothing else in the way.
```python
def add(a: int, b: int) -> int:
    return a + b

# Unit tests for add(), using assertions
assert add(2, 3) == 5    # normal case
assert add(-1, 1) == 0   # mix of signs
assert add(0, 0) == 0    # zeros
```
The power of isolation is that a failing unit test points at one subroutine. Each test should ideally check one behaviour, so that when it fails you know exactly what broke.
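As a rough sketch of the "one behaviour per test" idea, the example below gives each behaviour its own small test function. The unit `clamp_mark` and the test names are hypothetical, invented only for illustration.

```python
def clamp_mark(mark: int) -> int:
    """Hypothetical unit under test: force a mark into the range 0-100."""
    return max(0, min(100, mark))


# Each test checks exactly one behaviour, so a failure names what broke.
def test_value_inside_range_is_unchanged():
    assert clamp_mark(45) == 45


def test_value_below_range_becomes_zero():
    assert clamp_mark(-7) == 0


def test_value_above_range_becomes_hundred():
    assert clamp_mark(130) == 100


for test in (
    test_value_inside_range_is_unchanged,
    test_value_below_range_becomes_zero,
    test_value_above_range_becomes_hundred,
):
    test()
    print(f"{test.__name__}: passed")
```

If `clamp_mark` were later changed so that values above 100 slipped through, only the third test would fail, and its name would say which behaviour had broken.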
| Strength of unit testing | Limitation |
|---|---|
| Fast to run; can run hundreds in seconds | Does not test how units interact |
| Pinpoints the faulty subroutine | Needs many tests for good coverage |
| Easy to automate and re-run | A unit can pass alone yet fail when combined |
A natural question on isolation: what if the unit you want to test calls another unit that is not finished, or is slow (a database, a network)? You replace that dependency with a simple stub — a stand-in that returns a fixed value — so the unit under test can be exercised alone. This is the same stub idea used in integration testing below.
Units that each pass in isolation can still fail together — a subroutine might return a list where the caller expects a dictionary, or two modules might disagree about units (pounds versus pence). Integration testing tests the interfaces and interactions between units that have already been unit tested.
| Strategy | How it proceeds | Helper used |
|---|---|---|
| Top-down | Test high-level modules first | Stubs stand in for lower modules not yet ready |
| Bottom-up | Test low-level modules first | Drivers call the modules from above |
| Big-bang | Combine everything, then test | none — risky, faults are hard to isolate |
```python
# Two units, each unit-tested in isolation first...
def to_pence(pounds: float) -> int:
    return round(pounds * 100)

def format_price(pence: int) -> str:
    return f"£{pence // 100}.{pence % 100:02d}"

# ...then an INTEGRATION test that the data passes correctly BETWEEN them
assert format_price(to_pence(3.5)) == "£3.50"   # 3.5 -> 350 pence -> "£3.50"
assert format_price(to_pence(0.0)) == "£0.00"
```
The integration test above would catch a mismatch the individual unit tests cannot — for instance, if to_pence returned pounds but format_price expected pence, each could still pass alone while the pair produces nonsense. For the NEA, integrating in small steps (a couple of modules at a time) makes faults far easier to pin down than a big-bang combine at the end.
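To make that mismatch concrete, here is a small sketch under stated assumptions: `read_price_in_pounds` is a hypothetical unit whose author works in whole pounds, while `format_price` expects pence. Each passes the unit test its own author would write, yet the pair produces a wrong price.

```python
def read_price_in_pounds() -> int:
    """Hypothetical unit A: its author works in whole pounds."""
    return 3


def format_price(pence: int) -> str:
    """Unit B: its author expects a value in pence."""
    return f"£{pence // 100}.{pence % 100:02d}"


# Each unit passes its own unit test...
assert read_price_in_pounds() == 3
assert format_price(300) == "£3.00"

# ...but combining them exposes the disagreement about units.
combined = format_price(read_price_in_pounds())
print(combined)  # £0.03, not the intended £3.00
assert combined != "£3.00"
```

Only a test that passes real data across the interface can reveal this kind of fault, which is the job of the integration test.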
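Integration rarely happens all at once, because some modules a unit depends on may be unwritten, slow, or awkward to run during a test. Two stand-ins keep testing moving:
- A stub stands in for a lower-level module that is not ready: the module under test calls it and gets back a fixed, predictable value.
- A driver stands in for a higher-level module that is not ready: it is temporary code that calls the module under test with known inputs.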
```python
# We want to test send_report(), but the real database is not ready.
# A STUB stands in for the lower module and returns a known value.
def get_total_sales_stub():
    return 1000   # fixed, predictable result, not the real query

def send_report(get_total):   # the module under test takes its dependency in
    total = get_total()
    return f"Total sales: £{total}"

# Integration test using the stub in place of the real database call
assert send_report(get_total_sales_stub) == "Total sales: £1000"
```
The point is that send_report can be integration-tested now, against a predictable stub, without waiting for — or depending on the correctness of — the real sales query. Top-down integration leans on stubs (test the top, fake the bottom); bottom-up integration leans on drivers (test the bottom, fake the top). Knowing which helper goes with which strategy is a small but examinable point, and for the NEA stubs are a practical way to test a module whose dependencies are not finished.
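The stub example covers the top-down case. For the bottom-up case, a driver might look something like the sketch below. The low-level unit `apply_discount`, the driver `discount_driver` and the test values are all hypothetical, chosen only to show the shape of the idea.

```python
def apply_discount(pence: int, percent: int) -> int:
    """Hypothetical low-level module under test: price after a percentage discount."""
    return pence - (pence * percent) // 100


def discount_driver() -> list:
    """DRIVER: temporary code standing in for the unwritten higher-level module.

    It calls the low-level unit with known inputs and collects the results.
    """
    cases = [(1000, 10), (1000, 0), (999, 50)]
    return [apply_discount(pence, percent) for pence, percent in cases]


# The driver lets the low-level module be exercised before its real caller exists
assert discount_driver() == [900, 1000, 500]
```

Once the real higher-level module is written, the driver is thrown away and the real caller takes its place.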
The single most examinable testing skill is choosing test data. Good data deliberately probes the edges and the invalid, because that is where bugs hide; running ten comfortable middle-of-the-range values proves very little.
| Category | What it is | Example for a mark field valid 0–100 |
|---|---|---|
| Normal / valid | Typical values well inside the valid range | 45, 72, 88 |
| Boundary | Values on the edges of the valid range (and just outside) | 0, 1 and 100 (valid); -1 and 101 (just invalid) |
| Erroneous / invalid | Values that should be rejected — out of range or wrong type | -5, 150, "abc", an empty entry |
Many logic errors are off-by-one mistakes, such as a < written where <= was meant, and these only show up at the boundary. A function that accepts marks "up to 100" might wrongly reject exactly 100 or wrongly accept 101; only tests at 100 and at 101 reveal it. That is why mark schemes (and the NEA) specifically reward boundary data at both ends.
```python
def is_valid_mark(mark) -> bool:
    return isinstance(mark, int) and 0 <= mark <= 100

# Normal
assert is_valid_mark(45) == True

# Boundary (the edges that catch off-by-one bugs)
assert is_valid_mark(0) == True
assert is_valid_mark(100) == True
assert is_valid_mark(-1) == False
assert is_valid_mark(101) == False

# Erroneous (wrong type / way out of range)
assert is_valid_mark("abc") == False
assert is_valid_mark(500) == False
```
Exam Tip: When asked for test data, give a value from each category and say which is which. Always include both the minimum and maximum boundary, plus one just outside each. Marks are routinely awarded specifically for boundary cases — a plan of only "normal" data scores poorly.
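To see why boundary data earns those marks, consider this sketch of a deliberately faulty validator. `is_valid_mark_buggy` is a hypothetical function with < written where <= was meant; normal data sails through it, and only the boundary data exposes the fault.

```python
def is_valid_mark_buggy(mark: int) -> bool:
    """Hypothetical faulty validator: '<' was written where '<=' was meant."""
    return 0 <= mark < 100


normal_data = [45, 72, 88]
boundary_data = {0: True, 100: True, -1: False, 101: False}  # value -> expected result

# Normal data all passes, so the bug stays hidden
assert all(is_valid_mark_buggy(mark) for mark in normal_data)

# Boundary data: collect every value where the actual result differs from the expected one
failures = [
    mark
    for mark, expected in boundary_data.items()
    if is_valid_mark_buggy(mark) != expected
]
print(failures)  # [100]
```

A test plan containing only 45, 72 and 88 would have reported this validator as working.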
An assertion states a condition the programmer believes must be true at that point; if it is false, the program stops immediately with an error. Assertions turn an expectation into an executable check, and they are the building block of the unit tests above.
```python
def average(values: list) -> float:
    assert len(values) > 0, "average() requires a non-empty list"   # precondition
    result = sum(values) / len(values)
    # Postcondition, assuming every value is a mark in the range 0-100
    assert 0 <= result <= 100, "average of marks must lie between 0 and 100"
    return result
```
There are two distinct uses, and it is worth separating them:
- `assert add(2, 3) == 5` checks a unit behaves correctly; this is the everyday unit-test form.
- `assert` inside a function documents and enforces an assumption (a precondition such as "the list is not empty", or a postcondition on the result) during development.

Assertions are a development tool: they catch programmer mistakes and broken assumptions early and loudly. They are not a substitute for proper validation of user input (the defensive programming of the exception-handling lesson): user-facing checks must handle bad input gracefully, whereas a failed assertion is meant to crash so the programmer notices the bug.
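The contrast between validation and assertion can be sketched as follows. The names `read_mark` and `grade`, and the pass mark of 40, are assumptions made up for this illustration rather than anything from the specification.

```python
def read_mark(raw: str) -> int:
    """Validation of user input: bad input is expected, so it is rejected gracefully."""
    try:
        mark = int(raw)
    except ValueError:
        raise ValueError(f"{raw!r} is not a whole number") from None
    if not 0 <= mark <= 100:
        raise ValueError(f"{mark} is outside the range 0-100")
    return mark


def grade(mark: int) -> str:
    """Internal helper: by this point the mark should already have been validated."""
    # If this fails, a programmer forgot to validate: that is a bug, not bad input
    assert 0 <= mark <= 100, "grade() called with an unvalidated mark"
    return "pass" if mark >= 40 else "fail"  # pass mark of 40 is hypothetical


for raw in ["72", "abc", "150"]:
    try:
        print(raw, "->", grade(read_mark(raw)))
    except ValueError as error:
        print(raw, "-> rejected:", error)
```

Python removes `assert` statements when it is run with the `-O` option, which is one more reason not to rely on them for checking user input.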
In test-driven development you write the test first, watch it fail, then write just enough code to pass it. The cycle is red–green–refactor:
```mermaid
flowchart LR
    R["RED: write a failing test<br/>for the next small behaviour"] --> G["GREEN: write the minimum<br/>code to make it pass"]
    G --> F["REFACTOR: clean up the code,<br/>tests still green"]
    F --> R
```
```python
# RED: write the test first; is_palindrome does not exist yet, so this fails
def test_is_palindrome():
    assert is_palindrome("racecar") == True
    assert is_palindrome("hello") == False
    assert is_palindrome("") == True   # decide the empty-string case up front

# GREEN: write the minimum code to pass the test
def is_palindrome(text: str) -> bool:
    return text == text[::-1]

# REFACTOR: improve names/structure while keeping the test green
```
The discipline has real benefits: tests exist from the very start so coverage is built-in; writing the test first forces you to decide the expected behaviour (including edge cases like the empty string) before coding; and the tests act as living documentation and a safety net for later change. The trade-off is that it requires discipline and can feel slow up front, and a poorly chosen test can lock in a bad design. For the NEA, even a light version — write a couple of expected results before coding a tricky subroutine — sharpens your thinking about what "correct" means.
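A second trip round the cycle might look like the sketch below. It assumes a new, hypothetical requirement (ignore case and spaces) that is not part of the original example; the point is only that the earlier tests are kept and must stay green.

```python
# RED: a new test for the next small behaviour (hypothetical requirement:
# ignore letter case and spaces). It fails against the first version.
def test_ignores_case_and_spaces():
    assert is_palindrome("Never odd or even") == True
    assert is_palindrome("Race car") == True
    assert is_palindrome("Race cars") == False


# GREEN: the minimum change that makes the new test pass
def is_palindrome(text: str) -> bool:
    cleaned = text.replace(" ", "").lower()
    return cleaned == cleaned[::-1]


# The earlier tests are kept and must still pass after the change
def test_original_behaviour():
    assert is_palindrome("racecar") == True
    assert is_palindrome("hello") == False
    assert is_palindrome("") == True


test_ignores_case_and_spaces()
test_original_behaviour()
print("All palindrome tests passed")
```

Each new behaviour gets its own failing test first, so the code never grows faster than the evidence that it works.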
Whenever you change code — fix a bug, add a feature, refactor — you risk breaking something that used to work. Regression testing simply means re-running your existing tests after every change to confirm nothing has regressed. Because the tests already exist, this is cheap, and it is the safety net that makes ongoing development safe:
```python
# Keep ALL earlier tests and re-run them after every change
def run_all_tests():
    assert is_valid_mark(100) == True    # an old test...
    assert is_valid_mark(101) == False
    assert format_price(to_pence(3.5)) == "£3.50"
    print("All tests passed")            # if any assert fails, we regressed
```
In professional practice this is automated and run on every commit (continuous integration). For the NEA, keeping a runnable set of tests and re-running them as you develop is exactly the kind of disciplined process the mark scheme rewards.
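One possible shape for such a runnable set of tests is sketched below. Unlike a chain of bare assertions, this runner carries on after a failure and reports every test that regressed. The unit `is_even`, the test names and `run_regression_suite` are all hypothetical.

```python
def is_even(n: int) -> bool:
    """Hypothetical unit under test."""
    return n % 2 == 0


def test_even_number():
    assert is_even(4)


def test_odd_number():
    assert not is_even(7)


def test_zero_is_even():
    assert is_even(0)


def run_regression_suite(tests) -> list:
    """Run every test, collecting the names of any that fail instead of stopping at the first."""
    failed = []
    for test in tests:
        try:
            test()
        except AssertionError:
            failed.append(test.__name__)
    return failed


failed = run_regression_suite([test_even_number, test_odd_number, test_zero_is_even])
print("All tests passed" if not failed else f"Regressions: {failed}")
```

Saving the printed output after each development session is a simple way to gather dated evidence of regression testing for a project write-up.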
The testing above is what you do while programming. The wider development process adds further testing stages that are framed from the project's point of view rather than the programmer's, and those are covered in the Software Systems course:
| Stage / framing | Owned by | One-line reminder |
|---|---|---|
| System testing | Software Systems | the whole integrated system against the requirements spec |
| Acceptance / alpha / beta | Software Systems | client and user testing before/at release |
| Black-box vs white-box | Software Systems | testing by spec (inputs→outputs) vs by code structure (paths/branches) |
It is still worth knowing the headline distinction so you can place your own work: black-box testing chooses data from the specification without looking at the code (boundary and erroneous data are black-box choices), whereas white-box testing uses knowledge of the code's structure to ensure every branch and statement is exercised. Your unit tests are usually a blend — you pick inputs from the spec (black-box) but also add cases to hit every branch (white-box). For the full treatment of those terms and the alpha/beta release stages, see Software Systems.
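The blend of the two approaches can be sketched like this. The function `grade_band` and its thresholds of 40 and 70 are invented for illustration, and the comments mark which cases come from the specification and which come from reading the code.

```python
def grade_band(mark: int) -> str:
    """Hypothetical unit: the thresholds are invented for illustration."""
    if mark >= 70:
        return "distinction"
    elif mark >= 40:
        return "pass"
    else:
        return "fail"


# Black-box: chosen from the specification alone (ends of the valid range)
assert grade_band(0) == "fail"
assert grade_band(100) == "distinction"

# White-box: chosen by reading the code, so that each branch runs
# and the value either side of each threshold is checked
assert grade_band(70) == "distinction"
assert grade_band(69) == "pass"
assert grade_band(40) == "pass"
assert grade_band(39) == "fail"
```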