This lesson covers software testing from the systems/development-process angle — when in the lifecycle tests happen, who runs them, what kind of access they assume (black/white/grey box), and how to choose disciplined test data — together with the debugging tools a developer uses to locate a fault once a test fails. The programmer-level detail of writing unit and integration tests in a framework belongs to the Programming course; this lesson cross-links to that and owns the process view.
This lesson develops OCR H446 section 1.2.3 (Software Development — testing). It requires you to distinguish testing by stage (alpha and beta; iterative testing during development versus final/acceptance testing at the end), to compare black-box, white-box and grey-box approaches by how much of the internals the tester can see, to select test data of the three kinds (normal, boundary, erroneous) for a given scenario and state the expected outcome of each, and to describe the debugging facilities (breakpoints, watches, single-step/trace, call-stack inspection) used to diagnose faults. The companion Programming course owns framework-level unit/integration test implementation; here we own the testing process and the choice of test data.
Testing exists to:
The verify/validate distinction is examined: verification checks the product against the specification; validation checks it against the user's real need. A program can pass verification (it matches the spec) yet fail validation (the spec was wrong).
A useful first split is when testing happens relative to development.
| | Iterative (developmental) testing | Final / acceptance testing |
|---|---|---|
| When | Continuously, during development, as each module is built and changed | Once, after development, before the system goes live |
| Who | The developers themselves | An independent test team and, for acceptance, the client/end users |
| Purpose | Catch defects early; confirm each new piece works before moving on | Confirm the finished system meets requirements and is acceptable to deploy |
| Mindset | "Is this part right yet?" | "Is the whole thing fit for release?" |
Iterative testing is woven through the build: you test a module, fix it, extend it, test again — a tight loop that surfaces problems while they are cheap to fix. This is exactly the rhythm Agile and XP formalise (lesson #7). Final testing is the gate at the end; its most important form is acceptance testing, where the client confirms the system does what they actually contracted for. Acceptance testing is the moment validation (not just verification) is signed off.
Within development, testing is applied at increasing scope, from a single component to the whole system.
| Aspect | Unit testing |
|---|---|
| What is tested | Individual components — typically a single function, method or class |
| Who does it | The developer who wrote the code |
| When | During development, as each unit is written (iterative) |
| Goal | Verify that each unit works correctly in isolation |
| Tools | Automated testing frameworks (e.g. JUnit, pytest, NUnit) |
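As a minimal sketch of what testing one unit in isolation looks like, the block below checks a single hypothetical function, `percentage`. The function and the test names are assumptions made for illustration. The tests use the `test_` naming convention that pytest discovers, but they are plain functions and can also be called directly.

```python
def percentage(score, total):
    """Hypothetical unit under test: return score as a percentage of total."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100 * score / total


def test_typical_score():
    # Normal case: 45 out of 60 is 75%.
    assert percentage(45, 60) == 75.0


def test_zero_score():
    assert percentage(0, 60) == 0.0


def test_rejects_zero_total():
    # The unit should refuse a meaningless total rather than divide by zero.
    try:
        percentage(10, 0)
    except ValueError:
        return
    raise AssertionError("expected ValueError for total = 0")


if __name__ == "__main__":
    for test in (test_typical_score, test_zero_score, test_rejects_zero_total):
        test()
    print("all unit tests passed")
```

Each test exercises only `percentage` and nothing else, which is what makes a failure easy to attribute to that one unit.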
| Aspect | Integration testing |
|---|---|
| What is tested | The interactions between two or more units/modules once combined |
| Who does it | Developers or a dedicated testing team |
| When | After unit testing, as modules are combined |
| Goal | Verify that modules work together — that data passes correctly between components |
| Common faults found | Interface mismatches, incorrect data formats, timing issues, missing error handling at module boundaries |
| Integration approach | Description |
|---|---|
| Top-down | Start with the highest-level module and integrate downward, using stubs (dummy modules) to stand in for lower modules not yet built |
| Bottom-up | Start with the lowest-level modules and integrate upward, using drivers (dummy callers) to invoke modules not yet integrated |
| Big bang | Integrate everything at once and test — simple, but a fault is hard to isolate among many new interactions |
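The difference between a stub and a driver can be sketched in a few lines. Every name below (`order_total`, `lookup_price_stub`, `apply_discount`, `discount_driver`) is hypothetical and chosen only to illustrate the two roles; this is not taken from any particular system.

```python
# Top-down: the high-level module exists, the lower-level one does not yet.
def lookup_price_stub(item_code):
    """Stub: stands in for a price-lookup module that is not built yet.
    Returns a fixed, known value so the caller can be tested."""
    return 2.50


def order_total(item_code, quantity, lookup_price):
    """High-level module under test; the lookup routine is passed in."""
    return lookup_price(item_code) * quantity


# Bottom-up: the low-level module exists, its real caller does not yet.
def apply_discount(total, percent):
    """Low-level module under test."""
    return round(total * (100 - percent) / 100, 2)


def discount_driver():
    """Driver: a throwaway caller that feeds inputs to apply_discount
    and collects the results for checking."""
    return [apply_discount(100.0, 10), apply_discount(80.0, 25)]


if __name__ == "__main__":
    print(order_total("A17", 4, lookup_price_stub))  # uses the stub
    print(discount_driver())
```

The stub sits below the module being tested and is called by it; the driver sits above the module being tested and calls it. Both are discarded once the real modules are integrated.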
| Aspect | System testing |
|---|---|
| What is tested | The complete, integrated system as a whole |
| Who does it | A dedicated test team (independent of the developers) |
| When | After integration testing |
| Goal | Verify the entire system meets the requirements specification |
| Types | Functional (does it do what it should?), performance (is it fast enough?), security, usability |
| Aspect | Acceptance testing |
|---|---|
| What is tested | The system against the client's real-world requirements |
| Who does it | The client or end users |
| When | After system testing, before deployment |
| Goal | Validate that the system meets the client's actual needs and is acceptable to deploy |
| Types | User Acceptance Testing (UAT) — end users exercise it in a realistic scenario; contract acceptance — checked against the agreed contract specification |
The natural sequence is therefore unit → integration → system → acceptance, narrowing from "does this function work?" to "will the client sign it off?".
These three classify testing by how much of the internal code the tester can see.
| Aspect | Black-box testing |
|---|---|
| Approach | The tester does not see the internal code; tests are derived only from the inputs and expected outputs in the specification |
| Also called | Functional / specification-based testing |
| Advantage | Tests the software from the user's perspective; no programming knowledge needed |
| Disadvantage | May miss internal faults that happen not to affect the outputs tested |
| Typically used at | System and acceptance testing |
| Aspect | White-box testing |
|---|---|
| Approach | The tester has the source code and designs tests to exercise specific code paths, branches and conditions |
| Also called | Structural / glass-box testing |
| Coverage techniques | Statement coverage (every line run at least once), branch coverage (every if/else path taken), path coverage (every route through the code) |
| Advantage | Thorough — can drive untested branches and reveal dead or unreachable code |
| Disadvantage | Time-consuming; cannot reveal a missing feature, because it only tests code that exists |
| Typically used at | Unit and integration testing |
| Aspect | Grey-box testing |
|---|---|
| Approach | A hybrid — the tester has partial knowledge of internals (e.g. the data structures, the database schema or the architecture) but tests largely through the external interface |
| Why it exists | Combines black-box's user-realistic inputs with just enough internal insight to design smarter tests (e.g. deliberately targeting a value that is known internally to be a special case) |
| Typical use | Integration testing and security/penetration testing, where some design knowledge sharpens otherwise external tests |
| Feature | Black-box | Grey-box | White-box |
|---|---|---|---|
| Code access | None | Partial (structures/schema) | Full source |
| Tests based on | Specification | Spec + some internals | Code structure |
| Tester knowledge | No programming needed | Some design knowledge | Must read the code |
| Best for | Functional correctness, acceptance | Integration, security | Coverage, logic errors |
The white-box coverage terms are best understood on a tiny routine. Consider a function that classifies a number:

    def classify(n):
        label = "non-negative"
        if n < 0:
            label = "negative"
        return label

A single test, classify(5), already runs every line (the if line is executed even though its body is skipped), so it achieves 100% statement coverage. Yet it never makes the condition n < 0 true, so the label = "negative" branch is untested — branch coverage is only 50%. To reach 100% branch coverage you need a second test, classify(-3), that takes the true branch:
| Test | Input | Branch taken | Statement coverage | Branch coverage |
|---|---|---|---|---|
| 1 only | 5 | if false | 100% | 50% |
| 1 + 2 | 5, -3 | both | 100% | 100% |
The lesson is that statement coverage is weaker than branch coverage: you can execute every line without exercising every decision outcome. This is why thorough white-box testing aims for branch (or even path) coverage, and why "we ran every line" is not the same as "we tested every case" — a distinction examiners like to probe.
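The 50% and 100% figures can be checked with a hand-instrumented copy of `classify`. This is only a sketch: the `outcomes` parameter, the explicit `else`, and the helper `branch_coverage` are additions made for illustration, and a real project would normally rely on a coverage tool rather than manual instrumentation.

```python
def classify(n, outcomes):
    """classify from the text, instrumented to record each decision outcome."""
    label = "non-negative"
    if n < 0:
        outcomes.add("n < 0 is true")
        label = "negative"
    else:
        outcomes.add("n < 0 is false")
    return label


def branch_coverage(inputs):
    """Run classify on each input and return the percentage of the two
    possible decision outcomes that were exercised."""
    outcomes = set()
    for n in inputs:
        classify(n, outcomes)
    return 100 * len(outcomes) // 2


if __name__ == "__main__":
    print(branch_coverage([5]))      # only the false outcome is taken
    print(branch_coverage([5, -3]))  # both outcomes are taken
```

With the single input 5 the helper reports 50; adding -3 raises it to 100, matching the two rows of the table.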
Alpha and beta classify late-stage testing by who tests and where.
| Aspect | Alpha testing |
|---|---|
| When | Before release to any external users |
| Who | Internal staff or a selected internal group |
| Environment | Controlled — usually the developer's own environment |
| Goal | Catch defects before exposing the product externally; confirm core functionality |
| Aspect | Beta testing |
|---|---|
| When | After alpha — released to a limited group of external users |
| Who | Real users outside the organisation |
| Environment | Uncontrolled — users' own hardware, OS and configurations |
| Goal | Surface faults that only appear in real-world conditions; gather usability feedback |
| Feature | Alpha | Beta |
|---|---|---|
| Testers | Internal | External users |
| Environment | Controlled (developer site) | Uncontrolled (user site) |
| Stage | Earlier | Later |
| Focus | Core functionality, critical bugs | Real-world issues, usability, edge cases |
The reason beta testing is valuable is precisely that it is uncontrolled: real users on unpredictable hardware and with unpredictable habits exercise combinations the developers never imagined — exactly the gap black-box testing in a clean lab leaves open.
Thorough testing uses three kinds of test data. The skill the exam rewards is choosing specific values of each kind for a scenario and stating the expected outcome.
| Data type | What it is | Purpose |
|---|---|---|
| Normal (valid) | Typical, everyday inputs the program should accept | Confirm correct behaviour under ordinary conditions |
| Boundary (extreme) | Values right at the edges of the valid range, on both sides | Catch off-by-one errors — the most common defect at boundaries |
| Erroneous (invalid) | Inputs the program should reject (out of range, wrong type) | Confirm the program rejects bad input gracefully rather than crashing |
Scenario: a function validates a school exam mark, which must be an integer from 0 to 100 inclusive. Design a test table.
| Test | Input | Type | Expected outcome |
|---|---|---|---|
| 1 | 57 | Normal | Accepted as a valid mark |
| 2 | 0 | Boundary (lower limit, valid) | Accepted |
| 3 | 100 | Boundary (upper limit, valid) | Accepted |
| 4 | -1 | Boundary (just below, invalid) | Rejected with "out of range" message |
| 5 | 101 | Boundary (just above, invalid) | Rejected with "out of range" message |
| 6 | 500 | Erroneous (far out of range) | Rejected with "out of range" message |
| 7 | "pass" | Erroneous (wrong type) | Rejected with "must be a whole number" message |
| 8 | "" (empty) | Erroneous (missing input) | Rejected with "value required" message |
Notice the pairing at each boundary: the valid edge (0, 100) and the first invalid value beyond it (−1, 101). Testing both sides of a boundary is what catches the classic < vs <= mistake. Every row also states an expected outcome, which is what makes a test table usable: a test with no expected result cannot pass or fail.
Exam Tip: When asked to provide test data, always give all three kinds, include values on both sides of each boundary, and state an expected outcome for every value. A bare list of numbers with no expected results scores poorly.
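The exam-mark test table translates directly into code. The sketch below assumes the mark arrives as typed text and that the validator returns a short message; the function name `validate_mark`, the exact return strings and the `run_table` helper are hypothetical choices made for illustration.

```python
def validate_mark(raw):
    """Hypothetical validator: takes the raw text a user typed and returns
    "accepted" or a rejection message matching the test table."""
    text = raw.strip()
    if text == "":
        return "value required"
    try:
        mark = int(text)
    except ValueError:
        return "must be a whole number"
    if mark < 0 or mark > 100:
        return "out of range"
    return "accepted"


# (input, expected outcome) pairs taken from the eight rows of the table.
TEST_TABLE = [
    ("57", "accepted"),
    ("0", "accepted"),
    ("100", "accepted"),
    ("-1", "out of range"),
    ("101", "out of range"),
    ("500", "out of range"),
    ("pass", "must be a whole number"),
    ("", "value required"),
]


def run_table():
    """Return the rows whose actual outcome differs from the expected one."""
    return [(raw, expected, validate_mark(raw))
            for raw, expected in TEST_TABLE
            if validate_mark(raw) != expected]


if __name__ == "__main__":
    print(run_table())  # an empty list means every test passed
```

If the range check were mistakenly written as `mark <= 0`, the row for input 0 would appear in the failure list, which is exactly the kind of fault the boundary rows exist to catch.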
Testing tells you that a defect exists; debugging is the process of locating why. A modern IDE debugger (lesson #10) provides a toolkit for this.
| Tool | What it does | How it helps find the bug |
|---|---|---|
| Breakpoint | Marks a line where execution pauses | Lets you stop just before the suspected fault and inspect state |
| Conditional breakpoint | Pauses only when a condition is true (e.g. i == 999) | Skips thousands of irrelevant iterations to reach the failing case |
| Single-step / step over | Runs one line, then pauses again | Follow control flow line by line without diving into called routines |
| Step into / step out | Enter a called routine / finish it and return | Investigate or skip the internals of a function call |
| Watch | Continuously displays a chosen variable/expression | See the exact moment a value becomes wrong |
| Call stack | Shows the chain of calls that reached the current line | Identify which caller led here when a routine misbehaves |
| Trace / trace table | A line-by-line record of variable values | Reconstruct execution; the manual paper version is examinable |
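To make the call-stack row concrete, here is a small sketch that captures the chain of calls programmatically using Python's standard `traceback` module, which is roughly the information a debugger's call-stack panel displays. The three function names are hypothetical.

```python
import traceback


def inner():
    # Capture the names of the functions currently on the call stack,
    # oldest call first, ending with this function.
    return [frame.name for frame in traceback.extract_stack()]


def middle():
    return inner()


def outer():
    return middle()


if __name__ == "__main__":
    # Show only the last three entries: the chain outer -> middle -> inner.
    print(" -> ".join(outer()[-3:]))
```

Reading the chain from right to left answers the debugging question "who called this routine?" without stepping back through the program by hand.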
Suppose this function is meant to sum the numbers 1 to n but returns the wrong total:

    def sum_to(n):
        total = 0
        for i in range(1, n):  # bug: range(1, n) stops at n-1
            total = total + i
        return total

A disciplined debug: set a breakpoint on the return line and a watch on total and i. Run sum_to(5), expecting 1+2+3+4+5 = 15. Single-stepping the loop and recording a trace table shows total reaching only 10: the loop's last pass is i = 4, so 5 is never added.
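The watch on `i` and `total` can be imitated in code. The sketch below keeps the bug in place on purpose and records both values after every pass of the loop; the name `sum_to_traced` and the printed layout are assumptions made for illustration.

```python
def sum_to_traced(n):
    """The buggy sum_to from the text, recording (i, total) after each pass."""
    trace = []
    total = 0
    for i in range(1, n):  # bug kept deliberately: stops at n - 1
        total = total + i
        trace.append((i, total))
    return total, trace


if __name__ == "__main__":
    total, trace = sum_to_traced(5)
    print(" i | total")
    for i, running in trace:
        print(f"{i:2} | {running:5}")
    print("returned:", total)
```

For `n = 5` the recorded rows are (1, 1), (2, 3), (3, 6) and (4, 10). There is no row for `i = 5`, which localises the fault to the upper bound of `range`.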