You are viewing a free preview of this lesson.
Subscribe to unlock all 10 lessons in this course and every other course on LearningBro.
Every time data is moved or stored, it travels through an imperfect physical world. A bit racing along a copper wire can be flipped by electrical interference; a region of flash memory can degrade until a 1 reads back as a 0; a scratched optical disc can scatter a laser so a byte is misread. None of this is hypothetical — at the scale of billions of bits per second, some corruption is statistically certain. The engineering response is not to pretend errors never happen but to add redundancy: extra bits, computed from the data, that let the receiver notice — or even repair — corruption. This lesson develops that idea rigorously, from the single parity bit through majority voting, checksums and check digits (with a fully worked ISBN-13 and a modulo-11 example), and finally to the conceptual leap from detecting an error to correcting it.
This lesson covers the error-checking strand of the AQA A-Level Computer Science (7517) Fundamentals of data representation area:
These techniques recur whenever data crosses a boundary — between memory and CPU, between computers on a network, or between a human and a keyboard.
Data corruption has two broad causes:
A raw stream of data bits carries no way to tell a correct bit from a corrupted one — a flipped 1 looks exactly like a legitimate 0. The universal solution is redundancy: the sender computes some extra bits from the data using an agreed rule, and transmits them alongside. The receiver re-applies the rule and checks for consistency. If the data and its redundant bits disagree, an error is detected.
There is always a cost. Redundancy consumes bandwidth or storage that could have carried real data, and stronger schemes cost more. The art of error-control coding is choosing just enough redundancy for the expected error rate. The schemes below sit on a spectrum from cheap-but-weak (one parity bit) to expensive-but-correcting (majority voting and beyond).
A parity bit is a single extra bit appended to a group of data bits (commonly seven, making an eight-bit byte) so that the total number of 1s in the whole group satisfies an agreed rule:
Both ends of the link must agree in advance which scheme is in use.
Suppose the seven data bits are 1011010. Count the 1s in the data: there are four (an even number).
1011010 0.1011010 1.Now take data 1100001 — that has three 1s (odd):
| Scheme | Data 1s | Parity bit chosen | Transmitted byte | Total 1s |
|---|---|---|---|---|
| Even | 3 | 1 | 1100001 1 | 4 (even ✓) |
| Odd | 3 | 0 | 1100001 0 | 3 (odd ✓) |
The receiver counts the 1s in the received byte. Under even parity, a correct byte always has an even count; if the received count is odd, at least one bit flipped in transit and an error is flagged (typically the byte is rejected and retransmission requested).
Worked check: a sender using even parity transmits 1011010 0 (four 1s, even). Interference flips the third bit, so the receiver sees 1010010 0 — now three 1s, an odd count. The parity rule is violated, so the error is caught.
Parity is cheap (one bit) but weak. Its defining weakness:
A single parity bit detects an odd number of bit errors only. If an even number of bits flip (two, four, …), the parity is restored and the error passes undetected.
Worked example of a missed error: the sender transmits 1011010 0 (even parity, four 1s). Now two bits flip — say bits 1 and 3 — giving 0010010 0. Count the 1s: two. That is still even, so the receiver wrongly concludes the byte is fine. The two errors have cancelled out as far as parity is concerned.
Equally important:
Even when parity does detect an error, it cannot locate which bit is wrong, and therefore cannot correct it. It only signals that something is wrong.
This is why parity alone supports detection, not correction. Locating the faulty bit requires more redundancy — which is exactly what the next techniques provide.
Exam Tip: State the limitation precisely. "Parity detects errors" is too loose. The mark-scheme answer is "a single parity bit detects an odd number of bit-flips but cannot detect an even number, and cannot identify which bit is in error (so cannot correct it)."
Majority voting moves from detection to correction by brute-force redundancy: each data bit is transmitted an odd number of times (three is the classic case), and the receiver decides each bit by taking the value held by the majority of its copies.
To send the single bit 1, the sender transmits 111. Suppose noise flips the middle copy, so the receiver gets 101. Counting: two 1s versus one 0 — the majority is 1, so the receiver correctly recovers the intended bit and has corrected the error without retransmission.
Extending to a message, to send the three-bit value 101 with triple redundancy the sender transmits each bit three times: 111 000 111. If transmission corrupts it to 110 010 111, the receiver groups the copies and votes:
| Group received | Votes | Majority decision |
|---|---|---|
110 | two 1s, one 0 | 1 |
010 | two 0s, one 1 | 0 |
111 | three 1s | 1 |
Recovered value: 101 — correct, despite two separate single-bit errors (one in each of the first two groups).
Majority voting is genuinely error-correcting: with triple redundancy it corrects any single bit-flip per group. Its weakness is enormous overhead — tripling the data triples the bandwidth or storage. It also fails if two of the three copies in a group are corrupted, because then the majority is wrong: sending 111 but receiving 001 votes (wrongly) to 0. Triple-redundancy therefore corrects one error per group but can be defeated by two. Higher odd multiples (five, seven copies) tolerate more errors at even greater cost. Because of this expense, majority voting is reserved for situations where retransmission is impossible or catastrophic — deep-space probes, or radiation-hardened spacecraft electronics where a wrong bit could doom a mission.
Exam Tip: Majority voting is the simplest technique on this spec that corrects rather than merely detects, because the redundancy is enough to identify the intended value. Always pair the strength (correction without retransmission) with the weakness (high overhead; fails if the majority of copies are corrupted).
A checksum guards a block of data with a single computed summary value. The sender runs an agreed calculation over all the data in the block, producing the checksum, and sends it alongside the block. The receiver runs the same calculation over the data it received and compares its result with the transmitted checksum. If they differ, the block was corrupted in transit and retransmission is requested.
flowchart LR
A[Data block] --> B[Apply checksum<br/>algorithm at sender]
B --> C[Transmit data<br/>+ checksum]
C --> D[Receiver re-computes<br/>checksum over data]
D --> E{Recomputed =<br/>received?}
E -->|Yes| F[Accept block]
E -->|No| G[Error detected:<br/>request resend]
A simple (illustrative) checksum sums the byte values of the block and keeps the result modulo 256 (so the checksum fits in one byte). Suppose a block contains four bytes with denary values 200, 100, 90 and 75.
200+100+90+75=465 465mod256=465−256=209
The sender transmits the four data bytes plus the checksum byte 209. The receiver adds the four data bytes it received, takes the result mod 256, and checks it equals 209.
Now suppose the second byte arrives corrupted as 101 instead of 100. The receiver computes 200+101+90+75=466, and 466mod256=210=209 — the mismatch flags the error.
A checksum protects a whole block with little overhead (often a single byte or word), and detects most random corruption. But like parity it can be fooled: errors that cancel out leave the sum unchanged. For instance, if one byte is corrupted by +1 and another by −1, the total — and thus this simple checksum — is unchanged and the error slips through. Real protocols therefore use stronger position-sensitive algorithms (such as CRC, cyclic redundancy checks, which treat the data as a polynomial); these are far more robust because reordering or compensating errors still change the result. A checksum, fundamentally, is an error-detection mechanism: it tells you a block is wrong but not where, so it cannot correct the block.
A check digit is a special case of a checksum designed for human-entered, fixed-length codes — barcodes, bank-card numbers, ISBNs. One extra digit is appended, computed from the others by an agreed formula, so that common transcription errors are caught the moment the number is keyed in. The two transcription errors a good check-digit scheme targets are:
Modern books carry a 13-digit ISBN. The 13th digit is a check digit computed from the first twelve using alternating weights of 1 and 3:
Take the first twelve digits 978014103614. Apply alternating weights 1, 3, 1, 3, …:
| Position | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Digit | 9 | 7 | 8 | 0 | 1 | 4 | 1 | 0 | 3 | 6 | 1 | 4 |
| Weight | 1 | 3 | 1 | 3 | 1 | 3 | 1 | 3 | 1 | 3 | 1 | 3 |
| Product | 9 | 21 | 8 | 0 | 1 | 12 | 1 | 0 | 3 | 18 | 1 | 12 |
Sum the products:
S=9+21+8+0+1+12+1+0+3+18+1+12=86
Now find the check digit:
86mod10=6,check digit=(10−6)mod10=4
So the full ISBN-13 is 978014103614 4. Verification at the till: the scanner recomputes the weighted sum of all thirteen digits — 86+(4×1)=90 — and confirms 90mod10=0. A multiple of ten means the number is internally consistent.
The 1-3 weighting is what makes transpositions detectable. If two adjacent digits a and b are swapped, the weighted sum changes by ∣3a+b−(3b+a)∣=∣2(a−b)∣=2∣a−b∣. This is non-zero (and usually shifts the total off a multiple of ten) whenever a=b, so most adjacent swaps are caught. A plain (un-weighted) checksum would be blind to transpositions because a+b=b+a — reordering leaves the sum identical. The weighting deliberately breaks that symmetry. (The ISBN-13 scheme misses a transposition only in the special case where the two swapped digits differ by exactly 5, since 2×5=10 leaves the total a multiple of ten — a worthwhile nuance for a top-band answer.)
The older 10-digit ISBN used a modulo-11 scheme with weights running 10, 9, 8, …, 2 across the first nine digits. Because a mod-11 result can be 10, the check digit is allowed to be the symbol X (representing 10).
Take nine digits 030640615. Weight them 10 down to 2:
Subscribe to continue reading
Get full access to this lesson and all 10 lessons in this course.