To do useful computing we need negative numbers and fractional numbers, not just the non-negative integers of the previous lessons. This lesson covers how computers represent signed integers using two's complement — the universal standard — and how fixed-point binary extends place value to the right of a binary point to represent fractions.
This lesson addresses the H446 1.4.1 Data Types content on signed and fractional representation:
(This is a paraphrase of the specification content, not a verbatim quotation.)
Binary on its own has no minus sign, so a convention is needed to mark a value as negative. The challenge is that whatever convention we choose has to do two jobs at once: it must let us tell positive from negative, and it must let ordinary binary arithmetic still produce correct answers without expensive special cases. Three schemes have been used historically; A-Level expects you to know why the last one won, and the deciding factor in every case is how cleanly arithmetic falls out of the representation.
| Method | How a negative is formed | Drawback |
|---|---|---|
| Sign and magnitude | MSB is a sign flag (0 = +, 1 = −); the rest is the magnitude | Two zeros (+0 and −0); arithmetic needs special-casing |
| One's complement | Flip all bits of the positive value | Two zeros again; addition needs an "end-around carry" |
| Two's complement | Flip all bits and add 1 | Range is slightly asymmetric — a small price |
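Two's complement is the standard in essentially every modern processor because it has a single representation of zero and lets an ordinary binary adder handle every combination of signs with no special cases.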
To see why the other two schemes fell out of favour, it helps to make the "two zeros" problem concrete. In sign and magnitude with 8 bits, 00000000 means +0 and 10000000 means −0: two different bit patterns for the same value. That wastes a code, but worse, it forces the hardware to treat +0 and −0 as equal in comparisons — a special case that complicates every circuit that tests for zero. Sign and magnitude also breaks ordinary addition: adding +5 and −3 cannot be done by simply adding the bit patterns, because the sign bits would interfere; the hardware must first inspect the signs, decide whether to add or subtract the magnitudes, and work out the sign of the result. One's complement (flip all bits) removes some of this pain but still has +0 (00000000) and −0 (11111111), and it requires an awkward "end-around carry" — any carry off the top must be added back in at the bottom — to get additions right. Two's complement quietly fixes all of this: one zero, a plain binary adder, and no end-around carry. That combination of simplicity and correctness is why it became universal.
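The "two zeros" problem can be checked by brute force. The sketch below is illustrative only: the function names are hypothetical, and it simply decodes all 256 eight-bit patterns under each scheme and lists the ones that mean zero.

```python
def decode_sign_magnitude(pattern: int, bits: int = 8) -> int:
    """MSB is a sign flag; the remaining bits are the magnitude."""
    magnitude = pattern & ((1 << (bits - 1)) - 1)
    return -magnitude if pattern >> (bits - 1) else magnitude


def decode_ones_complement(pattern: int, bits: int = 8) -> int:
    """A set MSB means the value is minus the bitwise inverse."""
    mask = (1 << bits) - 1
    return -(~pattern & mask) if pattern >> (bits - 1) else pattern


def decode_twos_complement(pattern: int, bits: int = 8) -> int:
    """A set MSB means the value is the unsigned reading minus 2**bits."""
    return pattern - (1 << bits) if pattern >> (bits - 1) else pattern


def zero_patterns(decode, bits: int = 8) -> list[str]:
    """Every bit pattern that the given scheme decodes as zero."""
    return [format(p, f"0{bits}b") for p in range(1 << bits) if decode(p, bits) == 0]


print(zero_patterns(decode_sign_magnitude))   # two patterns
print(zero_patterns(decode_ones_complement))  # two patterns
print(zero_patterns(decode_twos_complement))  # one pattern
```

Only the two's complement decoder yields a single zero, which matches the table above.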
The elegant idea behind two's complement is a tiny change to place value: in an n-bit number, the most significant bit carries a negative weight, while every other bit keeps its usual positive weight. For 8 bits the place values are therefore:
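| -128 | 64 | 32 | 16 | 8 | 4 | 2 | 1 |
|---|---|---|---|---|---|---|---|
| $b_7$ | $b_6$ | $b_5$ | $b_4$ | $b_3$ | $b_2$ | $b_1$ | $b_0$ |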
So the value of an 8-bit two's complement number is:
$$N = -128b_7 + 64b_6 + 32b_5 + 16b_4 + 8b_3 + 4b_2 + 2b_1 + 1b_0$$
The MSB doubles as a sign bit: if $b_7 = 0$ the number is non-negative; if $b_7 = 1$ the −128 term dominates and the number is negative. Nothing else about reading the number changes: you still just add up the place values where there is a 1.
Why does giving the top bit a negative weight produce a consistent signed system rather than an arbitrary one? Think of an 8-bit counter that wraps round at 256. The patterns 0 through 127 sit where you expect. The pattern 10000000 would be 128 as unsigned, but it is also "127, then count one more and wrap", and in two's complement we choose to read it as −128 instead. Every pattern from 10000000 to 11111111 is reinterpreted as "256 less than its unsigned value": 255 → −1, 254 → −2, and so on. Subtracting 256 from the top half is exactly what assigning the MSB a weight of −128 (instead of +128) does: −128 + (the rest) = (unsigned value) − 256. This is why two's complement arithmetic is just ordinary binary addition with the carry discarded: the wrap-around at $2^n$ (256 for 8 bits) is the subtraction of 256 that turns large unsigned values into the negatives we want.
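The "256 less than its unsigned value" rule can be written down directly. The following is a minimal sketch with a hypothetical function name; it also compares the rule against Python's built-in `int.from_bytes(..., signed=True)`, which reads bytes as two's complement.

```python
def reinterpret_as_signed(unsigned_value: int, bits: int = 8) -> int:
    """Read an unsigned bit pattern as two's complement.

    Patterns in the top half (MSB set) are 2**bits less than their
    unsigned reading; patterns in the bottom half are unchanged.
    """
    if unsigned_value >= 1 << (bits - 1):
        return unsigned_value - (1 << bits)
    return unsigned_value


# Cross-check every 8-bit pattern against the built-in signed reading.
for pattern in range(256):
    built_in = int.from_bytes(bytes([pattern]), "big", signed=True)
    assert reinterpret_as_signed(pattern) == built_in

print(reinterpret_as_signed(255), reinterpret_as_signed(254), reinterpret_as_signed(128))
```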
For an n-bit two's complement number:
$$\text{minimum} = -2^{n-1}, \qquad \text{maximum} = 2^{n-1} - 1$$
| Bits | Minimum | Maximum |
|---|---|---|
| 8 | -128 | +127 |
| 16 | -32,768 | +32,767 |
| 32 | -2,147,483,648 | +2,147,483,647 |
The range is asymmetric, with one extra negative value, because zero is grouped with the positives. With $2^n$ patterns, exactly half ($2^{n-1}$) have the sign bit set and are negative, leaving $2^{n-1}$ for zero and the positives; that is why the largest positive is $2^{n-1} - 1$ but the smallest negative is $-2^{n-1}$.
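As a quick check of the range formula, the sketch below (the function name is an assumption made for illustration) computes the limits for any width and reproduces the rows of the table.

```python
def signed_range(bits: int) -> tuple[int, int]:
    """Return (minimum, maximum) for an n-bit two's complement integer."""
    half = 1 << (bits - 1)      # 2**(n-1): how many patterns have the MSB set
    return -half, half - 1      # zero takes one of the non-negative slots


for width in (8, 16, 32):
    low, high = signed_range(width)
    print(f"{width:>2} bits: {low:,} to {high:,}")
```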
To store a positive number, convert to binary exactly as for unsigned integers; the MSB will naturally be 0. For example, +45 in 8-bit two's complement is simply 00101101. The only caveat is that the value must not exceed the positive maximum: +45 is fine, but +200 cannot be stored in 8 bits as a signed value because it exceeds +127.
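To store a negative number, write the positive magnitude in binary, flip every bit, and add 1. For −73:

$$73 = 01001001_2 \;\xrightarrow{\text{flip}}\; 10110110 \;\xrightarrow{\text{add 1}}\; 10110111$$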
So −73=10110111. Verify using the signed place values:
| -128 | 64 | 32 | 16 | 8 | 4 | 2 | 1 |
|---|---|---|---|---|---|---|---|
| 1 | 0 | 1 | 1 | 0 | 1 | 1 | 1 |
−128+32+16+4+2+1=−128+55=−73 Correct.
This is the case students always find surprising. 1=00000001; flip to 11111110; add 1 to get 11111111. So −1 is all ones — which makes sense from the place values: −128+64+32+16+8+4+2+1=−1. Recognising that "all ones = −1" is a fast sanity check in the exam.
The extreme negative is also instructive. 128=10000000; flip to 01111111; add 1 to get 10000000. So −128=10000000. Notice that the bit pattern is the same as the magnitude here — a consequence of −128 being the one value with no positive counterpart in 8 bits. Trying to form +128 the same way would require a value beyond +127, confirming that +128 simply has no 8-bit signed representation. This is the asymmetry of the range made concrete: −128 exists, +128 does not.
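The flip-and-add-1 procedure can be sketched in a few lines. This is one possible implementation under stated assumptions: the function name is hypothetical, and raising `ValueError` for out-of-range values is a design choice made here, not something the lesson prescribes.

```python
def encode_twos_complement(value: int, bits: int = 8) -> str:
    """Encode a denary integer as an n-bit two's complement bit string."""
    low, high = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not low <= value <= high:
        raise ValueError(f"{value} does not fit in {bits}-bit two's complement")

    mask = (1 << bits) - 1
    if value >= 0:
        pattern = value                   # positives: plain binary, MSB is 0
    else:
        flipped = ~(-value) & mask        # flip every bit of the magnitude
        pattern = (flipped + 1) & mask    # then add 1, keeping only n bits
    return format(pattern, f"0{bits}b")


for number in (45, -73, -1, -128):
    print(number, encode_twos_complement(number))
```

Calling `encode_twos_complement(200)` or `encode_twos_complement(128)` raises `ValueError`, reflecting that neither value has an 8-bit signed representation.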
Method 1 (signed place values): if the MSB is 0 the number is positive; read it normally. If the MSB is 1, add the place values, including the negative −128.
| -128 | 64 | 32 | 16 | 8 | 4 | 2 | 1 |
|---|---|---|---|---|---|---|---|
| 1 | 1 | 1 | 0 | 0 | 0 | 1 | 0 |
−128+64+32+2=−30
Method 2 (flip and add 1): if the MSB is 1, flip all bits, add 1, read the magnitude, and remember it was negative:
$$11100010 \;\xrightarrow{\text{flip}}\; 00011101 \;\xrightarrow{+1}\; 00011110 = 30 \;\Rightarrow\; -30$$
Both methods agree. Method 1 is usually quicker; Method 2 is a good independent check and the one to use if the negative place value confuses you.
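Both decoding methods can be written side by side and checked against each other. The sketch below uses hypothetical function names and takes the bit pattern as a string so that the steps mirror the working on paper.

```python
def decode_by_place_values(bit_string: str) -> int:
    """Method 1: sum the place values, with a negative weight on the MSB."""
    width = len(bit_string)
    total = 0
    for position, bit in enumerate(bit_string):
        weight = 1 << (width - 1 - position)
        if position == 0:
            weight = -weight              # the MSB carries the negative weight
        total += weight * int(bit)
    return total


def decode_by_flip_and_add(bit_string: str) -> int:
    """Method 2: if the MSB is 1, flip, add 1, and negate the magnitude."""
    if bit_string[0] == "0":
        return int(bit_string, 2)         # non-negative: read it normally
    mask = (1 << len(bit_string)) - 1
    flipped = ~int(bit_string, 2) & mask
    return -(flipped + 1)


print(decode_by_place_values("11100010"), decode_by_flip_and_add("11100010"))
```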
The headline payoff is that the same addition algorithm works for any combination of signs — no special handling of the sign bit is required.
25=00011001; −10=11110110 (check: flip 00001010→11110101, add 1 →11110110).
| | Binary (8-bit) | Denary |
|---|---|---|
| A | 00011001 | 25 |
| B | 11110110 | -10 |
| A + B (carry-out discarded) | 00001111 | 15 |
A carry leaves the MSB and is discarded; the stored result 00001111=15. Correct — and notice we never inspected the signs, we just added and dropped the carry.
Adding two negatives should give a more-negative result. −40=11011000 (check: 40=00101000, flip →11010111, add 1 →11011000); −20=11101100.
| | Binary (8-bit) | Denary |
|---|---|---|
| A | 11011000 | -40 |
| B | 11101100 | -20 |
| A + B (carry-out discarded) | 11000100 | -60 |
A carry leaves the MSB and is discarded as usual. The result 11000100 has its sign bit set, so it is negative; reading the place values, −128+64+4=−60. Correct. This example matters because it shows the algorithm is genuinely sign-agnostic — adding two negatives "just works", provided the true answer stays inside the range. (Here −60 is comfortably above the minimum −128, so there is no overflow.)
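The two worked additions above can be reproduced with a plain unsigned add followed by masking to 8 bits. This is a minimal sketch with a hypothetical function name; note that it never looks at the sign of either operand.

```python
def add_8bit(a_bits: str, b_bits: str) -> tuple[str, int]:
    """Add two 8-bit patterns; return (stored 8-bit result, carry out of the MSB)."""
    total = int(a_bits, 2) + int(b_bits, 2)   # ordinary unsigned addition
    carry_out = total >> 8                    # the ninth bit, if any
    stored = total & 0xFF                     # discard the carry: keep 8 bits
    return format(stored, "08b"), carry_out


print(add_8bit("00011001", "11110110"))   # 25 + (-10)
print(add_8bit("11011000", "11101100"))   # (-40) + (-20)
```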
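Because the range is bounded, an addition can produce a result that no longer fits. In two's complement this shows up as an impossible change of sign: two positives added together give a negative-looking result, or two negatives give a positive-looking one.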
(A positive plus a negative can never overflow.) Equivalently — and this is the rule hardware actually uses — overflow has occurred if the carry into the MSB differs from the carry out of the MSB.
| | Binary (8-bit) | Denary |
|---|---|---|
| A | 01100100 | 100 |
| B | 00110010 | 50 |
| A + B | 10010110 | -106 (overflow; true answer 150) |
Two positives have produced a negative-looking result, so this is overflow: the true answer 150 exceeds the signed maximum of +127. The stored value 10010110 is silently wrong, which is exactly why a processor raises an overflow flag here.
It is worth dwelling on why the carry-out test fails for signed numbers. In this addition there is no carry out of the MSB at all, yet overflow has clearly occurred — so "carry-out means overflow" would wrongly report success. Conversely, in the earlier 25+(−10) example a carry did leave the MSB, yet there was no overflow. The two situations are opposite, which proves the carry-out alone tells you nothing about signed overflow. The correct signed test — carry-in to the MSB differs from carry-out of the MSB — works because overflow is precisely the case where the sign bit is forced to flip against the direction the magnitudes demand. Practically, the simplest reliable check in an exam is the sign-pattern rule: same-signed operands giving a result of the opposite sign means overflow.
Exam Tip: Detect two's complement overflow by the carry-in/carry-out test at the MSB, or by spotting the impossible sign change. Do not use "was there a carry-out of the MSB?" — that test is for unsigned overflow and gives the wrong verdict for signed arithmetic.
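The two overflow tests can be compared in code. The sketch below is illustrative rather than a model of any particular processor: the function names are hypothetical, and the operands are passed as unsigned 8-bit patterns.

```python
def add_with_flags(a: int, b: int, bits: int = 8) -> tuple[int, int, int, bool]:
    """Add two n-bit patterns.

    Returns (stored result, carry into MSB, carry out of MSB, signed overflow).
    """
    mask = (1 << bits) - 1
    lower = mask >> 1                                  # every bit below the MSB
    carry_in = ((a & lower) + (b & lower)) >> (bits - 1)
    total = a + b
    carry_out = total >> bits
    overflow = carry_in != carry_out                   # the carry-in/carry-out test
    return total & mask, carry_in, carry_out, overflow


def overflow_by_sign_rule(a: int, b: int, bits: int = 8) -> bool:
    """Same-signed operands giving a result of the opposite sign."""
    sign = 1 << (bits - 1)
    result = (a + b) & ((1 << bits) - 1)
    return (a & sign) == (b & sign) and (result & sign) != (a & sign)


print(add_with_flags(0b01100100, 0b00110010))   # 100 + 50
print(add_with_flags(0b00011001, 0b11110110))   # 25 + (-10)
```

For 100 + 50 there is a carry into the MSB but none out of it, so the overflow flag is set; for 25 + (−10) both carries are 1, so it is not. The carry-out on its own gives the wrong verdict in both cases.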
Fixed-point binary represents fractional numbers by extending place value to the right of a binary point. Bits to the left are whole-number place values; bits to the right are fractional place values that are negative powers of 2. The principle is exactly the same as the denary decimal point — where the columns after the point are tenths, hundredths, thousandths — except the columns here halve each time (halves, quarters, eighths, sixteenths) because the base is 2. Everything you already know about positional notation carries straight over; only the base changes.
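The place-value idea on both sides of the binary point can be sketched as follows. The function name and the example pattern `0110.1010` are assumptions made for illustration, and the sketch treats the pattern as unsigned.

```python
def read_fixed_point(pattern: str) -> float:
    """Sum the place values of an unsigned fixed-point pattern such as '0110.1010'."""
    whole, _, fraction = pattern.partition(".")
    value = 0.0
    for power, bit in enumerate(reversed(whole)):
        value += int(bit) * 2 ** power        # 1, 2, 4, 8, ... to the left
    for power, bit in enumerate(fraction, start=1):
        value += int(bit) * 2 ** -power       # 1/2, 1/4, 1/8, ... to the right
    return value


print(read_fixed_point("0110.1010"))   # 4 + 2 + 1/2 + 1/8
```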
For a format with 4 whole bits and 4 fractional bits (written "4.4"):