OCR A-Level Computer Science: Data Representation — Complete Revision Guide

Everything a computer stores or processes is ultimately a pattern of bits, and data representation is the topic that explains how those bits come to mean a number, a character, an image, a sound or a secret message. It is the most calculation-heavy area of OCR A-Level Computer Science (H446): once you can convert fluently between binary, denary and hexadecimal, add and subtract binary numbers, represent negatives with two's complement, normalise a floating-point value, and reason about how images and audio are sampled and compressed, a large bank of reliably scorable marks opens up. This module rewards practised technique over memorised facts.

In the H446 specification this material draws together two areas: data types and representation (module 1.4.1) and the exchanging and protecting of data, including compression, encryption and error checking. It is examined in Component 01: Computer Systems, where the questions are dominated by "show your working" conversions and arithmetic, alongside explanation items on how sampling parameters affect file size and quality, why a particular compression method suits a particular data type, and how encryption and error-detection schemes work. Accuracy and clear working are everything here — a correct method with one arithmetic slip still earns method marks, but only if the working is laid out.

This is Course 3 of 11 on the LearningBro OCR A-Level Computer Science learning path. The course, Data Representation, opens with number systems and binary arithmetic, develops two's complement, fixed-point and floating-point representations, then covers how characters, images and sound are encoded, before closing on compression, encryption and error detection. It builds on the hardware of Processors & Hardware and underpins the logic-circuit arithmetic of Boolean Algebra & Logic.

Guide Overview

The Data Representation course is built as ten lessons that move from number systems and binary arithmetic through signed and fractional representations, into the encoding of characters, images and sound, then close on compression, encryption and error detection.

Number Systems

The number systems lesson establishes the three bases H446 works in — denary (base 10), binary (base 2) and hexadecimal (base 16) — and the conversions between them. Binary is how the hardware stores data; hexadecimal is a compact human-readable shorthand in which each hex digit maps exactly onto four binary bits (a nibble), which is why memory addresses, colour codes and machine code are so often written in hex.

The conversions to drill are denary to binary (repeated division by two, or subtracting place values), binary to denary (summing the place values of the set bits), and the binary-to-hex grouping in nibbles that makes hex conversion almost instant. The key fact that pays off repeatedly is the place-value table — 128, 64, 32, 16, 8, 4, 2, 1 for an 8-bit number — and the relationship that n bits represent 2^n distinct values. This vocabulary of bases is the substrate for every later lesson in the module, and the nibble-to-hex mapping reappears directly in the colour codes of image representation.

Denary	Binary (4-bit)	Hex
0	0000	0
5	0101	5
10	1010	A
15	1111	F

Binary Arithmetic

The binary arithmetic lesson develops addition and shifting on binary numbers. Binary addition follows the same column method as denary, with the rules that 1 + 1 = 10 (write 0, carry 1) and 1 + 1 + 1 = 11 (write 1, carry 1). The examinable hazard is overflow: when the result of adding two numbers needs more bits than the register holds, the carry out of the most significant bit is lost and the stored answer is wrong — a condition the processor must detect and flag.

The lesson also covers binary shifts. A logical shift left by one place multiplies an unsigned value by two (a zero enters at the right); a logical shift right by one place divides by two, discarding the bit that falls off the right. An arithmetic right shift instead copies the sign bit back into the most significant position, so a negative two's complement value stays negative. Being able to state the multiply/divide effect of a shift, and to recognise the rounding that a right shift causes when a bit is lost, is a frequent short-answer requirement. These operations connect forward to the hardware that performs them: the adders built in Boolean Algebra & Logic.

Two's Complement and Fixed Point

The two's complement and fixed point lesson develops how negative numbers and fractions are stored. Two's complement is the standard scheme for signed integers: the most significant bit carries a negative place value, so an 8-bit two's complement number represents values from -128 to +127. To negate a number you invert every bit and add one; to subtract, you add the two's complement of the number being subtracted. The advantages examiners expect are that there is a single representation of zero and that addition and subtraction use the same circuitry — which is exactly why the hardware adders of the Boolean module work for signed values without modification.

Fixed-point binary extends binary to fractions by placing an implied binary point at a fixed position, with bits to its right carrying place values of 1/2, 1/4, 1/8 and so on. Fixed point is simple and gives consistent precision across its range, but its range and precision are both limited by the fixed split between the integer and fractional parts — the limitation that motivates the floating-point representation covered next. Practise converting denary fractions to fixed-point binary and back, and negating values in two's complement, until both are automatic.
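To see the fixed split in action, here is a minimal Python sketch of unsigned fixed-point conversion. The 4-bit integer part and 4-bit fractional part, and the function names to_fixed_point and from_fixed_point, are assumptions chosen for illustration rather than a format OCR prescribes; any precision beyond the fractional bits is simply truncated.

```python
def to_fixed_point(value, int_bits=4, frac_bits=4):
    """Return an unsigned fixed-point pattern such as '0110.1000'.

    Assumed format: int_bits before the binary point, frac_bits after it.
    Precision beyond frac_bits is truncated, not rounded.
    """
    scaled = int(value * 2 ** frac_bits)       # move the point to the far right
    if not 0 <= scaled < 2 ** (int_bits + frac_bits):
        raise ValueError("value is outside the range of this format")
    bits = format(scaled, f"0{int_bits + frac_bits}b")
    return bits[:int_bits] + "." + bits[int_bits:]


def from_fixed_point(pattern):
    """Convert a pattern such as '0110.1000' back to denary."""
    int_part, frac_part = pattern.split(".")
    return int(int_part + frac_part, 2) / 2 ** len(frac_part)


print(to_fixed_point(6.5))            # 0110.1000
print(to_fixed_point(0.1))            # 0000.0001 (0.1 has no exact binary form)
print(from_fixed_point("0000.0001"))  # 0.0625, not 0.1
```

The last two lines show the limited precision directly: with only four fractional bits, 0.1 is stored as 1/16.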

Floating Point

The floating point lesson develops the representation that trades some precision for a far wider range, mirroring scientific notation in binary. A floating-point number is stored as a mantissa (the significant digits, a signed fraction in two's complement) and an exponent (a signed power of two, also in two's complement) that says where the binary point sits. The value is the mantissa multiplied by two raised to the exponent.

The central skill is normalisation: adjusting the mantissa and exponent so the mantissa's most significant bits are in standard form (for a positive number, 0.1...; for a negative number in two's complement, 1.0...), which maximises the precision available in the fixed number of mantissa bits. The examinable trade-off, which carries marks in extended answers, applies to a fixed total word length: allocating more bits to the mantissa increases precision but reduces the range, while allocating more bits to the exponent increases the range but reduces precision. Be ready to convert a denary value to a normalised floating-point pattern, convert a pattern back to denary, and explain the precision/range trade-off in context.

Character Encoding

The character encoding lesson develops how text is represented as numbers through agreed character sets. ASCII in its common form uses 7 bits to encode 128 characters — the upper- and lower-case letters, digits, punctuation and control codes — which is sufficient for English text but cannot cover the world's writing systems. Unicode was introduced to give every character in every script a unique code point, using variable-width encodings such as UTF-8 (which is backward-compatible with ASCII for the first 128 code points) to represent a vastly larger character set.

Two facts recur in questions. First, the ordering of codes is deliberate: digits are contiguous and the alphabet is contiguous, so arithmetic on character codes works predictably (and a fixed gap separates corresponding upper- and lower-case letters). Second, the trade-off between ASCII and Unicode is range versus storage — Unicode represents far more characters but a character may take more than one byte. The recurring exam pattern is to compute the storage for a string given the bits per character, or to explain why Unicode was needed.
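Both facts can be checked in a few lines of Python. This is a small sketch using the built-in ord, chr and str.encode; the helper names storage_bits and utf8_bytes are made up for this illustration.

```python
def storage_bits(text, bits_per_char):
    """Storage for a string when every character uses the same number of bits."""
    return len(text) * bits_per_char


def utf8_bytes(char):
    """Number of bytes UTF-8 uses for one character."""
    return len(char.encode("utf-8"))


# Contiguous codes make character arithmetic predictable.
print(ord("A"), ord("a"), ord("a") - ord("A"))   # 65 97 32
print(chr(ord("A") + 3))                         # D
print(ord("7") - ord("0"))                       # 7

print(storage_bits("HELLO", 7))                  # 35 bits in 7-bit ASCII
print(utf8_bytes("A"), utf8_bytes("\u00e9"), utf8_bytes("\u20ac"))  # 1 2 3
```

The final line is the range-versus-storage trade-off in miniature: the ASCII letter still takes one byte in UTF-8, while the accented letter and the euro sign take two and three.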

Image Representation

The image representation lesson develops how bitmap images are stored as a grid of pixels, each pixel a binary colour value. Three properties are examined. Resolution is the number of pixels (often given as width by height); colour depth (or bit depth) is the number of bits used per pixel, which fixes the number of distinct colours as 2 raised to the colour depth; and metadata is the additional stored information (dimensions, colour depth, format) needed to reconstruct the image correctly.

The standard calculation is image file size: number of pixels multiplied by colour depth in bits, converted to bytes. Increasing resolution or colour depth improves image quality but increases file size proportionally — the trade-off that motivates the compression covered later. The lesson links the four-bits-per-hex-digit fact from the number systems lesson to hexadecimal colour codes, where a 24-bit colour is written as six hex digits (two each for red, green and blue). Practise the file-size calculation and the colour-depth-to-colour-count relationship until both are automatic.

Sound Representation

The sound representation lesson develops how a continuous analogue sound wave is captured as discrete binary samples. Sampling measures the amplitude of the wave at regular intervals; the sample rate (samples per second, in hertz) sets how often this happens, and the sample resolution (bit depth, bits per sample) sets how precisely each amplitude is recorded. A higher sample rate captures higher-frequency detail and a higher sample resolution captures amplitude more accurately, both improving fidelity at the cost of a larger file.

The calculation to master is sound file size: sample rate multiplied by sample resolution multiplied by duration (and by the number of channels for stereo). The conceptual point examiners reward is the link between sampling parameters and the faithfulness of the reconstruction — too low a sample rate loses high-frequency content, too low a resolution introduces quantisation error. As with images, the quality-versus-size trade-off here is the direct motivation for the compression techniques in the next lesson.

Data Compression

The data compression lesson develops the two families of technique for reducing file size, and crucially when each is appropriate. Lossless compression reduces file size without discarding any information, so the original is perfectly reconstructable; H446 examines two methods. Run-length encoding (RLE) replaces runs of repeated values with a single value and a count, which works well on data with long runs (such as simple graphics) but can enlarge data with little repetition. Dictionary coding replaces recurring patterns with shorter codes held in a dictionary, suiting text with repeated words or phrases.

Lossy compression achieves much greater size reductions by permanently discarding information judged least perceptible — appropriate for photographs, audio and video where a perfect reconstruction is unnecessary, but unacceptable for text, program code or any data where every bit matters. The examinable skill is selection and justification: lossless for text and executables where fidelity is essential; lossy for media where the size saving outweighs imperceptible quality loss; RLE for runs, dictionary coding for repeated patterns. Be ready to work a small RLE example by hand and to argue which method fits a given file type.

Method	Lossless?	Best for	Weakness
Run-length encoding	Yes	Long runs of repeats (simple images)	Enlarges low-repetition data
Dictionary coding	Yes	Text with repeated patterns	Dictionary overhead
Lossy	No	Photos, audio, video	Irreversible quality loss
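Dictionary coding is easier to justify in an answer once you have seen it work. The sketch below is one simple word-level variant, assuming words are separated by single spaces; the function names are hypothetical and real dictionary coders build their dictionaries differently.

```python
def dictionary_encode(text):
    """Replace each word with its index in a dictionary of distinct words."""
    dictionary = []
    codes = []
    for word in text.split(" "):
        if word not in dictionary:
            dictionary.append(word)          # first sighting: add to dictionary
        codes.append(dictionary.index(word))
    return dictionary, codes


def dictionary_decode(dictionary, codes):
    """Rebuild the original text exactly, which is what makes this lossless."""
    return " ".join(dictionary[code] for code in codes)


words, codes = dictionary_encode("the cat sat on the mat the end")
print(words)   # ['the', 'cat', 'sat', 'on', 'mat', 'end']
print(codes)   # [0, 1, 2, 3, 0, 4, 0, 5]
print(dictionary_decode(words, codes))   # the cat sat on the mat the end
```

The dictionary itself has to be stored alongside the codes, which is the overhead listed in the table: the saving only appears when patterns repeat often enough to pay for it.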

Encryption

The encryption lesson develops how data is protected from unauthorised reading, distinguishing the two cryptographic models. Symmetric encryption uses a single shared key for both encryption and decryption; it is fast and efficient but requires the key to be exchanged securely, which is itself the hard problem. Asymmetric (public-key) encryption uses a mathematically linked key pair — a public key that anyone may use to encrypt, and a private key kept secret that alone can decrypt — which solves the key-distribution problem because the public key can be shared openly.

The lesson also introduces the headline application: secure communication over a network, the foundation for protocols revisited in Networks. It distinguishes encryption from related ideas it is easy to confuse — encryption scrambles data so only an authorised party can read it, whereas hashing produces a fixed-length fingerprint that cannot be reversed. The examinable points are the symmetric/asymmetric distinction, why asymmetric encryption solves key exchange, and the appropriate use of each.
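The encryption-versus-hashing distinction can be made concrete with a short sketch. The XOR cipher here is a toy stand-in for symmetric encryption, chosen only because it is tiny; it is not secure and is not a method named by the specification. The hash uses SHA-256 from Python's standard hashlib module. Asymmetric encryption is not shown.

```python
import hashlib


def xor_cipher(data, key):
    """Toy symmetric cipher: XOR each byte with the repeating key.

    The same function and the same key both encrypt and decrypt.
    For illustration only: this is NOT secure.
    """
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def fingerprint(data):
    """One-way, fixed-length hash of the data (SHA-256, 64 hex digits)."""
    return hashlib.sha256(data).hexdigest()


secret = xor_cipher(b"MEET AT NOON", b"key")
print(secret != b"MEET AT NOON")     # True: the ciphertext is scrambled
print(xor_cipher(secret, b"key"))    # b'MEET AT NOON'
print(len(fingerprint(b"a")), len(fingerprint(b"a much longer message")))  # 64 64
```

Encryption is reversed by applying the shared key again, whereas the fingerprint has the same length whatever the input and offers no function to undo it.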

Error Detection

The error detection lesson develops the techniques that catch corruption introduced when data is transmitted or stored. A parity bit adds a single bit set to make the total number of 1s even (even parity) or odd (odd parity); a single-bit error flips the count and is detected, though two simultaneous errors cancel and pass undetected. A checksum computes a value from the data block, sends it alongside, and recomputes it on receipt; a mismatch signals corruption. A check digit (such as those in barcodes and ISBNs) is a digit calculated from the others to validate manually entered data.

The lesson also covers majority voting and the use of parity blocks to locate as well as detect an error, and it connects error checking to the wider data-exchange theme alongside the encryption lesson. The examinable distinction is between schemes that merely detect an error and those that can also locate or correct one, and the limitations of each — single parity detecting only odd numbers of bit errors being the classic example. Being able to compute a parity bit or a simple check digit and explain what the scheme will and will not catch is the core of the marks.
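The schemes above are short enough to sketch in Python. The function names are hypothetical, the majority vote assumes each bit is sent exactly three times, and the check digit uses the ISBN-13 rule (weights alternating 1 and 3) as one example of a check-digit scheme.

```python
def even_parity_bit(bits):
    """Parity bit that makes the total number of 1s even."""
    return str(bits.count("1") % 2)


def even_parity_ok(bits_with_parity):
    """True when the received pattern still has an even number of 1s."""
    return bits_with_parity.count("1") % 2 == 0


def majority_vote(received):
    """Each bit was sent three times; keep whichever value appears most."""
    triples = [received[i:i + 3] for i in range(0, len(received), 3)]
    return "".join("1" if t.count("1") >= 2 else "0" for t in triples)


def isbn13_check_digit(first12):
    """Check digit for an ISBN-13: weights alternate 1, 3 across 12 digits."""
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(first12))
    return (10 - total % 10) % 10


sent = "1011001" + even_parity_bit("1011001")
print(sent)                                 # 10110010
print(even_parity_ok("10110011"))           # False: one flipped bit is detected
print(even_parity_ok("10110111"))           # True: two flipped bits go unnoticed
print(majority_vote("111000101"))           # 101
print(isbn13_check_digit("978030640615"))   # 7
```

Parity only reports that something is wrong, while the majority vote quietly repairs the single corrupted copy in the last triple: the detect-versus-correct distinction in two lines of output.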

Worked Example: Converting Between Bases

Base conversion is the bread-and-butter of this module, so here are the three directions you must execute fluently, each with the working laid out as the examiner expects to see it.

Denary to binary (8-bit), converting 154. Work down the place-value table $128, 64, 32, 16, 8, 4, 2, 1$, subtracting each place value that fits and writing a 1, otherwise writing a 0:

154 - 128 = 26   → 1 (128)
 26 - 64          → 0 (64 too big)
 26 - 32          → 0 (32 too big)
 26 - 16 = 10    → 1 (16)
 10 - 8  = 2     → 1 (8)
  2 - 4           → 0 (4 too big)
  2 - 2  = 0     → 1 (2)
  0 - 1           → 0 (1)
154 = 1001 1010

Binary to denary, converting 1001 1010. Sum the place values of the set bits: $128 + 16 + 8 + 2 = 154$. This is the inverse of the process above and a quick self-check.

Binary to hexadecimal, converting 1001 1010. Split into nibbles and convert each: $1001_2 = 9$ and $1010_2 = \text{A}$, so $1001\,1010_2 = 9\text{A}_{16}$. Because each hex digit maps onto exactly four bits, this grouping makes hex conversion almost instantaneous, which is precisely why memory addresses and colour codes are written in hex. Going the other way, each hex digit expands to its four-bit nibble, so $\text{2F}_{16} = 0010\,1111_2 = 47_{10}$.

The habit that prevents errors is to always write the place-value header ($128, 64, 32, 16, 8, 4, 2, 1$) above your bits before you start. Most conversion mistakes are place-value slips, and a written header eliminates them.
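The three conversions can also be checked mechanically. The sketch below mirrors the hand methods (place-value subtraction, summing set bits, nibble grouping) rather than using Python's built-in bin and hex; the function names are assumptions for this example and the code handles 8-bit unsigned values only.

```python
PLACE_VALUES = [128, 64, 32, 16, 8, 4, 2, 1]   # the 8-bit place-value header
HEX_DIGITS = "0123456789ABCDEF"


def denary_to_binary8(n):
    """Subtract each place value that fits, writing 1, otherwise 0."""
    if not 0 <= n <= 255:
        raise ValueError("value does not fit in 8 unsigned bits")
    bits = ""
    for place in PLACE_VALUES:
        if n >= place:
            bits += "1"
            n -= place
        else:
            bits += "0"
    return bits


def binary8_to_denary(bits):
    """Sum the place values of the set bits."""
    return sum(place for place, bit in zip(PLACE_VALUES, bits) if bit == "1")


def binary8_to_hex(bits):
    """Split into two nibbles and convert each to one hex digit."""
    high, low = bits[:4], bits[4:]
    return HEX_DIGITS[int(high, 2)] + HEX_DIGITS[int(low, 2)]


print(denary_to_binary8(154))                  # 10011010
print(binary8_to_denary("10011010"))           # 154
print(binary8_to_hex("10011010"))              # 9A
print(binary8_to_hex(denary_to_binary8(47)))   # 2F
```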

Worked Example: Binary Addition and Overflow

Add the 8-bit unsigned numbers $0110\,1101$ (109) and $0101\,0011$ (83). Work right to left, carrying where a column sums to 2 or 3:

    0110 1101      (109)
  + 0101 0011      ( 83)
  -----------
    1100 0000      (192)

Checking, $109 + 83 = 192$, and $1100\,0000_2 = 128 + 64 = 192$, so the result is correct and fits in 8 bits. Now consider adding $1000\,0000$ (128) and $1000\,0000$ (128): the true sum is 256, which needs nine bits, but the register holds only eight. The carry out of the most significant bit is lost, the stored result is $0000\,0000$, and the processor must raise an overflow flag to signal the answer is invalid. Being able to identify when overflow occurs (the carry leaves the most significant column and there is nowhere to put it) and to explain the consequence is a dependable short-answer mark.

Binary shifts are the other examinable arithmetic operation. A logical shift left by one place inserts a 0 at the right and multiplies an unsigned value by two: $0000\,0110$ (6) becomes $0000\,1100$ (12). A logical shift right by one place divides by two, discarding the bit that falls off the right: $0000\,0110$ (6) becomes $0000\,0011$ (3). The rounding hazard is that a right shift of an odd number loses the units bit — $0000\,0111$ (7) shifted right becomes $0000\,0011$ (3), effectively rounding down. For signed values, an arithmetic right shift preserves the sign bit rather than inserting a 0, keeping negative numbers negative.
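The addition and shift examples above can be reproduced with Python's bitwise operators. This is a minimal sketch of an 8-bit register, assuming the function names shown; masking with 0xFF is what models the fixed register width.

```python
MASK = 0xFF  # an 8-bit register keeps only the low eight bits


def add_unsigned8(a, b):
    """Add two 8-bit unsigned values; return (stored result, carry-out flag)."""
    total = a + b
    return total & MASK, total > MASK


def logical_shift_left(value):
    """A 0 enters at the right; the bit leaving the left is lost."""
    return (value << 1) & MASK


def logical_shift_right(value):
    """A 0 enters at the left; the bit leaving the right is lost."""
    return value >> 1


def arithmetic_shift_right(value):
    """As a logical shift right, but the sign bit (bit 7) is copied back in."""
    return (value >> 1) | (value & 0x80)


print(add_unsigned8(0b01101101, 0b01010011))   # (192, False)
print(add_unsigned8(0b10000000, 0b10000000))   # (0, True)
print(logical_shift_left(6), logical_shift_right(6), logical_shift_right(7))  # 12 3 3
print(format(arithmetic_shift_right(0b11011000), "08b"))  # 11101100
```

The last line shifts the pattern for -40 and gets the pattern for -20, showing why the sign bit has to be preserved for signed values.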

Worked Example: Two's Complement Negation and Subtraction

To find the 8-bit two's complement representation of $-40$, start from $+40 = 0010\,1000$, invert every bit to get $1101\,0111$, then add one:

  0010 1000      (+40)
  1101 0111      (invert every bit)
+        1       (add one)
  ---------
  1101 1000      (-40)

Check by reading the place values with the most significant bit negative: $-128 + 64 + 16 + 8 = -40$. Subtraction then becomes addition of the two's complement, which is exactly why the same adder hardware handles both. To compute $50 - 40$, add $+50$ to $-40$:

  0011 0010      (+50)
+ 1101 1000      (-40)
  ---------
1 0000 1010      (result 0000 1010 = +10; the carry out of bit 7 is discarded)

The stored eight bits are $0000\,1010 = +10$, which is correct, and the carry out of the sign column is simply discarded (it is not an overflow here because the operands had opposite signs). The two advantages examiners reward are that two's complement gives a single representation of zero (unlike sign-and-magnitude, which has $+0$ and $-0$) and that addition and subtraction share one circuit. A useful sanity check: if the sign bit of a result of adding two same-signed numbers differs from the operands' sign bit, genuine overflow has occurred.
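Negation, subtraction and the overflow sanity check can all be expressed on 8-bit patterns. The following is a sketch under the assumption that values are held as integers 0 to 255 standing for the stored bit pattern; the function names are illustrative.

```python
MASK = 0xFF       # keep eight bits
SIGN_BIT = 0x80   # bit 7, place value -128


def negate8(pattern):
    """Invert every bit, add one, keep eight bits."""
    return ((pattern ^ MASK) + 1) & MASK


def to_signed8(pattern):
    """Read an 8-bit pattern with the most significant bit worth -128."""
    return pattern - 256 if pattern & SIGN_BIT else pattern


def subtract8(a, b):
    """Compute a - b by adding the two's complement of b; the carry is dropped."""
    return (a + negate8(b)) & MASK


def signed_overflow(a, b, result):
    """Overflow: operands share a sign bit but the result's sign bit differs."""
    return (a & SIGN_BIT) == (b & SIGN_BIT) and (result & SIGN_BIT) != (a & SIGN_BIT)


minus_40 = negate8(0b00101000)
print(format(minus_40, "08b"), to_signed8(minus_40))   # 11011000 -40
print(subtract8(50, 40))                               # 10
print(signed_overflow(50, minus_40, 10))               # False: opposite signs
print(signed_overflow(100, 100, (100 + 100) & MASK))   # True: 200 exceeds +127
```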

Worked Example: Floating-Point Normalisation

Floating point mirrors scientific notation in binary: a value is a signed mantissa multiplied by two raised to a signed exponent, both stored in two's complement. Normalisation adjusts the pair so the mantissa's most significant bits are in standard form (for a positive number, $0.1\ldots$; for a negative number in two's complement, $1.0\ldots$), which packs the maximum precision into a fixed number of mantissa bits.

Suppose a format uses an 8-bit mantissa with the binary point after the first bit and a 4-bit exponent, and you want to represent the value $6.5$ (denary). In binary, $6.5 = 110.1$. To normalise, shift the point so the mantissa reads $0.1101$ and record how far you shifted: moving the point three places left multiplies the mantissa by $2^{-3}$, so to keep the value the same the exponent must be $+3$. The stored form is therefore a mantissa of $0.110\,1000$ (padded to width) with an exponent of $0011$ ($+3$), representing $0.1101_2 \times 2^{3} = 110.1_2 = 6.5$. Reversing the process, taking a stored pattern back to denary, means reading the mantissa as a fraction and shifting the point by the exponent.
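A short decoder makes the reverse process checkable. This sketch assumes the format used above (two's complement mantissa with the binary point after its first bit, two's complement exponent) and takes both fields as bit strings; the function names are hypothetical.

```python
def twos_complement(bits):
    """Signed integer value of a two's complement bit string."""
    value = int(bits, 2)
    if bits[0] == "1":
        value -= 2 ** len(bits)   # the leading bit carries a negative weight
    return value


def decode_float(mantissa, exponent):
    """Value = mantissa (as a fraction) multiplied by 2 to the exponent."""
    fraction = twos_complement(mantissa) / 2 ** (len(mantissa) - 1)
    return fraction * 2 ** twos_complement(exponent)


def is_normalised(mantissa):
    """Normalised mantissas start 01 (positive) or 10 (negative)."""
    return mantissa[0] != mantissa[1]


print(decode_float("01101000", "0011"))   # 6.5
print(is_normalised("01101000"))          # True
print(decode_float("00110100", "0100"))   # 6.5 again, but...
print(is_normalised("00110100"))          # False: a leading 0.0 wastes a bit
print(decode_float("10011000", "0011"))   # -6.5
```

The third and fourth lines show an un-normalised pattern for the same value: it decodes correctly but spends a mantissa bit on a redundant leading zero.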

The extended-answer trade-off, worth learning as a sentence you can deploy, is that for a fixed total word length, giving more bits to the mantissa increases precision but shrinks the range, while giving more bits to the exponent widens the range but reduces precision. More mantissa bits mean more significant figures; more exponent bits mean the point can move further, reaching larger and smaller magnitudes. A common exam task is to explain which allocation you would choose for a given application — scientific data spanning many orders of magnitude wants exponent bits; financial values needing many significant figures want mantissa bits.

Worked Example: Image and Sound File-Size Calculations

The file-size calculations are near-guaranteed marks if you keep the units straight. For a bitmap image, the raw size in bits is $\text{width} \times \text{height} \times \text{colour depth}$. Take a $100 \times 100$-pixel image at a colour depth of 24 bits per pixel:

pixels     = 100 * 100          # 10,000 pixels
bits       = pixels * 24        # 240,000 bits
bytes      = bits / 8           # 30,000 bytes
kibibytes  = bytes / 1024       # ~29.3 KiB

Note the colour-depth relationship that partners this: a colour depth of $n$ bits gives $2^{n}$ distinct colours, so 24-bit "true colour" offers $2^{24}$ (about 16.7 million) colours, written as six hexadecimal digits — two each for red, green and blue, exactly the nibble-to-hex mapping from the number-systems lesson.
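The two-hex-digits-per-channel layout is easy to verify in code. In this sketch the function names and the sample colour FF8000 are made up for illustration; the code assumes a six-digit code with an optional leading #.

```python
def colour_count(colour_depth):
    """Number of distinct colours available at a given colour depth in bits."""
    return 2 ** colour_depth


def split_colour(code):
    """Split a six-digit hex colour into (red, green, blue), each 0 to 255."""
    code = code.lstrip("#")
    return tuple(int(code[i:i + 2], 16) for i in (0, 2, 4))


print(colour_count(24))            # 16777216
print(colour_count(8))             # 256
print(split_colour("#FF8000"))     # (255, 128, 0)
```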

For sound, the raw size in bits is $\text{sample rate} \times \text{sample resolution} \times \text{duration} \times \text{channels}$. Take a 10-second stereo clip sampled at 44,100 Hz with 16 bits per sample:

samples = 44100 * 10           # 441,000 samples per channel
bits    = samples * 16 * 2     # 14,112,000 bits (× 2 for stereo)
bytes   = bits / 8             # 1,764,000 bytes
mebibytes = bytes / (1024*1024)  # ~1.68 MiB

The conceptual point examiners attach to these calculations is the quality-versus-size trade-off: doubling the sample rate, the sample resolution or the colour depth doubles the raw file size, buying higher fidelity (higher-frequency capture, finer amplitude steps, more colours) at proportional storage cost, which is exactly the pressure that motivates compression.

Worked Example: Run-Length Encoding by Hand

Run-length encoding is the lossless method you are most likely to have to execute rather than merely describe. Encode the pixel run WWWWWWBBBWWWWWWWWWB (a simple monochrome line). Replace each run of identical values with a (value, count) pair:

Original:  W W W W W W  B B B  W W W W W W W W W  B
RLE:       6W  3B  9W  1B

Nineteen symbols become four (value, count) pairs — a clear saving. The examinable caveat is that RLE only helps when data contains long runs: on data with little repetition, such as a photograph or already-compressed text, each pair may describe a run of length one, and the encoded output can be larger than the original. That is why RLE suits simple graphics and icons, while dictionary coding — replacing recurring patterns (whole words or phrases) with short codes held in a dictionary — suits text. Both are lossless: the original is perfectly reconstructable. Lossy compression, by contrast, permanently discards the least perceptible information for far larger savings, which is acceptable for photos, audio and video but never for text, program code or any data where every bit matters.
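The hand method translates directly into a few lines of Python. This is a minimal sketch using itertools.groupby from the standard library to find the runs; storing each pair as (count, value) is one possible convention, and the function names are assumptions for this example.

```python
from itertools import groupby


def rle_encode(data):
    """Replace each run of identical symbols with a (count, value) pair."""
    return [(len(list(run)), value) for value, run in groupby(data)]


def rle_decode(pairs):
    """Expand every (count, value) pair back into its run."""
    return "".join(value * count for count, value in pairs)


line = "WWWWWWBBBWWWWWWWWWB"
encoded = rle_encode(line)
print(encoded)                       # [(6, 'W'), (3, 'B'), (9, 'W'), (1, 'B')]
print(rle_decode(encoded) == line)   # True: lossless
print(rle_encode("WBWB"))            # [(1, 'W'), (1, 'B'), (1, 'W'), (1, 'B')]
```

The final line is the caveat in action: four symbols with no repetition become four pairs, so the encoded form is larger than the original.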

Common Mistakes and How to Avoid Them

Dropping place values in conversions. Always write the header $128, 64, 32, 16, 8, 4, 2, 1$ first; most binary-to-denary slips are a mis-aligned column.
Forgetting the "+1" in two's complement. Negation is invert and add one — inverting alone gives ones' complement, which is not what OCR examines. Miss the increment and every negative value is off by one.
Confusing overflow with a discarded carry. When you add a positive and a negative two's complement number, the carry out of the sign bit is discarded and the answer is still correct. Genuine overflow only occurs when two same-signed numbers produce a result whose sign bit is wrong.
Bits-versus-bytes unit errors. File-size questions often give bits per pixel or per sample but ask for the answer in bytes, KiB or MiB. Divide by 8 for bytes, then by 1024 per further step, and state the unit.
Under-normalising a floating-point mantissa. The mantissa must start $0.1\ldots$ (positive) or $1.0\ldots$ (negative); leaving a leading $0.01\ldots$ wastes precision and loses the normalisation marks.
Claiming RLE always compresses. It can enlarge low-repetition data. The mark is in recognising when it helps.
Muddling encryption and hashing. Encryption is reversible with the right key; hashing produces a one-way fixed-length fingerprint that cannot be reversed. Do not describe hashing as "encryption".

Exam Technique for Data Representation

This is the most calculation-heavy module in H446, and its marks reward laid-out working above all. For every conversion or arithmetic item, show each step (the place-value subtraction, the nibble split, the invert-and-add-one), because a correct method with a single slip still earns method marks, whereas a bare wrong answer earns nothing. Read the units in the question and the units demanded in the answer before you start a file-size calculation; a perfectly correct bit count stated in the wrong unit loses the final mark.

Match your answer to the command word. Convert and calculate want the working shown. Explain — why Unicode was needed, why asymmetric encryption solves key exchange, why a right shift can lose precision — wants reasoning with consequences, not a one-word label. Compare or justify items (lossless versus lossy, symmetric versus asymmetric, detect-only versus correct) want a two-sided answer anchored on the discriminator and the "use it when". For those conceptual questions, a one-line discriminator plus a one-line use-case is enough to score reliably, so learn each trade-off as that compact pair rather than as loose prose.

Mini-FAQ

How many bits do I need to represent a given number of values? $n$ bits represent $2^{n}$ distinct values, so to represent $V$ values you need the smallest $n$ with $2^{n} \geq V$. For 200 colours you need 8 bits ($2^{8} = 256 \geq 200$; 7 bits gives only 128). This single relationship underlies colour depth, character-set size and address-bus width, so learn it cold.

What is the range of an 8-bit two's complement number? $-128$ to $+127$. The most significant bit carries the negative place value $-128$; the remaining seven bits add up to at most $+127$. In general an $n$-bit two's complement number ranges from $-2^{n-1}$ to $+2^{n-1}-1$, and there is exactly one representation of zero.
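Both relationships in the two answers above reduce to one-line calculations. The sketch below uses hypothetical function names and searches for the smallest n directly, which keeps it close to the definition.

```python
def bits_needed(values):
    """Smallest n such that 2**n is at least the number of values."""
    n = 0
    while 2 ** n < values:
        n += 1
    return n


def twos_complement_range(n):
    """(lowest, highest) value of an n-bit two's complement number."""
    return -(2 ** (n - 1)), 2 ** (n - 1) - 1


print(bits_needed(200))            # 8
print(bits_needed(128))            # 7
print(bits_needed(129))            # 8
print(twos_complement_range(8))    # (-128, 127)
print(twos_complement_range(4))    # (-8, 7)
```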

Why does hexadecimal appear everywhere in computing? Because one hex digit maps onto exactly four binary bits, hex is a compact, human-readable shorthand for long binary strings — far less error-prone to read and transcribe than raw binary. Memory addresses, machine code and 24-bit colour codes (six hex digits) are all written in hex for this reason.

When is lossy compression acceptable and when is it not? Lossy compression is appropriate for photographs, audio and video, where discarding imperceptible detail buys large size savings and a perfect reconstruction is unnecessary. It is never acceptable for text, program source code, executables or any data where every bit must survive unchanged — for those you must use a lossless method such as run-length or dictionary coding.

What is the difference between sample rate and sample resolution? Sample rate is how often the amplitude is measured (samples per second, in hertz) and governs the highest frequency that can be captured; sample resolution is how precisely each measurement is stored (bits per sample) and governs how accurately each amplitude is recorded, with too few bits introducing quantisation error. Both raise fidelity at the cost of file size, and both appear as factors in the sound file-size formula.

How to Revise Data Representation

Data representation rewards drilling over reading more than any other H446 module, because almost every mark is earned by executing a technique correctly with clear working. Build a personal worksheet that cycles through the core conversions and calculations — denary/binary/hex conversion, binary addition with overflow, two's complement negation and subtraction, fixed-point and floating-point conversion with normalisation, and the file-size calculations for images and sound — and work a few of each every revision session until the methods are automatic and the working lays itself out without thought.

For the conceptual half of the module — character sets, compression, encryption and error detection — anchor each on its examinable distinction and trade-off: ASCII versus Unicode (range versus storage), lossless versus lossy and RLE versus dictionary coding (fidelity versus size, by data type), symmetric versus asymmetric encryption (speed versus key distribution), and detect-only versus locate/correct error schemes. A one-line discriminator plus a one-line "use it when" for each is enough to handle the justification questions reliably.

Start at the Data Representation course and work through all ten lessons in order, from number systems to error detection. Once the conversions and calculations are fluent, the binary arithmetic feeds straight into the hardware that performs it in Boolean Algebra & Logic, and the encryption and error-checking content connects forward to Networks on the OCR A-Level Computer Science path.
