AQA A-Level Computer Science: Data Representation — Complete Revision Guide (7517)
Data representation is the topic that explains how every kind of information — numbers, text, images, sound — is reduced to patterns of bits that a computer can store and process. In the AQA A-Level Computer Science (7517) specification this content sits in spec area 4.5 (Fundamentals of data representation), and this guide also covers the closely linked spec area 4.8 (the moral, ethical, legal and cultural consequences of computing), because the LearningBro course pairs the two: once you understand how data is represented and moved, you are equipped to reason about the legal and ethical framework that governs its use. It is a topic that rewards methodical, accurate working — much of it is calculation under pressure — and clear, structured argument in the legislation section.
The material is examined across both written papers. Paper 1 (on-screen, 40% of the A-Level) frequently tests the calculation-heavy parts in the context of code: converting between bases, performing two's complement arithmetic, applying bitwise masks and shifts, or reasoning about how a value is stored. Paper 2 (written, also 40%) covers the broader explanatory content — floating-point representation and its trade-offs, character encoding, error detection and correction, image and sound representation, compression and encryption, and the extended-response questions on the consequences of computing and the relevant legislation. The remaining 20% is the Non-Exam Assessment (NEA) programming project; data representation is not assessed there directly, but choices about how to store and encode data, and an awareness of legal obligations around personal data, are exactly the kind of reasoning that strengthens an NEA analysis and evaluation.
This guide moves through every lesson in the LearningBro Data Representation course in a build-up order: from number systems and the units of storage, through the arithmetic and bit-level operations, to how the major media types are encoded, and finally to compression, encryption and the legal and ethical consequences of all this computing. Use it to sequence your revision and to spot the precise points where accuracy of method and exactness of legislation names and years earn the marks.
Guide Overview
- Number systems: bases and units
- Unsigned and signed binary arithmetic
- Fixed and floating-point representation
- Bitwise operations: shifts and masks
- Character encoding: ASCII and Unicode
- Error detection and correction
- Representing images
- Representing sound
- Data compression and encryption
- Consequences of computing and legislation
Number Systems, Bases and Units
Everything in this topic rests on fluency with number bases, the focus of the number systems lesson. You must convert confidently between denary (base 10), binary (base 2) and hexadecimal (base 16), in both directions. Hexadecimal matters because it is a compact, human-readable shorthand for binary — each hex digit maps to exactly four bits (a nibble), so a byte is always two hex digits, which is why memory dumps, colour codes and machine code are written in hex.
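The conversions described above can be sketched in a few lines of Python. This is only an illustration of the repeated-division and nibble-grouping methods; the function names `denary_to_binary` and `binary_to_hex` are invented for this example and are not part of any specification or library.

```python
def denary_to_binary(value: int, width: int = 8) -> str:
    """Convert a non-negative denary integer to a binary string of `width` bits."""
    if value < 0 or value >= 2 ** width:
        raise ValueError("value does not fit in the requested width")
    bits = ""
    for _ in range(width):
        bits = str(value % 2) + bits  # each remainder is the next bit, least significant first
        value //= 2
    return bits


def binary_to_hex(bits: str) -> str:
    """Convert a binary string to hexadecimal, one hex digit per nibble."""
    digits = "0123456789ABCDEF"
    padded = bits.zfill((len(bits) + 3) // 4 * 4)  # pad on the left to a whole number of nibbles
    return "".join(digits[int(padded[i:i + 4], 2)] for i in range(0, len(padded), 4))


print(denary_to_binary(45))       # 00101101
print(binary_to_hex("00101101"))  # 2D
print(int("2D", 16))              # 45
```

Note how 45 comes back as `00101101`, padded to eight bits, and how the two nibbles `0010` and `1101` map directly to the hex digits 2 and D.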
You must also be exact about the units of information. AQA expects you to know both conventions and to use the right one. A bit is a single binary digit and a byte is 8 bits. The decimal (SI) prefixes are powers of 1000 — a kilobyte is 1000 bytes, a megabyte is 1000 kilobytes, and so on through gigabyte, terabyte and petabyte. The binary prefixes are powers of 1024 — a kibibyte (KiB) is 1024 bytes, a mebibyte (MiB) is 1024 KiB, then gibibyte, tebibyte and pebibyte.
| Prefix (decimal, ×1000) | Prefix (binary, ×1024) |
|---|---|
| kilobyte (kB) = 10³ bytes | kibibyte (KiB) = 2¹⁰ bytes |
| megabyte (MB) = 10⁶ bytes | mebibyte (MiB) = 2²⁰ bytes |
| gigabyte (GB) = 10⁹ bytes | gibibyte (GiB) = 2³⁰ bytes |
| terabyte (TB) = 10¹² bytes | tebibyte (TiB) = 2⁴⁰ bytes |
The classic pitfall is mixing the conventions — answering a "how many bytes in a kibibyte?" question with 1000, or treating a kilobyte as 1024 bytes. Read which prefix the question uses and apply the matching base. A second pitfall is dropping leading zeros when converting to a fixed bit-width; if a question asks for an 8-bit answer, pad it to 8 bits.
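As a quick check on the two conventions, the sketch below converts an amount in a named unit to bytes. The dictionaries and the `to_bytes` helper are hypothetical names chosen for this example; the factors themselves are the ones in the table above.

```python
DECIMAL_PREFIXES = {"kB": 10**3, "MB": 10**6, "GB": 10**9, "TB": 10**12}
BINARY_PREFIXES = {"KiB": 2**10, "MiB": 2**20, "GiB": 2**30, "TiB": 2**40}


def to_bytes(amount: int, unit: str) -> int:
    """Return the number of bytes in `amount` of the given unit."""
    factors = {**DECIMAL_PREFIXES, **BINARY_PREFIXES}
    return amount * factors[unit]


print(to_bytes(5, "MB"))   # 5000000
print(to_bytes(5, "MiB"))  # 5242880
```

The gap between 5 MB and 5 MiB is already nearly a quarter of a million bytes, which is why using the wrong base costs the answer mark.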
Unsigned and Signed Binary Arithmetic
The unsigned and signed binary arithmetic lesson builds from positive-only values to representing negatives. You must add binary numbers, tracking carries, and recognise overflow — when the result of an addition needs more bits than are available, so it cannot be stored correctly in the given width. AQA expects you to spot overflow and explain its cause.
For negatives, the specification requires two's complement, which is the dominant method because it lets the same addition hardware handle subtraction and gives a single representation of zero. To form the two's complement of a number, invert every bit (one's complement) and add 1. The leftmost bit then carries a negative place value, so in an 8-bit two's complement number the most significant bit represents −128. Subtraction is performed by adding the two's complement of the number being subtracted.
| 8-bit two's complement | Denary value |
|---|---|
| 0111 1111 | +127 (largest positive) |
| 0000 0001 | +1 |
| 0000 0000 | 0 |
| 1111 1111 | −1 |
| 1000 0000 | −128 (most negative) |
The most common pitfall is forgetting the "+1" step and so confusing one's complement with two's complement. Another is misreading the range: an 8-bit two's complement number spans −128 to +127, not −127 to +127. There is one more negative value than positive because zero uses one of the 128 patterns whose leading bit is 0, leaving only 127 for the positive values. Show your inversion and your add-1 explicitly so method marks are secure even if a bit slips.
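The invert-and-add-1 method, the negative weight of the leading bit, and overflow detection can be sketched as follows. The function names are illustrative only, and the overflow test used here (two operands with the same sign producing a result with the opposite sign) is one common way to detect signed overflow, assumed for this example rather than taken from the specification.

```python
def twos_complement(bits: str) -> str:
    """Negate a two's complement pattern: invert every bit, then add 1."""
    width = len(bits)
    inverted = "".join("1" if b == "0" else "0" for b in bits)  # one's complement
    result = (int(inverted, 2) + 1) % (2 ** width)              # add 1, keep the same width
    return format(result, f"0{width}b")


def signed_value(bits: str) -> int:
    """Read a pattern as two's complement: the leading bit has a negative place value."""
    unsigned = int(bits, 2)
    return unsigned - 2 ** len(bits) if bits[0] == "1" else unsigned


def add_with_overflow(a: str, b: str):
    """Add two equal-width patterns; report whether signed overflow occurred."""
    width = len(a)
    result = format((int(a, 2) + int(b, 2)) % (2 ** width), f"0{width}b")
    overflow = a[0] == b[0] and result[0] != a[0]  # same-sign inputs, different-sign result
    return result, overflow


minus_five = twos_complement("00000101")
print(minus_five, signed_value(minus_five))       # 11111011 -5
print(add_with_overflow("00001100", minus_five))  # ('00000111', False), i.e. 12 - 5 = 7
print(add_with_overflow("01111111", "00000001"))  # ('10000000', True), 127 + 1 overflows
```

The second call performs 12 − 5 as 12 + (−5), which is the "subtract by adding the two's complement" rule in action.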
Fixed and Floating-Point Representation
To represent fractions and very large or very small numbers, the fixed and floating-point representation lesson introduces two schemes. In fixed-point representation, the binary point sits at a fixed position, with bits before it carrying whole-number place values and bits after it carrying fractional place values (½, ¼, ⅛ and so on). It is simple and fast but offers only a narrow range for a given number of bits.
Floating-point representation, the heavily examined part, splits the available bits into a mantissa (the significant digits, a signed two's complement fraction) and an exponent (a signed two's complement integer giving the power of two by which to scale the mantissa). This trades some precision for a vastly greater range. You must convert a denary value to floating-point, convert a floating-point pattern back to denary, and normalise a floating-point number. Normalisation maximises precision by adjusting the mantissa and exponent so the mantissa's most significant bits are as informative as possible — for a positive number the mantissa starts 0.1..., and for a negative number it starts 1.0....
The central concept the examiners test is the trade-off between range and precision for a fixed total number of bits: allocate more bits to the exponent and you extend the range but reduce precision; allocate more to the mantissa and you gain precision but shrink the range. You should also be able to discuss rounding errors and the absolute and relative errors that arise because many denary fractions cannot be represented exactly in binary floating-point. The defining pitfall is a sign error in the mantissa: both the mantissa and the exponent are two's complement values, so the leading bit of each carries a negative weight and must be treated that way when the value or the exponent is negative. Normalisation questions also trip students who move the binary point but forget to adjust the exponent by the matching amount: each place the mantissa bits shift left must be balanced by subtracting one from the exponent, and each place they shift right by adding one.
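Decoding a floating-point pattern can be sketched as below. The sketch assumes the layout described above (a two's complement mantissa with the binary point after its first bit, and a two's complement integer exponent); the bit widths in the examples and the function names are chosen purely for illustration.

```python
def twos_to_int(bits: str) -> int:
    """Read a bit string as a two's complement integer."""
    unsigned = int(bits, 2)
    return unsigned - 2 ** len(bits) if bits[0] == "1" else unsigned


def decode_float(mantissa: str, exponent: str) -> float:
    """Value = mantissa (a fraction, point after the first bit) x 2^exponent."""
    m = twos_to_int(mantissa) / 2 ** (len(mantissa) - 1)  # place the binary point after bit 0
    e = twos_to_int(exponent)
    return m * 2 ** e


def is_normalised(mantissa: str) -> bool:
    """Normalised mantissas start 01 (positive) or 10 (negative)."""
    return mantissa[0] != mantissa[1]


print(decode_float("01010000", "0011"))  # 5.0      (0.101 x 2^3)
print(decode_float("10110000", "1111"))  # -0.3125  (-0.625 x 2^-1)
print(is_normalised("00101000"))         # False: starts 0.0, so precision is wasted
```

The simple first-two-bits test does not handle an all-zero mantissa, which has no normalised form; that case is left out of this sketch.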
Bitwise Operations: Shifts and Masks
The bitwise operations lesson covers manipulating individual bits, a skill that connects directly to the logic gates studied in computer architecture. You must apply the bitwise operators AND, OR and XOR across two bit patterns and NOT to a single pattern, and understand logical shifts: a logical shift left moves all bits left, filling vacated positions on the right with zeros (equivalent to multiplying an unsigned value by 2 per place shifted, provided no 1 is shifted out of the available width); a logical shift right moves bits right, filling on the left with zeros (equivalent to integer division by 2 per place).
The most exam-relevant application is masking — using a bitwise operation with a carefully chosen mask to isolate, set, clear or test specific bits. AND with a mask isolates or clears bits (a 0 in the mask forces that bit to 0; a 1 lets it through); OR with a mask sets specific bits to 1; XOR with a mask flips (toggles) specific bits. Being able to choose the right operator and mask for a stated goal is precisely what questions reward.
| Operation | Effect |
|---|---|
| AND with mask | Clears bits where mask is 0; keeps bits where mask is 1 (isolate/test) |
| OR with mask | Sets bits to 1 where mask is 1 |
| XOR with mask | Toggles bits where mask is 1 |
| Shift left by n | Multiply unsigned value by 2ⁿ |
| Shift right by n | Integer-divide unsigned value by 2ⁿ |
A common pitfall is choosing the wrong operator for the intended effect, for example using OR when the goal is to clear bits (which requires AND with zeros in the mask). Another is overlooking that a logical right shift fills the sign bit with 0, so it does not preserve the sign of a two's complement value; the "divide by two" shortcut applies cleanly only to unsigned values. State the mask in binary in your working so the bit-by-bit effect is visible.
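The masking and shifting rules in the table can be tried out directly with Python's bitwise operators. The starting value `10110110` is an arbitrary example. Because Python integers are not limited to 8 bits, the left shift is masked with `0xFF` here to imitate an 8-bit register; that masking is an assumption of this sketch.

```python
value = 0b10110110

low_nibble = value & 0b00001111      # AND isolates the low four bits
top_cleared = value & 0b01111111     # AND with a 0 in the mask clears that bit
bit0_set = value | 0b00000001        # OR sets bits where the mask is 1
high_toggled = value ^ 0b11110000    # XOR toggles bits where the mask is 1
bit2_is_set = (value & 0b00000100) != 0  # AND with a single-bit mask tests that bit

shifted_left = (value << 1) & 0xFF   # logical shift left, kept to 8 bits
shifted_right = value >> 1           # shift right: integer division by 2 for unsigned values

for name, result in [("low nibble", low_nibble), ("top cleared", top_cleared),
                     ("bit 0 set", bit0_set), ("high toggled", high_toggled),
                     ("shift left", shifted_left), ("shift right", shifted_right)]:
    print(f"{name:12} {result:08b}")
```

The shift left drops the leading 1 of the original value, so the result (108) is not double the original (182): the multiply-by-two shortcut only holds while no 1 is shifted out of the width.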
Character Encoding: ASCII and Unicode
The character encoding lesson explains how text becomes binary. ASCII uses 7 bits to encode 128 characters: the English alphabet in upper and lower case, digits, punctuation and control codes. You should know the practical facts examiners use. The codes for the digits and for the letters run in contiguous order ('A' is 65, 'a' is 97, '0' is 48), so every lower-case letter is exactly 32 above its upper-case partner, which is why converting case is a simple arithmetic operation. The character '9' (code 57) is not the same as the integer 9.
ASCII's limitation is that 128 characters cannot represent the world's writing systems, which is why Unicode was developed. Unicode provides a vastly larger code space, assigning a unique code point to characters from virtually every script and to symbols and emoji, with encodings such as UTF-8 storing them efficiently. The exam-relevant contrast is straightforward: Unicode supports far more characters and global scripts at the cost of more bits per character than 7-bit ASCII, while remaining backwards-compatible with ASCII for the original character set. The standard pitfall is asserting that Unicode "uses 16 bits" as a blanket rule; describe it instead as a much larger character set with variable-width encodings, and avoid over-precise claims about fixed widths.
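These character-code facts are easy to confirm with Python's built-in `ord`, `chr` and `str.encode`. The `to_lower` helper is a hypothetical name written for this example; in real code the built-in `str.lower` would normally be used.

```python
def to_lower(ch: str) -> str:
    """Convert an upper-case ASCII letter to lower case by adding the fixed offset."""
    if "A" <= ch <= "Z":
        return chr(ord(ch) + (ord("a") - ord("A")))  # the offset is 32
    return ch


print(ord("A"), ord("a"), ord("a") - ord("A"))  # 65 97 32
print(to_lower("Q"))                            # q
print(ord("9"), ord("9") - ord("0"))            # 57 9  (the character '9' is not the integer 9)

# UTF-8 is variable-width: ASCII characters stay at one byte, others take more.
for ch in ["A", "\u00e9", "\u20ac", "\U0001F600"]:
    print(repr(ch), len(ch.encode("utf-8")), "byte(s)")
```

The last loop shows 1, 2, 3 and 4 bytes respectively, which illustrates why a blanket statement about Unicode using a fixed number of bits is unsafe.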
Error Detection and Correction
Because data is corrupted in storage and transmission, the error detection and correction lesson covers the methods that guard against it. A parity bit is an extra bit added to make the total number of 1s either even (even parity) or odd (odd parity); the receiver checks the parity and detects a single-bit error if it no longer matches. A checksum is a value calculated from a block of data and sent with it; the receiver recalculates and compares to detect corruption. Majority voting sends each bit multiple times and takes the most common value, allowing not just detection but correction of a minority error.
You should also understand a check digit (an extra digit derived from the others, used in barcodes and identification numbers to catch entry errors). The key exam distinction is between detection and correction: a single parity bit can detect an odd number of bit errors but cannot locate or correct them, whereas majority voting can correct an error by overriding the minority. The pitfall is overstating what parity achieves — it cannot detect an even number of simultaneous errors (two flipped bits leave parity unchanged) and it cannot correct anything. Be precise about the limits of each method.
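The limits of parity and the correcting power of majority voting can be demonstrated with a short sketch. The function names, the choice to place the parity bit at the front, and the use of three copies per bit are all assumptions made for this illustration.

```python
def add_even_parity(data_bits: str) -> str:
    """Prefix a parity bit so the total number of 1s is even."""
    parity = str(data_bits.count("1") % 2)
    return parity + data_bits


def check_even_parity(bits: str) -> bool:
    """Return True if the received pattern still has an even number of 1s."""
    return bits.count("1") % 2 == 0


def majority_decode(received: str, copies: int = 3) -> str:
    """Take the most common value in each group of repeated bits."""
    decoded = []
    for i in range(0, len(received), copies):
        group = received[i:i + copies]
        decoded.append("1" if group.count("1") > copies // 2 else "0")
    return "".join(decoded)


sent = add_even_parity("1000001")
print(sent, check_even_parity(sent))    # 01000001 True
print(check_even_parity("01000011"))    # False: one flipped bit is detected
print(check_even_parity("01000111"))    # True: two flipped bits go unnoticed
print(majority_decode("111010000"))     # 100: the corrupted middle group is corrected
```

Parity can only say that something went wrong, while majority voting recovers the intended bit as long as fewer than half of the copies in a group are corrupted.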
Representing Images
The representing images lesson covers bitmap images, which are grids of pixels (picture elements), each storing a colour value. Two quantities determine the data and quality. Resolution is the number of pixels (often given as width × height, or as pixel density); more pixels mean more detail and a larger file. Colour depth (bit depth) is the number of bits used per pixel; n bits per pixel allow 2ⁿ distinct colours, so 1 bit gives black and white, 8 bits give 256 colours, and 24 bits give true colour.
You must calculate file size from these: file size in bits equals resolution (total pixels) multiplied by colour depth, then divide by 8 for bytes (and remember any metadata the question specifies). The specification also expects awareness of metadata, which is data about the image such as its dimensions, colour depth and creation details, stored alongside the pixel data. The reliable pitfall is unit confusion in the final answer: the raw calculation gives bits, so divide by 8 for bytes and apply the kilobyte or kibibyte convention the question uses. Another is forgetting how size scales: doubling the colour depth doubles the file size, and doubling the total number of pixels doubles it too, but doubling both the width and the height quadruples the pixel count and therefore the size. This scaling is the basis of many "explain why the file is larger" questions.
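The file-size calculation can be written out as a small function. The 800 × 600, 24-bit image is an invented example, metadata is ignored, and the function name is hypothetical.

```python
def image_size_bits(width: int, height: int, colour_depth: int) -> int:
    """Uncompressed bitmap size in bits: total pixels x bits per pixel (no metadata)."""
    return width * height * colour_depth


bits = image_size_bits(800, 600, 24)
size_bytes = bits // 8

print(bits)               # 11520000 bits
print(size_bytes)         # 1440000 bytes
print(size_bytes / 1000)  # 1440.0 kB  (decimal prefix)
print(size_bytes / 1024)  # 1406.25 KiB (binary prefix)
print(2 ** 24)            # 16777216 distinct colours at 24 bits per pixel
```

The same byte count gives two different headline figures depending on the prefix, so the unit in the question decides which division to use.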
Representing Sound
The representing sound lesson explains how a continuous analogue sound wave becomes digital through sampling. At regular intervals the wave's amplitude is measured and stored as a binary number. Two parameters govern fidelity and size. The sampling rate (sampling frequency, in hertz) is how many samples are taken per second; a higher rate captures higher-frequency detail more faithfully. The sample resolution (bit depth) is the number of bits used to record each sample's amplitude; more bits allow finer amplitude distinctions and reduce quantisation error.
You must calculate the file size of an uncompressed sound clip: sampling rate × sample resolution × duration in seconds (× number of channels where relevant) gives the size in bits, divided by 8 for bytes. The conceptual point examiners test is that higher sampling rate and higher sample resolution both improve quality but both increase file size — the same fidelity-versus-size trade-off seen with images. The standard pitfall is forgetting to multiply by the duration, or by the channel count for stereo, and the perennial bits-to-bytes division at the end. A subtler one is confusing sampling rate (samples per second, affecting frequency range) with sample resolution (bits per sample, affecting amplitude accuracy) — keep the two distinct in explanatory answers.
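The sound calculation follows the same pattern. The figures below (44,100 Hz, 16-bit, 30 seconds, stereo) are an assumed example, and the function name is invented for illustration.

```python
def sound_size_bits(sampling_rate: int, sample_resolution: int,
                    seconds: int, channels: int = 1) -> int:
    """Uncompressed sound size in bits: rate x resolution x duration x channels."""
    return sampling_rate * sample_resolution * seconds * channels


bits = sound_size_bits(44100, 16, 30, channels=2)
print(bits)       # 42336000 bits
print(bits // 8)  # 5292000 bytes
```

Leaving out `channels=2` halves the answer and leaving out the duration gives the size of a single second, which are exactly the slips described above.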
Data Compression and Encryption
The data compression and encryption lesson covers two ways of transforming data for transmission and storage. Compression reduces file size. Lossless compression reduces size with no loss of information, so the original is perfectly reconstructable — two methods the specification expects are run-length encoding (RLE), which replaces runs of repeated values with a single value and a count, and dictionary-based compression, which replaces recurring patterns with shorter codes held in a dictionary. Lossy compression achieves much smaller files by permanently discarding information judged less perceptible (as used for typical web images and audio), accepting a quality reduction that cannot be reversed. The exam point is matching the method to the use case: lossless where exact reconstruction matters (text, program files), lossy where smaller size justifies acceptable quality loss (streaming media).
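Run-length encoding is simple enough to sketch in full. The representation used here (a list of value and count pairs) and the function names are choices made for this example; real file formats store the runs in their own ways.

```python
def rle_encode(data: str) -> list:
    """Replace each run of repeated characters with a (character, count) pair."""
    runs = []
    for ch in data:
        if runs and runs[-1][0] == ch:
            runs[-1] = (ch, runs[-1][1] + 1)  # extend the current run
        else:
            runs.append((ch, 1))              # start a new run
    return runs


def rle_decode(runs: list) -> str:
    """Rebuild the original data exactly from the runs."""
    return "".join(ch * count for ch, count in runs)


encoded = rle_encode("WWWWBBBWW")
print(encoded)              # [('W', 4), ('B', 3), ('W', 2)]
print(rle_decode(encoded))  # WWWWBBBWW
print(rle_encode("ABCD"))   # four runs of length 1: no saving at all
```

Decoding returns the original string exactly, which is what makes the method lossless, and the last line shows that RLE only pays off when the data actually contains long runs.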
Encryption protects confidentiality rather than reducing size. Symmetric encryption uses the same key to encrypt and decrypt — fast, but with the problem of distributing the shared key securely. Asymmetric encryption uses a public key to encrypt and a private key to decrypt, solving the key-distribution problem at greater computational cost; it also underpins digital signatures. The specification also expects knowledge of the Vernam cipher (a one-time pad using a truly random key at least as long as the message, which is theoretically unbreakable when used correctly) contrasted with ciphers whose security rests on computational difficulty. A common pitfall is conflating compression with encryption — compression makes data smaller, encryption makes it unreadable without the key; they serve entirely different purposes. Another is claiming lossless compression always beats lossy; for media, lossy achieves far higher compression ratios, which is exactly why it is used.
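The Vernam cipher combines each byte of the message with the matching byte of the key using XOR, and applying the same key again reverses it. The sketch below uses Python's standard `secrets` module to generate the key; note that this is a cryptographically secure pseudo-random source rather than the truly random key the theoretical guarantee requires, so treat it as an illustration of the mechanism only. The `vernam` function name is hypothetical.

```python
import secrets


def vernam(data: bytes, key: bytes) -> bytes:
    """XOR each data byte with the matching key byte (same function encrypts and decrypts)."""
    if len(key) < len(data):
        raise ValueError("the key must be at least as long as the message")
    return bytes(d ^ k for d, k in zip(data, key))


message = b"HELLO"
key = secrets.token_bytes(len(message))  # one-time key, never to be reused
ciphertext = vernam(message, key)
recovered = vernam(ciphertext, key)      # XOR with the same key undoes the encryption

print(ciphertext.hex())
print(recovered)  # b'HELLO'
```

Because symmetric XOR uses the same key in both directions, the key-distribution problem described above applies in full: both parties need the whole key in advance.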
Consequences of Computing and Legislation
The final lesson, consequences of computing and legislation, covers spec area 4.8: the moral, ethical, legal and cultural impact of computing, and the UK legal framework. The specification expects you to discuss issues such as privacy and surveillance, the digital divide, the environmental impact of computing, the effects of automation and artificial intelligence on employment, censorship, and the spread of misinformation — and to construct balanced, reasoned arguments that weigh benefits against harms rather than asserting one-sided conclusions.
You must also know the relevant UK legislation by its correct name and year:
| Legislation | Purpose |
|---|---|
| Data Protection Act 2018 | Governs the processing of personal data in the UK, operating alongside the UK GDPR, and sets out the data protection principles and individuals' rights over their data |
| Computer Misuse Act 1990 | Criminalises unauthorised access to computer systems, unauthorised access with intent to commit further offences, and unauthorised modification of data |
| Copyright, Designs and Patents Act 1988 | Protects intellectual property — original works, software and designs — from unauthorised copying and use |
| Regulation of Investigatory Powers Act 2000 | Regulates the interception of communications and the surveillance powers of public bodies |
The exam-relevant skill is applying the right Act to a scenario — for example, recognising that gaining unauthorised access to a system engages the Computer Misuse Act 1990, that mishandling customers' personal data engages the Data Protection Act 2018, that copying software without permission engages the Copyright, Designs and Patents Act 1988, and that lawful interception of communications is governed by the Regulation of Investigatory Powers Act 2000. The most damaging pitfall is misremembering names or years; precision matters, so commit the four to memory exactly as above. The second pitfall is one-sided ethical answers — extended-response questions in this area reward a structured argument that genuinely considers more than one perspective before reaching a justified conclusion.
This topic threads through the whole qualification. The binary arithmetic and bitwise logic here connect directly to the computer architecture course and its adder and logic-gate circuits, while the data-handling and legislation awareness inform sensible, lawful design choices in the NEA. For the complete topic with worked calculations and exam-style questions, study the AQA A-Level Computer Science: Data Representation course, and place it within your wider study using the A-Level Computer Science (AQA) learning path.
Next Steps
Secure the calculation-heavy lessons first — base conversion, two's complement arithmetic, floating-point normalisation, bitwise masking, and image and sound file-size calculations — because these are the most reliable marks and they reward shown working under timed conditions. Then learn the four pieces of legislation exactly by name and year and practise applying each to a scenario, alongside building balanced arguments on the ethical and societal consequences of computing. Finish by drilling mixed calculation questions until the bits-to-bytes conversions and the kilobyte/kibibyte distinction are automatic, then move on to the next topic in the A-Level Computer Science (AQA) learning path.