AQA A-Level Computer Science: Data Structures — Complete Revision Guide (7517)
AQA A-Level Computer Science: Data Structures — Complete Revision Guide
Data structures are the vocabulary that every other topic in AQA A-Level Computer Science is written in. When a later question asks you to traverse a graph, evaluate an expression, manage a print buffer, store a symbol table, or implement Dijkstra's algorithm, it is quietly assuming you can already reach for the right structure and justify the choice. The specification places data structures in section 4.2, and the examiners treat them as a unifying thread rather than a self-contained island: the same array, stack, queue and tree you meet here resurface in the algorithms topic, in the theory of computation, and in your programming project. Master them once and you collect the dividend across the whole qualification.
These structures are assessed across all three components. Paper 1, the on-screen exam (2 hours 30 minutes, 100 marks, 40% of the A-Level), tests your ability to implement and operate structures in code — building a linked list, pushing and popping a stack, inserting into a binary search tree. Paper 2, the written exam (also 2 hours 30 minutes, 100 marks, 40%), tests your ability to define, trace and compare them on paper, and to argue the static-versus-dynamic trade-off examiners love to probe. The Non-Exam Assessment (NEA), the 75-mark programming project (20%), is where the practical pay-off lands: choosing and implementing an appropriate structure for your problem, and justifying it, is exactly the kind of design decision that evidences a strong solution. A candidate who can both define a structure precisely and operate it confidently by hand consistently outscores one who has only memorised a list of bullet points.
This is one of the core courses on the LearningBro AQA A-Level Computer Science learning path. The course, AQA A-Level CS: Data Structures, works through ten linked structures in a deliberate order — from the contiguous, fixed-size array right up to the meta-skill of choosing the right structure for a task. By the end you should be able to define each structure, state its operations, trace those operations by hand, compare structures for a given problem, and explain the static-versus-dynamic distinction that runs through the whole topic.
Guide Overview
This course is a ten-lesson sequence that builds from linear, contiguous storage through linked and hierarchical structures to the judgement needed to choose between them. Work through the lessons in order, because each one assumes the operations introduced in the last.
- Static and Dynamic Data Structures
- Arrays and Records
- Linked Lists
- Stacks
- Queues
- Hash Tables
- Trees and Binary Trees
- Binary Search Trees
- Graphs
- Choosing the Right Data Structure
Static and Dynamic Data Structures
The Static and Dynamic Data Structures lesson opens the course with the single distinction that the whole topic turns on, so getting it crisp here pays off in every later lesson. A static data structure has a fixed size, decided at the point of declaration (compile time or allocation time), and reserves all its memory up front whether or not it is used. The array is the archetype. A dynamic data structure grows and shrinks at run time, allocating memory as elements are added and releasing it as they are removed; the linked list, the stack-on-a-linked-list and the tree are the archetypes.
The trade-off is the spine of the topic. A static structure is simple and gives fast, direct access, but it can waste memory if over-sized and overflow if under-sized — you must guess the maximum in advance. A dynamic structure uses exactly as much memory as it currently needs and never fixes a maximum, but it pays a memory overhead (a pointer per node) and loses direct indexed access, because you must follow pointers to reach an element. AQA expects you to connect this to how the memory comes from: dynamic structures typically draw memory from the heap at run time, often via a pointer or reference, whereas static allocation is fixed in advance. A common exam discriminator is precision of language: be able to say static = fixed size, set at declaration and dynamic = grows and shrinks at run time, uses heap memory without blurring the two, and be ready to recommend one over the other for a scenario with a stated reason.
Arrays and Records
The Arrays and Records lesson starts where memory itself starts: with a block of contiguous cells. An array is a collection of elements of the same data type, stored in adjacent memory locations and accessed by an integer index. That contiguity is the whole point — because every element occupies the same number of bytes and the elements sit next to each other, the machine can compute the address of element i directly, giving constant-time random access. You should be able to state and use the address formula: for a one-dimensional array with base address B, element size w and lower bound 0, the address of element i is B + i * w. AQA also expects you to handle two-dimensional arrays (and to picture them as a table or an array of arrays) and to know that arrays may be extended to more dimensions.
A record (or struct) groups together a fixed set of named fields that may be of different data types — a single Student record might hold a string name, an integer ID and a real average. Records and arrays compose naturally: an array of records is the canonical in-memory representation of a flat file or database table, and you will meet exactly this idea again in the databases topic. The cost of an array is rigidity: it is a static structure, so inserting in the middle means shuffling every later element up by one, and deleting means shuffling them down. The exam discriminator is precision — array = same type, indexed, contiguous, fixed size versus record = mixed types, named fields — without conflating the two. The common pitfall is forgetting the address calculation, which examiners set as a reliable, mechanical mark.
Linked Lists
The Linked Lists lesson introduces the alternative to contiguity: storage held together by pointers rather than by physical adjacency. A linked list is a sequence of nodes, each node holding a data field and a pointer to the next node; a special null pointer marks the end, and a separate head pointer locates the start. Because the nodes can live anywhere in memory, the list is dynamic — it grows and shrinks at run time as nodes are allocated and freed, with no need to declare a maximum size in advance.
The trade-off is the mirror image of the array's. Insertion and deletion at a known position are cheap, because you only rewire a couple of pointers rather than shuffling a whole block. But you lose random access: to reach the nth node you must start at the head and follow n pointers, a linear traversal. AQA expects you to trace pointer manipulation precisely — inserting a node means setting the new node's pointer to its successor before redirecting the predecessor's pointer, or you orphan the rest of the list. You should also recognise that a linked list is often implemented on top of an array, using a parallel array of "next" indices and a free list of unused slots — a neat way to get dynamic behaviour inside a static structure, and a favourite exam representation.
The table below crystallises the array-versus-linked-list comparison that examiners return to again and again.
| Property | Array | Linked list |
|---|---|---|
| Storage | Contiguous | Scattered, joined by pointers |
| Sizing | Static (fixed) | Dynamic (grows at run time) |
Access element i | Constant time (direct) | Linear (follow pointers) |
| Insert/delete in middle | Costly (shuffle elements) | Cheap (rewire pointers) |
| Memory overhead | None per element | Pointer field per node |
The common pitfall is mishandling the pointer order on insertion or deletion; trace it on a diagram every time, and state explicitly when a node is being orphaned or a list traversal has reached null.
Stacks
The Stacks lesson introduces the first of two structures that are not new ways of storing data so much as new disciplines on how data may enter and leave. A stack is a Last In, First Out (LIFO) structure: items are pushed onto the top and popped from the top, so the most recently added item is always the first removed. The core operations are push, pop, peek (inspect the top without removing it), isEmpty and isFull. Implemented over an array, a stack needs a single stack pointer holding the index of the top; push increments it and writes, pop reads and decrements.
You must be able to detect stack overflow (pushing when full) and stack underflow (popping when empty), and to trace a sequence of pushes and pops, writing the stack contents and pointer at each step. Just as important is being able to explain the canonical applications: the call stack that stores return addresses and local variables for subroutines (the mechanism behind the recursion you study in the programming course), the back button in a browser, undo functionality, and the evaluation of expressions in Reverse Polish Notation. A common exam discriminator is naming a concrete application and explaining why LIFO is the right discipline for it — for recursion, because the most recently called subroutine must return first. The pitfall is confusing the direction of the pointer adjustment on push versus pop, which a careful trace eliminates.
Queues
The Queues lesson covers the complementary discipline: a First In, First Out (FIFO) structure that models any fair waiting line. Items are enqueued at the rear and dequeued from the front, so the item that has waited longest is served first. The subtlety AQA cares about is the implementation. A linear queue held in an array uses two pointers, front and rear; but as items are dequeued the front pointer marches up the array, eventually stranding unused space at the start. The fix is the circular queue, where the pointers wrap around to the beginning using modulo arithmetic (rear = (rear + 1) MOD size), reusing the freed slots and giving a genuinely fixed-memory FIFO buffer.
You should also know the priority queue, in which each item carries a priority and the highest-priority item is dequeued first regardless of arrival order — the structure behind scheduling and, later, behind the frontier in Dijkstra's algorithm. AQA expects you to trace enqueue and dequeue operations on a linear and a circular queue, showing the front and rear pointers wrapping around, and to explain why the circular form is preferred (it reclaims space without shuffling). Stacks and queues together are the clearest illustration in the whole topic that the same array can host different abstract behaviours — the LIFO discipline of a stack and the FIFO discipline of a queue from identical underlying storage. The common pitfall is forgetting the modulo wrap on the circular queue, or losing track of whether an empty and a full queue are distinguishable, which careful pointer bookkeeping resolves.
Hash Tables
The Hash Tables lesson tackles the problem of looking something up by key rather than by index — finding a record by username, say, rather than by position. A hash table stores key–value pairs in an array of "buckets" and uses a hash function to convert a key into an array index directly, so that lookup, insertion and deletion run in near-constant time on average. A simple hash function for an integer key is index = key MOD tableSize, and you should be able to apply a stated hash function to a list of keys and place each in the table.
The crux of the topic is collisions — two distinct keys hashing to the same index — and how to resolve them. AQA expects you to know at least two strategies:
- Open addressing with linear probing: on a collision, step forward to the next free slot (wrapping around with modulo). Simple, but prone to clustering, where occupied slots bunch together and lengthen later searches.
- Chaining (closed addressing): each bucket holds a linked list, and colliding keys are appended to that list. This degrades gracefully and never "fills up" in the same way.
You should be able to define and reason about the load factor — the ratio of stored items to table size — and to explain that as the load factor rises, collisions become more frequent and performance degrades towards linear, which is why a good hash table is resized (rehashed) once it grows too full. A strong answer can characterise a good hash function: it should distribute keys uniformly across the table, be fast to compute, and use the whole key. The common pitfall is treating the average-case near-constant lookup as if it were guaranteed; under a high load factor or a poor hash function it can collapse to linear, and naming that worst case is itself a higher-level mark. Hash tables are the natural implementation of the dictionary / associative array, the structure you reach for whenever fast key-based access matters.
Trees and Binary Trees
The Trees and Binary Trees lesson moves from linear to hierarchical structure. A tree is a connected, acyclic graph with a designated root; every other node has exactly one parent, and nodes with no children are leaves. A binary tree restricts each node to at most two children, conventionally the left and right child. You must be fluent in the vocabulary the exam uses: node, edge, root, parent, child, sibling, leaf, subtree, depth (distance from the root) and height (longest path to a leaf).
The skill that earns marks is traversal. You must be able to traverse a binary tree in three orders, stating both the recursive rule and the resulting sequence for a given diagram:
- Pre-order (root, left, right) — used to copy a tree or produce the prefix form of an expression tree.
- In-order (left, root, right) — for a binary search tree, visits the nodes in ascending sorted order, the single most-tested fact about tree traversal.
- Post-order (left, right, root) — used to delete a tree safely (children before parent) and to produce Reverse Polish Notation from an expression tree.
A reliable mnemonic is to note where the root sits in the name: pre-, in- and post- describe whether the root is visited before, between or after its subtrees. AQA frequently sets an expression tree and asks for its in-order, pre-order or post-order traversal, linking trees straight back to the stack-based RPN evaluation. Because tree traversal is naturally recursive, this lesson also reinforces the recursion you met in the programming course — each recursive call descends into a subtree, and the call stack unwinds back up. The common pitfall is muddling the three orders; rehearse all three on the same diagram until the pattern is automatic.
Binary Search Trees
The Binary Search Trees lesson specialises the binary tree into a structure built for fast searching. A binary search tree (BST) maintains an ordering invariant: for every node, all values in its left subtree are smaller and all values in its right subtree are larger. That invariant turns searching into a halving process — at each node you compare the target with the node's value and descend left or right — so a balanced BST supports search, insertion and deletion in time proportional to its height, which is logarithmic in the number of nodes.
You must be able to build a BST by inserting a sequence of values in order: each new value starts at the root and follows the ordering rule down to a null pointer, where it becomes a new leaf. You should also be able to search for a value by the same descent, and to read off the in-order traversal to confirm the sorted ordering. AQA expects you to understand deletion, including the awkward case of removing a node with two children, which is resolved by replacing it with its in-order successor (the smallest value in its right subtree). The most important higher-level point is balance: if values are inserted in already-sorted order, the BST degenerates into a structure that resembles a linked list, and search degrades from logarithmic to linear in the number of nodes. Being able to draw that degenerate worst case, and explain that BST performance depends on the tree being reasonably balanced, is exactly the kind of evaluative answer that lifts a response into the top band. The common pitfall is asserting "BST search is fast" without the balance caveat.
Graphs
The Graphs lesson generalises the tree by dropping its restrictions: a graph is simply a set of vertices (nodes) joined by edges, with no requirement to be acyclic and no single root. Graphs model anything relational — road and rail networks, social connections, web links, computer networks and state machines. You must distinguish a directed graph (edges have a direction, like one-way streets) from an undirected graph, and a weighted graph (edges carry a cost, distance or capacity) from an unweighted one.
The examinable heart of this lesson is the two ways to represent a graph in memory, and the trade-off between them.
| Representation | What it stores | Strong when | Weak when |
|---|---|---|---|
| Adjacency matrix | An n×n grid; cell [i][j] holds 1 (or the weight) if an edge exists | The graph is dense; you need constant-time edge lookups | The graph is sparse — most cells are empty, wasting memory |
| Adjacency list | For each vertex, a list of its neighbours | The graph is sparse; you want to iterate over a vertex's neighbours | You need to test a specific edge quickly |
You should be able to read a small graph diagram and write out both its adjacency matrix and its adjacency list, and to argue which representation suits a scenario — a near-complete graph favours the matrix, a sparse social network favours the list. The graph traversals that operate on these representations (breadth-first and depth-first search) belong to the algorithms course, but recognising here that DFS is naturally driven by a stack and BFS by a queue ties the topic together: the stack and queue of the earlier lessons are the machinery of the graph algorithms to come. The common pitfall is choosing the matrix for a sparse graph (wasting memory) or failing to record edge direction in a directed graph's matrix.
Choosing the Right Data Structure
The Choosing the Right Data Structure lesson is the conceptual capstone, and it is where strong candidates pull ahead — and where the NEA design decision is really made. The skill is not recalling definitions but matching a structure to a problem and justifying the match against the operations the problem needs, the memory available, and whether the data set's size is known in advance. Every structure in the course is a bundle of trade-offs; the right choice is the one whose strengths align with the dominant operations and whose weaknesses do not matter for the task.
The reasoning runs through a few recurring questions. Do you need fast indexed access to a fixed-size collection? An array. Do you need frequent insertion and deletion with an unknown final size? A linked list or another dynamic structure. Do you need LIFO or FIFO ordering? A stack or a queue respectively. Do you need fast key-based lookup? A hash table (a dictionary). Do you need to keep data sorted with reasonably fast search and insertion? A binary search tree. Are you modelling arbitrary relationships? A graph. The decisive higher-level move is to weigh the static-versus-dynamic trade-off from the first lesson against the access pattern, and to state the justification in terms the marker can see — "a hash table, because the dominant operation is lookup by key and near-constant average access outweighs the unsorted iteration order".
| Need | Structure | Why |
|---|---|---|
| Fast indexed access, fixed size | Array | Constant-time direct access |
| Frequent insert/delete, unknown size | Linked list | Cheap pointer rewiring; dynamic sizing |
| LIFO ordering | Stack | Push/pop at one end |
| FIFO ordering | Queue | Enqueue at rear, dequeue at front |
| Fast lookup by key | Hash table | Near-constant average lookup |
| Sorted data with fast search | Binary search tree | Logarithmic search when balanced |
| Arbitrary relationships | Graph | Models vertices and edges directly |
The common pitfall is naming a structure without justifying it against the problem's operations and constraints; the marks are entirely in the reasoning. Treat every "choose and justify" question as an invitation to weigh trade-offs out loud.
How to Revise Data Structures
Revise data structures by operating them, not by reading about them. For each structure, draw a blank diagram and trace the core operations by hand: push and pop a stack until it overflows and underflows; enqueue and dequeue a circular queue until the pointers wrap around; insert a sequence of values into a binary search tree and then read off all three traversals; apply a stated hash function to a list of keys and resolve the collisions twice, once with linear probing and once with chaining; rewire a linked-list insertion on paper and check you never orphan the tail. Build a one-page comparison table that, for every structure, records whether it is static or dynamic, its access cost, and one realistic application — these "compare and justify" questions are reliable marks. Finally, rehearse the precise definitions, because examiners reward the candidate who says array = same type, contiguous, indexed, static and static = fixed size set at declaration; dynamic = grows at run time without hedging.
When you can take any small diagram and confidently trace its operations, choose the right structure for a described task and defend the choice, and articulate both the static-versus-dynamic distinction and each structure's headline trade-off, you have the foundation the rest of the qualification quietly relies on. Work through every lesson in AQA A-Level CS: Data Structures, then carry that fluency straight into the algorithms course, where these structures become the moving parts of the searches, sorts and traversals that the AQA A-Level Computer Science learning path builds towards.
Next Steps
Stop reading and start tracing. Open AQA A-Level CS: Data Structures and work through the ten lessons in order, operating each structure on paper as you go, then continue along the AQA A-Level Computer Science learning path into algorithms — where the stacks, queues, trees and graphs you have just mastered drive the searches, sorts and shortest-path methods of the next course.