OCR A-Level Computer Science: Data Structures — Complete Revision Guide (H446)
OCR A-Level Computer Science: Data Structures — Complete Revision Guide
Data structures are the vocabulary that every other topic in OCR A-Level Computer Science is written in. When a later question asks you to traverse a graph, parse an expression, manage a print buffer, or store a symbol table, it is silently assuming you can already reach for the right structure and justify the choice. The H446 specification places data structures in section 1.4.2, and the examiners treat them as a unifying thread rather than a self-contained island: the same array, stack, queue and tree that you meet here resurface in algorithms, in the theory of computation, and in your programming project. Master them once and you spend the rest of the course collecting the dividend.
This material is examined primarily in Component 01 (Computer systems), the written paper that covers the 1.x sections of the specification, but the practical fluency you build here is exercised again in Component 02 (Algorithms and programming) whenever a question asks you to implement or reason about an algorithm that depends on a particular structure. A stack is not just a 1.4.2 definition — it is the engine behind recursion, expression evaluation and depth-first search. A queue underpins breadth-first traversal and any fair scheduling problem. Because the structures are load-bearing across the whole qualification, the candidates who can both define them precisely and operate them confidently on paper consistently outscore those who have only memorised a list of bullet points.
This is Course 5 of 11 on the LearningBro OCR A-Level Computer Science learning path. The course, OCR A-Level CS: Data Structures, works through ten linked structures in a deliberate order — from the contiguous, fixed-size array right up to the abstract data type, the idea that lets you separate what a structure does from how it is built. By the end you should be able to define each structure, state its operations, trace those operations by hand, compare structures for a given task, and explain the static-versus-dynamic trade-off that examiners love to probe.
Guide Overview
This course is a ten-lesson sequence that builds from linear, contiguous storage through linked and hierarchical structures to the abstraction that ties them together. Work through them in order, because each lesson assumes the operations introduced in the last.
- Arrays and Records
- Linked Lists
- Stacks
- Queues
- Binary Trees
- Tree Operations
- Graphs
- Hash Tables
- Vectors
- Abstract Data Types
Arrays and Records: Contiguous, Static Storage
The Arrays and Records lesson starts where memory itself starts: with a block of contiguous cells. An array is a collection of elements of the same data type, stored in adjacent memory locations and accessed by an integer index. That contiguity is the whole point — because every element occupies the same number of bytes and the elements sit next to each other, the machine can compute the address of element i with a single multiplication and addition, giving constant-time random access. You should be able to state and use the address formula: for a one-dimensional array with base address B, element size w and lower bound 0, the address of element i is B + i * w. OCR also expects you to handle two-dimensional arrays and to understand row-major order, where the rows are laid out one after another in memory, so the address of element (r, c) in an array with C columns is B + (r * C + c) * w.
The cost of an array is rigidity. It is a static structure: its size is typically fixed at the point of declaration, so inserting an element in the middle means shuffling every later element up by one, and deleting means shuffling them down. A record (or struct), by contrast, groups together a fixed set of fields that may be of different data types — a single Student record might hold a string name, an integer ID and a real average. Records and arrays compose naturally: an array of records is the canonical in-memory representation of a flat database table, and you will meet exactly this idea again in the databases course. The exam discriminator here is precision of language — being able to say array = same type, indexed, contiguous, fixed size and record = mixed types, named fields without blurring the two.
Linked Lists: Dynamic, Pointer-Based Storage
The Linked Lists lesson introduces the alternative to contiguity: storage held together by pointers rather than by physical adjacency. A linked list is a sequence of nodes, each node holding a data field and a pointer to the next node; a special null pointer marks the end, and a separate head pointer locates the start. Because the nodes can live anywhere in memory, the list is dynamic — it grows and shrinks at run time as nodes are allocated and freed, with no need to declare a maximum size in advance.
The trade-off is the mirror image of the array's. Insertion and deletion at a known position are cheap, because you only rewire a couple of pointers rather than shuffling a whole block. But you lose random access: to reach the nth node you must start at the head and follow n pointers, a linear traversal. OCR expects you to trace pointer manipulation precisely — inserting a node means setting the new node's pointer to the successor before redirecting the predecessor's pointer, or you orphan the rest of the list. You should also know the variants: a doubly linked list stores a backward pointer as well, allowing two-way traversal at the cost of extra memory, and a circular linked list has its last node point back to the first. A common, often-implicit exam point is that a linked list is frequently implemented on top of an array, using a parallel array of "next" indices and a free-list of unused slots — a neat way to get dynamic behaviour inside a static structure.
The table below crystallises the array-versus-linked-list comparison that examiners return to again and again.
| Property | Array | Linked list |
|---|---|---|
| Storage | Contiguous | Scattered, joined by pointers |
| Sizing | Static (fixed) | Dynamic (grows at run time) |
Access element i | Constant time (direct) | Linear (follow pointers) |
| Insert/delete in middle | Costly (shuffle elements) | Cheap (rewire pointers) |
| Memory overhead | None per element | Pointer field per node |
Stacks and Queues: Disciplined Access
Stacks and queues are not new ways of storing data so much as new disciplines on how data may enter and leave. The Stacks lesson covers the Last In, First Out (LIFO) structure: items are pushed onto the top and popped from the top, so the most recently added item is always the first removed. The core operations are push, pop, peek (or top, which inspects the top item without removing it), isEmpty and isFull. Implemented over an array, a stack needs a single stack pointer holding the index of the top; push increments it and writes, pop reads and decrements. You must be able to check for stack overflow (push when full) and stack underflow (pop when empty), and to explain the canonical applications: the call stack that stores return addresses and local variables for subroutines, the back button in a browser, and the conversion and evaluation of expressions in Reverse Polish Notation.
The Queues lesson covers the First In, First Out (FIFO) structure: items are enqueued at the rear and dequeued from the front, modelling any fair waiting line. The subtlety OCR cares about is implementation. A linear queue held in an array uses two pointers, front and rear; but as items are dequeued the front pointer marches up the array, eventually stranding unused space at the start. The fix is the circular queue, where the pointers wrap around to the beginning using modulo arithmetic (rear = (rear + 1) MOD size), reusing the freed slots and giving a genuinely fixed-memory FIFO buffer. You should also know the priority queue, in which each item carries a priority and the highest-priority item is dequeued first regardless of arrival order — the structure behind scheduling and, later, behind Dijkstra's algorithm. Stacks and queues are the clearest illustration in the whole course that the same array can host different abstract behaviours, which is exactly the idea the final lesson formalises.
Binary Trees and Tree Operations
The Binary Trees lesson moves from linear to hierarchical structure. A tree is a connected, acyclic graph with a designated root; every other node has exactly one parent, and nodes with no children are leaves. A binary tree restricts each node to at most two children, conventionally called the left and right child. The structure that earns its keep in exams is the binary search tree (BST), which maintains an ordering invariant: for every node, all values in its left subtree are smaller and all values in its right subtree are larger. That invariant turns searching into a halving process — at each node you compare and descend left or right — so a balanced BST supports search, insertion and deletion in time proportional to its height, which is logarithmic in the number of nodes when the tree is balanced. You should be ready with the vocabulary the exam uses: node, edge, root, parent, child, leaf, subtree, depth (distance from the root) and height (longest path to a leaf).
The Tree Operations lesson is where binary trees become an exam skill rather than a definition. You must be able to traverse a tree in three orders, and to state both the recursive rule and the result for a given diagram:
- In-order (left, root, right) — visits a BST's nodes in ascending sorted order, which is the single most-tested fact about tree traversal.
- Pre-order (root, left, right) — used to copy a tree or to produce the prefix form of an expression tree.
- Post-order (left, right, root) — used to delete a tree safely (children before parent) and to produce Reverse Polish Notation from an expression tree.
A reliable way to remember which is which is to note where the root sits in the name: pre-, in- and post- describe whether the root is visited before, between, or after its subtrees. OCR frequently sets an expression tree and asks for its in-order, pre-order or post-order traversal, linking trees straight back to the stack-based RPN evaluation you met earlier. Be able to insert into a BST by following the ordering invariant down to a null pointer, and to describe — at least in outline — deletion, including the awkward case of removing a node with two children, which is resolved by replacing it with its in-order successor.
Graphs: Modelling Arbitrary Relationships
The Graphs lesson generalises the tree by dropping its restrictions: a graph is simply a set of vertices (nodes) joined by edges, with no requirement to be acyclic and no single root. Graphs model anything relational — road and rail networks, social connections, web links, computer networks and state machines. You must distinguish a directed graph (edges have a direction, like one-way streets) from an undirected graph, and a weighted graph (edges carry a cost, distance or capacity) from an unweighted one.
The examinable heart of this lesson is the two ways to represent a graph in memory, and the trade-off between them.
| Representation | What it stores | Strong when | Weak when |
|---|---|---|---|
| Adjacency matrix | An n×n grid; cell [i][j] holds 1 (or the weight) if an edge exists | The graph is dense; you need constant-time edge lookups | The graph is sparse — most cells are empty, wasting memory |
| Adjacency list | For each vertex, a list of its neighbours | The graph is sparse; you want to iterate over a vertex's neighbours | You need to test a specific edge quickly |
You should be able to read a small graph diagram and write out both its adjacency matrix and its adjacency list, and to argue which representation suits a given scenario — a near-complete graph favours the matrix, a sparse social network favours the list. The graph traversals that operate on these representations (depth-first and breadth-first search) belong to the algorithms course, but recognising here that DFS is naturally driven by a stack and BFS by a queue ties the entire course together: the structures of lessons three and four are the machinery of the algorithms in lesson seven of the next course.
Hash Tables: Near-Constant-Time Lookup
The Hash Tables lesson tackles the problem of looking something up by key rather than by index — finding a record by username, say, rather than by position. A hash table stores key–value pairs in an array of "buckets" and uses a hash function to convert a key into an array index directly, so that lookup, insertion and deletion run in near-constant time on average. A simple hash function for an integer key is index = key MOD tableSize, and you should be able to apply a stated hash function to a list of keys and place them in the table.
The crux of the topic is collisions — two distinct keys hashing to the same index — and how to resolve them. OCR expects you to know at least:
- Open addressing with linear probing: on a collision, step forward to the next free slot (wrapping around with modulo). Simple, but prone to clustering, where occupied slots bunch together and lengthen searches.
- Chaining (closed addressing): each bucket holds a linked list, and colliding keys are appended to that list. This degrades gracefully and never "fills up" in the same way.
You should be able to define and reason about the load factor — the ratio of stored items to table size — and to explain that as the load factor rises, collisions become more frequent and performance degrades toward linear, which is why a good hash table is resized (rehashed) once it grows too full. A strong answer can characterise a good hash function: it should distribute keys uniformly across the table, be fast to compute, and use the whole key. Hash tables are the natural implementation of the dictionary / associative array abstract data type, which connects this lesson directly to the final one.
Vectors: The Mathematical Structure
The Vectors lesson covers the more mathematical notion of a vector that H446 includes alongside the everyday data structures. A vector here is an ordered tuple of values — for example a point or direction in n-dimensional space — and OCR expects you to interpret a vector in several equivalent ways: as a list, as a function from an index set to values, and as a geometric arrow. The examinable operations are scalar–vector multiplication (multiplying every component by a constant, which scales the arrow), vector addition (adding component-wise, which composes two displacements), the dot (scalar) product (multiply corresponding components and sum, producing a single number related to the angle between the vectors), and convex combination (a weighted average of vectors whose weights are non-negative and sum to one, which always lands inside the shape they span).
The reason vectors sit in a computer science specification, rather than only in mathematics, is their role in graphics, machine learning and simulation, where positions, colours, forces and feature sets are all represented as vectors and manipulated with exactly these operations. Treat this lesson as practice in applying the definitions cleanly: compute a dot product, scale a vector, add two vectors and form a convex combination, and be ready to interpret what each operation means geometrically.
Abstract Data Types: Separating What from How
The Abstract Data Types lesson is the conceptual capstone, and it is where strong candidates pull ahead. An abstract data type (ADT) is defined by the operations it offers and their behaviour — its interface — and deliberately says nothing about how it is implemented. A stack ADT promises push, pop and peek with LIFO behaviour; whether that stack is built on an array or a linked list is an implementation detail invisible to the code that uses it. This separation of interface from implementation is the single most important idea in the course, because it explains why the same logical structure can have several concrete realisations, each with its own performance profile.
You must hold two independent distinctions clearly in mind. The first is ADT versus data structure: the ADT describes what operations exist; the data structure describes how the storage works. The second is static versus dynamic: a static structure (like an array) has a fixed size set at compile or declaration time, while a dynamic structure (like a linked list or a tree) grows and shrinks at run time. These two distinctions are orthogonal — the same ADT, say a stack, can be implemented either statically over an array or dynamically over a linked list. Being able to name standard ADTs (stack, queue, list, dictionary, tree, graph), describe each by its operations rather than its storage, and then offer a sensible underlying implementation with a stated trade-off is precisely the kind of layered, evaluative answer that lifts a response from competent to top-band.
How to Revise Data Structures
Revise data structures by operating them, not by reading about them. For each structure, draw a blank diagram and trace the core operations by hand: push and pop a stack until it overflows and underflows; enqueue and dequeue a circular queue until the pointers wrap around; insert a sequence of values into a binary search tree and then read off all three traversals; apply a stated hash function to a list of keys and resolve the collisions twice, once with linear probing and once with chaining. Build a one-page comparison table that, for every structure, records whether it is static or dynamic, its access cost, and one realistic application — these "compare and justify" questions are reliable marks. Finally, rehearse the precise definitions: examiners reward the candidate who says array = same type, contiguous, indexed, static and ADT = operations only, implementation hidden without hedging.
When you can take any small diagram and confidently trace its operations, choose the right structure for a described task and defend the choice, and articulate both the ADT-versus-data-structure and static-versus-dynamic distinctions, you have the foundation the rest of H446 quietly relies on. Work through every lesson in OCR A-Level CS: Data Structures, then carry that fluency straight into Course 6 on algorithms, where these structures become the moving parts of the searches, sorts and traversals that the OCR A-Level Computer Science learning path builds towards.