Primary storage is the memory the CPU can reach directly (registers, cache, RAM and ROM), and it exists in a carefully engineered hierarchy that trades speed against capacity and cost. This lesson explains how each type works, why the hierarchy is shaped the way it is, and how it underpins everything the fetch-decode-execute (FDE) cycle does.
This lesson develops OCR H446 section 1.1.2 (types of processor / memory) as it concerns primary (main) storage and the memory hierarchy. It covers RAM (contrasting DRAM and SRAM), ROM and its uses (firmware, BIOS/UEFI), the registers and the cache hierarchy (L1/L2/L3), the meaning of volatility, and the rationale for the memory hierarchy itself. It links tightly to the FDE cycle (1.1.1) — where registers and cache determine instruction throughput — and cross-links to virtual memory, which is treated fully in the software-systems memory management lesson rather than here.
Primary storage (main memory) is storage the CPU can address directly over the system bus, without going through an I/O controller. It spans the fast inner tiers (the CPU's own registers and cache) and the larger RAM and ROM chips on the motherboard. Access is fast, but capacity is limited compared with secondary storage, and every tier except ROM is volatile. The defining contrast with secondary storage (HDD/SSD) is directness: the CPU fetches an instruction from RAM by placing its address on the bus, whereas reaching a file on disk means asking a separate controller to retrieve a block.
Volatility is the key property to fix early:
| Term | Meaning | Examples |
|---|---|---|
| Volatile | Loses all contents when power is removed | Registers, cache, RAM (DRAM/SRAM) |
| Non-volatile | Retains contents without power | ROM, flash, all secondary storage |
RAM is volatile working memory: it holds the operating system, the currently running programs and their data while the machine is on, and is wiped when power is lost. "Random access" means any address takes the same time to reach — unlike a spinning disk, where physical position matters. RAM comes in two fundamentally different technologies.
| Feature (DRAM) | Detail |
|---|---|
| How it works | Each bit is one capacitor (plus one transistor). A charged capacitor = 1, discharged = 0. Charge leaks away, so every cell must be refreshed (read and rewritten) every few tens of milliseconds, typically about every 64 ms |
| Density | High — just 1 transistor + 1 capacitor per bit, so billions of cells fit on a chip |
| Speed | Slower — refresh cycles and the time to sense a tiny capacitor charge add latency |
| Cost | Cheap per bit |
| Use | Main memory in PCs, laptops, phones, servers |
| Feature (SRAM) | Detail |
|---|---|
| How it works | Each bit is a flip-flop of typically 6 transistors that holds its state as long as power is supplied — no refresh needed |
| Density | Low — many transistors per bit, so far fewer bits per unit area |
| Speed | Very fast — no refresh delay; state is read directly |
| Cost | Expensive per bit |
| Use | CPU cache (L1/L2/L3) and registers |
| Feature | DRAM | SRAM |
|---|---|---|
| Storage mechanism | Capacitor (charge) | Flip-flop (6 transistors) |
| Needs refreshing? | Yes | No |
| Speed | Slower | Faster |
| Density | Higher | Lower |
| Cost per bit | Cheaper | More expensive |
| Used for | Main memory | Cache, registers |
Exam Tip: "Why does cache use SRAM, not DRAM?" SRAM is faster because it needs no refresh and its state is read directly, so the CPU waits fewer clock cycles for data. SRAM's higher cost and lower density are acceptable because cache is small. Those same drawbacks are the reason main memory cannot all be made from SRAM.
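The cost argument can be made concrete with some rough arithmetic. The sketch below assumes a 16 GiB main memory (an example figure, not from the lesson) and uses the per-bit cell sizes from the tables above: six transistors for SRAM, one transistor (plus a capacitor) for DRAM.

```python
# Rough transistor count for main memory built from each technology.
# The 16 GiB capacity is an assumed example figure.
GIB = 2**30
capacity_bits = 16 * GIB * 8           # bytes -> bits

SRAM_TRANSISTORS_PER_BIT = 6           # six-transistor flip-flop cell
DRAM_TRANSISTORS_PER_BIT = 1           # plus one capacitor per bit

sram_transistors = capacity_bits * SRAM_TRANSISTORS_PER_BIT
dram_transistors = capacity_bits * DRAM_TRANSISTORS_PER_BIT

print(f"bits to store:    {capacity_bits:,}")
print(f"DRAM transistors: {dram_transistors:,}")
print(f"SRAM transistors: {sram_transistors:,}")
print(f"SRAM needs {sram_transistors // dram_transistors}x as many transistors")
```

For this capacity the SRAM version needs 824,633,720,832 transistors against 137,438,953,472 for DRAM. That is six times the silicon before any wiring overhead is counted, which is why SRAM is kept for the small cache.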
RAM is "random access" precisely because of how it is wired to the processor, which ties straight back to the FDE cycle. Main memory is an array of numbered locations, and the CPU reaches any of them through the system buses met in the architecture lesson: the address bus, the data bus and the control bus.
So a memory read is just the FDE pattern in miniature: the CPU places an address on the address bus (via the MAR), asserts read on the control bus, and the addressed RAM returns the contents on the data bus (into the MDR). Because any address is selected directly by its number — rather than by physically moving to it as on a disk — every location takes the same time to reach, which is the literal meaning of random access. This is also why "directly addressable by the CPU" is the dividing line between primary storage (on the buses) and secondary storage (reached through a controller).
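A memory read of this kind can be modelled in a few lines. This is a minimal sketch, not a model of real hardware: the class names `SimpleMemory` and `SimpleCPU` are hypothetical, and the buses are represented only by the values passed between the two objects.

```python
class SimpleMemory:
    """Toy RAM: a list of numbered locations selected directly by address."""

    def __init__(self, size):
        self.cells = [0] * size

    def read(self, address):
        # Direct indexing: no searching and no physical movement,
        # so every location is reached in the same way.
        return self.cells[address]

    def write(self, address, value):
        self.cells[address] = value


class SimpleCPU:
    """Toy CPU holding only the two registers involved in a memory read."""

    def __init__(self, memory):
        self.memory = memory
        self.mar = 0  # Memory Address Register
        self.mdr = 0  # Memory Data Register

    def load(self, address):
        self.mar = address                     # address placed on the address bus
        self.mdr = self.memory.read(self.mar)  # contents come back on the data bus
        return self.mdr


ram = SimpleMemory(16)
ram.write(5, 42)
cpu = SimpleCPU(ram)
value = cpu.load(5)
print(cpu.mar, cpu.mdr)
```

After `cpu.load(5)` the MAR holds the address 5 and the MDR holds the contents 42, mirroring the address-out, data-back pattern described above.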
It is worth understanding why the two RAM types behave so differently, because the mechanism is the source of every property in the comparison table. In DRAM, a 1 is a charged capacitor. A capacitor is just two conductors separated by an insulator, and no insulator is perfect, so the stored charge leaks away within tens of milliseconds. Left alone, every 1 would decay into a 0 and the data would be lost, so a refresh circuit must read each row and write it back every few tens of milliseconds (a typical refresh interval is 64 ms) to top the charge up. This refreshing makes DRAM slower, because cells are periodically busy being refreshed and sensing a tiny capacitor charge is delicate, and it consumes power even when idle. The one-transistor-one-capacitor cell is tiny, however, which gives DRAM its density and low cost.
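The leak-and-refresh behaviour can be illustrated with a toy simulation. The leak rate, the sensing threshold and the 32 ms refresh period below are assumed values chosen to make the effect visible, not real device parameters, and `DramCell` is a hypothetical name.

```python
class DramCell:
    """Toy DRAM cell; charge is a fraction between 0.0 and 1.0."""

    THRESHOLD = 0.5     # assumed: charge above this reads as 1
    LEAK_PER_MS = 0.01  # assumed leak rate, for illustration only

    def __init__(self, bit):
        self.charge = 1.0 if bit else 0.0

    def leak(self, ms):
        self.charge = max(0.0, self.charge - self.LEAK_PER_MS * ms)

    def read(self):
        return 1 if self.charge > self.THRESHOLD else 0

    def refresh(self):
        # Read the bit and write it back at full strength.
        self.charge = 1.0 if self.read() else 0.0


def run(total_ms, refresh_every_ms=None):
    """Store a 1, let it leak for total_ms, and return what is read back."""
    cell = DramCell(1)
    for t in range(1, total_ms + 1):
        cell.leak(1)
        if refresh_every_ms and t % refresh_every_ms == 0:
            cell.refresh()
    return cell.read()


print(run(200))                       # no refresh
print(run(200, refresh_every_ms=32))  # refreshed periodically
```

Without refresh the stored 1 decays and reads back as 0. With periodic refresh the charge is topped up before it crosses the threshold, so the bit survives.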
In SRAM, a bit is held by a flip-flop — typically six transistors cross-coupled so they latch each other into a stable 1 or 0 state. As long as power is supplied the latch actively holds its value, so there is nothing to refresh and reads are immediate. The price is that six transistors per bit make SRAM far larger and dearer per bit than DRAM — which is exactly why it is reserved for the small, speed-critical cache and registers, and DRAM is used for bulk main memory.
ROM is non-volatile — it keeps its contents with the power off — and holds permanent, essential code that rarely or never changes. Modern "ROM" is usually flash that can be updated occasionally (a firmware update), but it behaves as read-only in normal operation.
| Variant | Description |
|---|---|
| ROM | Contents fixed at manufacture; cannot be changed |
| PROM | Programmable once after manufacture with a special programmer |
| EPROM | Erasable by UV light, then reprogrammable |
| EEPROM | Electrically erasable and reprogrammable, byte by byte |
| Flash | A block-erasable form of EEPROM; the basis of firmware storage and SSDs |
The progression in that table is one of increasing rewritability: from completely fixed (ROM), to write-once (PROM), to UV-erasable (EPROM), to electrically byte-erasable (EEPROM), to fast block-erasable flash. This is why "ROM" today usually means flash that can be reprogrammed for a firmware update yet still behaves as read-only during normal use — the reason a phone or router can receive a security update without losing its non-volatile boot code.
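The boot sequence makes the need for ROM alongside RAM concrete. At power-on RAM is empty, so the CPU starts by executing the firmware held in ROM, and that firmware is what gets the operating system loaded from secondary storage into RAM.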
This is the textbook answer to "why does a computer need both ROM and RAM?": without non-volatile ROM there would be no startup code after a power cycle to get the volatile RAM populated in the first place — a chicken-and-egg problem the ROM firmware solves.
Exam Tip: The cleanest "RAM vs ROM" discriminator is volatility plus role: ROM is non-volatile and holds the unchanging startup/boot firmware; RAM is volatile and holds whatever is running right now. If asked "why both?", give the boot argument — RAM is empty at power-on, so the first instructions must come from ROM.
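The boot argument can be sketched as a short simulation. The contents of `ROM` and `DISK` are hypothetical placeholders, and the three steps are a simplified outline rather than a full account of what BIOS/UEFI firmware does.

```python
# Hypothetical contents, for illustration only.
ROM = ("POST: check hardware", "Bootloader: copy OS into RAM")  # non-volatile; tuple = read-only
DISK = ["OS kernel", "OS services"]                             # non-volatile secondary storage


def power_on():
    ram = []  # volatile: always empty immediately after power-on
    log = []

    # 1. RAM is empty, so the first instructions can only come from ROM.
    for instruction in ROM:
        log.append(f"run from ROM: {instruction}")

    # 2. The bootloader copies the operating system from disk into RAM.
    ram.extend(DISK)

    # 3. From here on the CPU fetches its instructions from RAM.
    for instruction in ram:
        log.append(f"run from RAM: {instruction}")

    return ram, log


ram, log = power_on()
for line in log:
    print(line)
```

Every call to `power_on()` starts with an empty `ram`, which is the point of the exercise: nothing in RAM survives a power cycle, so the ROM steps must always run first.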
At the very top of the hierarchy, inside the CPU, sit the registers: a handful of tiny, ultra-fast SRAM-based stores the ALU operates on directly, accessed in a single clock cycle. You met them in the FDE lesson — the PC, MAR, MDR, CIR and Accumulator/general-purpose registers. They are the smallest store (only tens to a few hundred words) but the fastest, which is exactly why they form the apex of the memory hierarchy below.
Registers are not a separate topic from this lesson — they are the place every memory access ends up. When the CPU reads from RAM, the address travels out through the MAR and the returned value arrives in the MDR; when an instruction is fetched it lands in the CIR; arithmetic results accumulate in the Accumulator or a general-purpose register. So the data path the whole lesson describes is really RAM → cache → registers → ALU and back: each tier exists to keep the registers — and therefore the ALU — supplied without stalling. This is why the registers belong at the top of the hierarchy: they are the fastest because the ALU operates on them directly every cycle, and everything below them (cache, then RAM, then disk) is a progressively larger, slower reservoir feeding them.
Cache is a small amount of very fast SRAM inside or beside the CPU that holds copies of recently and frequently used data and instructions, so the CPU rarely has to wait for slow DRAM. It sits between the registers and main memory.
| Level | Location | Typical size | Speed | Purpose |
|---|---|---|---|---|
| L1 | Inside each core | 32–128 KB per core | Fastest | Hottest data/instructions; usually split L1-I (instructions) + L1-D (data) |
| L2 | Inside/beside each core | 256 KB–1 MB per core | Fast | Backs up L1 for data that does not fit |
| L3 | Shared across all cores | 4–64 MB | Moderate (still ≫ RAM) | Shared pool; cuts trips to main memory |
The pattern is the hierarchy in miniature: smaller and faster closer to the core, larger and slower further out. L1 and L2 are typically per-core (private); L3 is shared so cores can exchange data without going to RAM.
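When the CPU needs a data item or instruction, it looks in the cache first and goes to RAM only if the item is not there. The outcomes have standard names: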
| Term | Definition |
|---|---|
| Cache hit | Requested item found in cache — fast |
| Cache miss | Item not in cache — must fetch from slower RAM |
| Hit rate | Fraction of accesses that hit; higher is better |
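These terms can be seen in action with a small lookup sketch. It assumes the levels are searched in order (L1, then L2, then L3, then RAM) and simplifies heavily: each cache is an unbounded dictionary with no eviction, and a missed item is copied into every level. The function name `lookup` and the access pattern are illustrative only.

```python
def lookup(address, levels, ram):
    """Search each cache level in order; fall back to RAM if all of them miss."""
    for name, cache in levels:
        if address in cache:
            return cache[address], name  # cache hit
    value = ram[address]                 # cache miss: fetch from slower RAM
    for _, cache in levels:
        cache[address] = value           # simplified: fill every level, no eviction
    return value, "RAM"


ram = {addr: addr * 10 for addr in range(100)}
levels = [("L1", {}), ("L2", {}), ("L3", {})]

accesses = [3, 4, 3, 3, 7, 4, 3]  # assumed access pattern with some reuse
hits = 0
for addr in accesses:
    value, source = lookup(addr, levels, ram)
    if source != "RAM":
        hits += 1

hit_rate = hits / len(accesses)
print(f"hits: {hits} of {len(accesses)}, hit rate: {hit_rate:.2f}")
```

The first access to each of the three distinct addresses misses and the four repeats hit, giving a hit rate of 4 out of 7, about 0.57.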
Cache pays off because real programs do not access memory randomly; they obey two locality principles:
| Principle | Explanation | Example |
|---|---|---|
| Temporal locality | A recently used item is likely to be used again soon | A loop counter, a frequently called function |
| Spatial locality | Items near a used item are likely to be used soon | Successive elements of an array; the next instruction in sequence |
Spatial locality is why a miss pulls a whole cache line (block) from RAM, not a single byte — the neighbours are probably needed next.
Consider a loop summing an array, the kind of code that runs constantly:
data = list(range(1000))  # example array so the loop runs as written
total = 0
for i in range(1000):
    total = total + data[i]  # touches data[i] and total each pass
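A rough way to see why this loop is cache-friendly is to count how many of its array accesses would miss. The sketch below assumes a cache line holds 8 array elements and that the cache never evicts anything. Both are simplifications, and `count_misses` is a hypothetical helper.

```python
LINE_SIZE = 8  # assumed number of array elements per cache line


def count_misses(indices, line_size=LINE_SIZE):
    """Count accesses whose cache line has not been fetched yet."""
    cached_lines = set()  # simplification: unlimited cache, no eviction
    misses = 0
    for i in indices:
        line = i // line_size  # which line this element lives in
        if line not in cached_lines:
            misses += 1        # the first touch of a line fetches the whole line
            cached_lines.add(line)
    return misses


accesses = 1000
misses = count_misses(range(accesses))
hits = accesses - misses
print(f"misses: {misses}, hits: {hits}, hit rate: {hits / accesses:.3f}")
```

Under these assumptions only 125 of the 1000 accesses miss, one per line, and the other 875 hit because a neighbouring access already brought the line in. That is spatial locality at work. `total` is touched on every pass, so temporal locality keeps it in a register or in the cache throughout.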