TCP/IP in Detail

The previous lesson argued that protocols are organised into layers and that this layering is an act of abstraction. This lesson cashes that idea out concretely. The TCP/IP model — also called the Internet protocol suite — is the actual four-layer stack on which the entire Internet runs, and it is the model OCR expects you to know. Here we walk down its four layers (application, transport, internet, link), watch data being encapsulated as it descends the stack, and study the machinery each layer provides: IP addressing (the IPv4 structure and classes, and the idea of IPv6), ports and sockets at the transport layer, and the all-important choice between TCP and UDP. Along the way we cover the supporting services — DNS, DHCP, NAT and subnetting — that make addressing workable at scale.

The thread running through everything is that each layer adds exactly the addressing its counterpart at the far end needs: the transport layer adds port numbers so the right program receives the data, the internet layer adds IP addresses so the right host receives it, and the link layer adds hardware (MAC) addresses so the right device on the local wire receives it. Hold that thread and the whole stack becomes coherent rather than a list of acronyms.

Spec Mapping

This lesson develops the OCR H446 section 1.3.3 / 1.3.4 material on the TCP/IP stack and addressing, specifically:

The four layers of the TCP/IP model — application, transport, internet (network), link — and the distinct job each performs.
Encapsulation: how a unit of data gains a header at each layer on the way down, and is unwrapped on the way up; the names of the data units (segment, packet, frame).
IP addressing: the 32-bit IPv4 structure (dotted-decimal octets), the historic address classes and the network/host split, and the motivation for and shape of 128-bit IPv6.
Ports and sockets: how a 16-bit port number identifies a service, and how an IP-address-plus-port socket identifies one endpoint of a connection.
TCP vs UDP: connection-oriented reliable delivery versus connectionless best-effort delivery, and when each is the right choice.
The supporting services DNS (name resolution), DHCP (automatic configuration), NAT (address sharing) and subnetting (dividing an address range).

Examiners reward candidates who can place a job at the correct layer and who can justify a TCP-versus-UDP choice from the needs of an application, rather than reciting a feature table.

The TCP/IP Four-Layer Model

The Internet does not run on one monolithic protocol; it runs on a stack of four layers, each solving one part of the problem and each speaking only to the layers immediately above and below it. From the user's program at the top to the physical wire at the bottom:

graph TD
    A["Application layer<br/>HTTP, HTTPS, FTP, SMTP, DNS<br/>— what the user's program speaks"]
    T["Transport layer<br/>TCP or UDP<br/>— program-to-program delivery, ports"]
    I["Internet layer<br/>IP, ICMP<br/>— host-to-host addressing & routing"]
    L["Link layer<br/>Ethernet, Wi-Fi (802.11)<br/>— bits onto the local medium, MAC addresses"]
    A --> T --> I --> L

Layer	Core job	Addresses it uses	Example protocols	Data unit
Application	Provide the service the user actually wants — fetch a page, send mail, resolve a name	(none of its own)	HTTP, HTTPS, FTP, SMTP, IMAP, DNS	message / data
Transport	Deliver data to the right program on a host; optionally guarantee reliability and ordering	Port numbers	TCP, UDP	segment (TCP) / datagram (UDP)
Internet	Deliver a packet from one host to another across interconnected networks; choose a route	IP addresses	IP, ICMP	packet
Link	Move bits across one physical hop of the local medium	MAC addresses	Ethernet, Wi-Fi	frame

The crucial mental model is that each layer talks logically to its peer at the other end. Your browser's HTTP layer behaves as though it is conversing directly with the server's HTTP layer; the two TCP layers behave as though they have a private reliable pipe between them; the two IP layers behave as though they are adjacent. In reality every byte travels all the way down your stack, across the physical network (often through many intermediate devices), and all the way up the other stack, but the abstraction of peer layers in conversation is what makes the design comprehensible.

A note on the OSI seven-layer model, which you may meet elsewhere: OSI is a more granular reference model (seven layers, splitting the application work into application/presentation/session and the link work into data-link/physical). TCP/IP is the model the real Internet actually implements, and it is the one OCR sets. Where a question says "the TCP/IP model", give four layers; do not blend the two.

Encapsulation Down the Stack

As data descends the stack on the sending host, each layer wraps what it receives from the layer above inside its own header (the link layer also adds a trailer). This wrapping is called encapsulation, and the header each layer adds carries exactly the information its peer layer at the destination will need to do its job.

graph TD
    subgraph Sender["Sending host — encapsulation (down)"]
        D1["Application data"]
        D2["TCP header + data = SEGMENT"]
        D3["IP header + segment = PACKET"]
        D4["Frame header + packet + trailer = FRAME"]
        D1 -->|add port numbers| D2 -->|add IP addresses| D3 -->|add MAC addresses + checksum| D4
    end
    D4 -->|bits on the wire| R["Receiving host:<br/>de-encapsulation (up) —<br/>each layer strips its own header<br/>and passes the rest upward"]

Walking the layers on the way down:

The application hands its data (say an HTTP request) to the transport layer.
The transport layer (TCP here) prepends a header containing the source and destination port numbers, a sequence number and a checksum. The result is a segment.
The internet layer prepends an IP header containing the source and destination IP addresses and the TTL. The result is a packet.
The link layer prepends a frame header with source and destination MAC addresses and appends a trailer holding an error-check value. The result is a frame, which is finally transmitted as bits.

At the destination the process runs in reverse — de-encapsulation: the link layer checks the frame, strips its header/trailer and passes the packet up; the internet layer checks the IP header, strips it and passes the segment up; the transport layer checks ports and ordering, strips its header and passes the data to the right application. Each layer reads only its own header and never inspects the layers above it — that is encapsulation enforcing the layered abstraction. (This same packet structure of header/payload/trailer, and what happens to a packet as it is routed, is the subject of the next lesson, Packet Switching.)
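The wrapping and unwrapping described above can be sketched in a few lines of Python. This is a toy model, not a real protocol implementation: the class names, the `encapsulate` and `de_encapsulate` functions, the MAC addresses and the port numbers are all invented for illustration, and the trailer check is simply a CRC-32 of the application data.

```python
import zlib
from dataclasses import dataclass


@dataclass
class Segment:            # transport layer: ports + data
    src_port: int
    dst_port: int
    payload: bytes


@dataclass
class Packet:             # internet layer: IP addresses + segment
    src_ip: str
    dst_ip: str
    payload: Segment


@dataclass
class Frame:              # link layer: MAC addresses + packet + trailer
    src_mac: str
    dst_mac: str
    payload: Packet
    check: int            # toy error-check value held in the trailer


def encapsulate(data: bytes) -> Frame:
    """Wrap application data layer by layer on the way down the stack."""
    segment = Segment(src_port=51322, dst_port=80, payload=data)
    packet = Packet(src_ip="192.168.1.100", dst_ip="93.184.216.34", payload=segment)
    return Frame(
        src_mac="aa:bb:cc:00:00:01",   # hypothetical hardware addresses
        dst_mac="aa:bb:cc:00:00:02",
        payload=packet,
        check=zlib.crc32(data),
    )


def de_encapsulate(frame: Frame) -> bytes:
    """Strip one header per layer on the way up, checking the trailer first."""
    packet = frame.payload             # link layer hands the packet up
    segment = packet.payload           # internet layer hands the segment up
    data = segment.payload             # transport layer hands the data up
    if zlib.crc32(data) != frame.check:
        raise ValueError("frame failed its error check")
    return data


if __name__ == "__main__":
    frame = encapsulate(b"GET / HTTP/1.1")
    print(frame)
    print(de_encapsulate(frame))
```

Notice that each class holds only the addressing its own layer cares about; everything from the layers above is carried as an opaque `payload`.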

IP Addressing

The internet layer identifies hosts using IP addresses. An IP address is a numeric label assigned to every interface on an IP network so that packets can be routed to it. There are two versions in use: the long-established IPv4 and its successor IPv6.

IPv4 structure

IPv4 uses a 32-bit address, conventionally written as four 8-bit octets in dotted-decimal notation — for example 192.168.1.100. Each octet is a value 0–255 (because 8 bits gives $2^{8}=256$ values), so the whole space runs from 0.0.0.0 to 255.255.255.255.

Property	Value
Total bits	32 (four octets of 8 bits)
Notation	Dotted-decimal, e.g. `172.16.254.1`
Range per octet	0 to 255
Total addresses	$2^{32}\approx 4.3$ billion
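To make the "four octets in one 32-bit number" idea concrete, here is a minimal Python sketch that converts between dotted-decimal notation and the underlying integer. The function names `ipv4_to_int` and `int_to_ipv4` are made up for this example.

```python
def ipv4_to_int(address: str) -> int:
    """Pack four dotted-decimal octets into a single 32-bit integer."""
    octets = [int(part) for part in address.split(".")]
    if len(octets) != 4 or any(not 0 <= octet <= 255 for octet in octets):
        raise ValueError(f"not a valid IPv4 address: {address!r}")
    value = 0
    for octet in octets:
        value = (value << 8) | octet   # shift left 8 bits, then add the next octet
    return value


def int_to_ipv4(value: int) -> str:
    """Unpack a 32-bit integer back into dotted-decimal notation."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


if __name__ == "__main__":
    n = ipv4_to_int("192.168.1.100")
    print(n, format(n, "032b"))        # the integer and its 32 bits
    print(int_to_ipv4(n))
```

The largest address, 255.255.255.255, maps to 2^32 - 1, which is why the address space holds exactly 2^32 values.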

The size of the address space is a calculation you should be able to do: with 32 bits the number of distinct addresses is

$$2^{32} = 4\,294\,967\,296 \approx 4.3 \times 10^{9}$$

That sounded inexhaustible in the 1980s but is far smaller than the number of Internet-connected devices today, which is the headline reason IPv6 exists and why workarounds such as NAT (below) became universal.

Every IP address splits into a network part (which network the host is on) and a host part (which host within that network). Routers use the network part to decide where to send a packet, and only the final network cares about the host part.
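The network/host split is just a bitwise operation on the 32-bit address. The sketch below assumes the caller says how many leading bits belong to the network part; the function name `split_network_host` and the parameter `network_bits` are illustrative, not standard terminology.

```python
import ipaddress


def split_network_host(address: str, network_bits: int):
    """Return (network address, host number) for a given network/host boundary."""
    if not 0 <= network_bits <= 32:
        raise ValueError("network_bits must be between 0 and 32")
    value = int(ipaddress.IPv4Address(address))
    host_bits = 32 - network_bits
    mask = (0xFFFFFFFF << host_bits) & 0xFFFFFFFF   # 1s over the network part
    network = value & mask                          # keep only the network bits
    host = value & ~mask & 0xFFFFFFFF               # keep only the host bits
    return str(ipaddress.IPv4Address(network)), host


if __name__ == "__main__":
    print(split_network_host("192.168.1.100", 24))
    print(split_network_host("192.168.1.100", 16))
```

With 24 network bits the address 192.168.1.100 is host 100 on network 192.168.1.0; moving the boundary to 16 bits makes the same address host 356 on network 192.168.0.0.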

IPv4 address classes

Historically, the boundary between the network and host parts was set by address classes, identified by the leading bits / first octet:

Class	First octet range	Default split (network / host)	Intended use
A	1–126	8 bits network / 24 host	Very large networks (≈16 million hosts each)
B	128–191	16 / 16	Medium networks (≈65 000 hosts each)
C	192–223	24 / 8	Small networks (254 usable hosts each)
D	224–239	(multicast)	Multicast groups
E	240–255	(reserved)	Experimental / reserved

Classful addressing was wasteful — an organisation needing 300 hosts had to take a whole Class B of 65 000, stranding the rest — so it was replaced in practice by classless addressing (CIDR), where the network/host boundary can fall at any bit position and is written as a slash-suffix, e.g. /24. You should know the classes as background but understand that modern allocation is classless. Note also the private address ranges (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16) reserved for internal networks and never routed on the public Internet — these are the addresses NAT translates.
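As a rough illustration, the Python sketch below looks up the historic class from the first octet and then uses the standard-library `ipaddress` module to show how classless prefixes size a network. The helper names `address_class` and `usable_hosts` are invented here, and "usable" assumes the conventional subtraction of the network and broadcast addresses.

```python
import ipaddress


def address_class(address: str) -> str:
    """Return the historic class (A to E) implied by the first octet."""
    first_octet = int(address.split(".")[0])
    if 1 <= first_octet <= 126:
        return "A"
    if 128 <= first_octet <= 191:
        return "B"
    if 192 <= first_octet <= 223:
        return "C"
    if 224 <= first_octet <= 239:
        return "D"
    if 240 <= first_octet <= 255:
        return "E"
    return "reserved"                  # first octet 0 or 127


def usable_hosts(cidr: str) -> int:
    """Hosts available in a CIDR block, excluding network and broadcast addresses."""
    return ipaddress.ip_network(cidr).num_addresses - 2


if __name__ == "__main__":
    print(address_class("172.16.254.1"))                    # first octet 172
    print(usable_hosts("192.168.1.0/24"))                   # a Class C sized block
    print(usable_hosts("10.0.0.0/23"))                      # one bit larger
    print(ipaddress.ip_address("192.168.1.100").is_private)
```

A /23 block provides 510 usable host addresses, so the organisation needing 300 hosts can be given a block of about the right size instead of a whole Class B.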

The idea of IPv6

IPv6 was designed chiefly to end IPv4 address exhaustion. It uses a 128-bit address — written as eight groups of four hexadecimal digits separated by colons, e.g. 2001:0db8:85a3:0000:0000:8a2e:0370:7334. The address space is staggering:

$$2^{128} \approx 3.4 \times 10^{38}$$

That is enough to give every grain of sand on Earth its own address many times over, removing scarcity as a design constraint entirely. IPv6 also allows convenient shorthand: leading zeros in a group may be dropped, and one run of all-zero groups may be replaced by :: (so the address above shortens to 2001:db8:85a3::8a2e:370:7334). Beyond sheer size, IPv6 brings a simpler fixed-length header (faster for routers to process), built-in support for autoconfiguration, and support for the IPsec security extensions (mandatory in the original IPv6 node requirements, later relaxed to a recommendation).
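The shorthand rules can be checked with Python's standard-library `ipaddress` module, whose `compressed` and `exploded` properties apply and undo them. The wrapper names `shorten_ipv6` and `expand_ipv6` are just labels for this sketch.

```python
import ipaddress


def shorten_ipv6(address: str) -> str:
    """Drop leading zeros and collapse one run of all-zero groups to '::'."""
    return ipaddress.IPv6Address(address).compressed


def expand_ipv6(address: str) -> str:
    """Restore the full eight groups of four hexadecimal digits."""
    return ipaddress.IPv6Address(address).exploded


if __name__ == "__main__":
    full = "2001:0db8:85a3:0000:0000:8a2e:0370:7334"
    print(shorten_ipv6(full))
    print(expand_ipv6("2001:db8:85a3::8a2e:370:7334"))
    print(2 ** 128)                    # size of the IPv6 address space
```

Both forms name the same 128-bit address; only the written representation changes.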

Feature	IPv4	IPv6
Address length	32 bits	128 bits
Notation	Dotted-decimal	Hexadecimal, colon-separated
Address space	$\approx 4.3\times 10^{9}$	$\approx 3.4\times 10^{38}$
Header	Variable length, more fields	Fixed length, streamlined
Autoconfiguration	Via DHCP	Built in (SLAAC) plus DHCPv6
Need for NAT	Heavy, to conserve addresses	Largely unnecessary

For the exam, the key points are: why IPv6 exists (IPv4 exhaustion), its size (128 bits, hence an astronomically larger space), and its hexadecimal colon notation. You are not expected to do IPv6 arithmetic.

Ports and Sockets

An IP address gets a packet to the right host, but a single host runs many network programs at once — a web server, a mail server, an SSH daemon — and the transport layer must deliver each segment to the correct one. It does this with port numbers.

A port is a 16-bit number (0–65535) identifying a particular service or connection endpoint on a host. Ports are grouped into three ranges:

Range	Name	Use
0–1023	Well-known ports	Standard services: HTTP 80, HTTPS 443, FTP 20/21, SSH 22, SMTP 25, DNS 53
1024–49151	Registered ports	Specific applications, e.g. MySQL 3306
49152–65535	Dynamic / ephemeral ports	Temporary ports the OS allocates to the client end of a connection
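The three ranges in the table amount to a simple lookup. Here is a small Python sketch; the function name `port_range` and the returned labels are chosen for this example.

```python
def port_range(port: int) -> str:
    """Classify a 16-bit port number into one of the three ranges in the table."""
    if not 0 <= port <= 65535:
        raise ValueError("a port must fit in 16 bits (0 to 65535)")
    if port <= 1023:
        return "well-known"
    if port <= 49151:
        return "registered"
    return "dynamic/ephemeral"


if __name__ == "__main__":
    for port in (80, 443, 3306, 51322):
        print(port, port_range(port))
```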

A socket is the combination of an IP address and a port number, written address:port (for example 192.168.1.100:443). A socket identifies one endpoint of a connection. A full TCP connection is therefore identified by a pair of sockets — the client socket and the server socket — which is why one server on port 443 can hold thousands of simultaneous connections: each connection is distinguished by the client's unique IP-and-ephemeral-port combination, even though the server end is identical.
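The pair-of-sockets idea can be observed directly with Python's standard `socket` module. The sketch below opens a TCP connection from a program to itself over the loopback address, so it needs no external network; the function name `loopback_connection_sockets` and the dictionary keys are invented for this example, and asking for port 0 is the usual way to let the operating system pick a free port.

```python
import socket


def loopback_connection_sockets() -> dict:
    """Open one TCP connection over loopback and report both endpoint sockets."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))          # port 0: let the OS choose a free port
    server.listen(1)
    server_address = server.getsockname()  # (IP, port) the server listens on

    client = socket.create_connection(server_address)
    connection, peer = server.accept()     # peer is the client's (IP, port)

    result = {
        "server_socket": connection.getsockname(),
        "client_socket": client.getsockname(),
        "peer_seen_by_server": peer,
    }
    for sock in (client, connection, server):
        sock.close()
    return result


if __name__ == "__main__":
    for name, value in loopback_connection_sockets().items():
        print(name, value)
```

The client's port is chosen by the operating system, and the exact range it draws from varies between systems. What matters is that the server sees the client's IP-and-port pair as the other half of the connection.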

graph LR
    C["Client<br/>203.0.113.7 : 51322<br/>(ephemeral port)"] -->|"connection = pair of sockets"| S["Server<br/>93.184.216.34 : 443<br/>(well-known port for HTTPS)"]

This is exactly why the application protocols in the previous lesson each had a "default port" — port 80 is the door behind which a web server conventionally listens, port 25 the door for mail, and so on. The IP address finds the building; the port number finds the right door within it.

TCP versus UDP

Both TCP and UDP live at the transport layer, and an application chooses one of them. They embody opposite trade-offs: TCP buys reliability at the cost of overhead and latency; UDP buys speed and simplicity at the cost of guarantees.

TCP — Transmission Control Protocol

TCP is connection-oriented and reliable. Before any data flows, the two ends perform a three-way handshake to establish the connection:

sequenceDiagram
    participant C as Client
    participant S as Server
    C->>S: SYN (synchronise — let's connect)
    S->>C: SYN-ACK (acknowledge + my own synchronise)
    C->>S: ACK (acknowledged — connection open)
    Note over C,S: Reliable, ordered data transfer begins

Once connected, TCP guarantees delivery and order: every segment carries a sequence number so the receiver can reassemble data in the right order and detect gaps; the receiver returns acknowledgements (ACKs) for data it has received; anything unacknowledged within a timeout is retransmitted. TCP also performs flow control (slowing the sender if the receiver's buffer is filling) and congestion control (backing off when the network is overloaded). All this machinery means TCP has more overhead and higher latency than UDP — but the application can simply assume a clean, ordered, complete byte stream.
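The role of sequence numbers and acknowledgements can be sketched with a toy receiver. This is a simplified model under stated assumptions, not real TCP: sequence numbers count bytes from 0 (real TCP starts from a randomly chosen initial value), segments never partially overlap, and the class name `ToyReceiver` is hypothetical.

```python
class ToyReceiver:
    """Reassembles segments in order and returns a cumulative acknowledgement."""

    def __init__(self) -> None:
        self.next_expected = 0          # sequence number of the next byte wanted
        self.buffer = {}                # out-of-order segments, keyed by sequence number
        self.delivered = bytearray()    # the clean, ordered byte stream

    def receive(self, seq: int, data: bytes) -> int:
        """Accept one segment; return the ACK (next byte expected)."""
        if seq >= self.next_expected:   # anything earlier is a duplicate
            self.buffer[seq] = data
        # Deliver every segment that is now contiguous with what we have.
        while self.next_expected in self.buffer:
            chunk = self.buffer.pop(self.next_expected)
            self.delivered += chunk
            self.next_expected += len(chunk)
        return self.next_expected


if __name__ == "__main__":
    receiver = ToyReceiver()
    print(receiver.receive(0, b"HELL"))   # in order
    print(receiver.receive(8, b"RLD"))    # arrives early: gap at byte 4
    print(receiver.receive(4, b"O WO"))   # gap filled, buffered data released
    print(bytes(receiver.delivered))
```

When the third segment arrives before the second, the acknowledgement stays at 4, which tells the sender that byte 4 onwards is still missing. A sender that sees no progress within its timeout would retransmit from there.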

UDP — User Datagram Protocol

UDP is connectionless and best-effort. There is no handshake, no acknowledgement, no retransmission, no ordering and no flow control — UDP simply wraps the data in a tiny 8-byte header (source/destination ports, length, checksum) and sends it. Datagrams may be lost, duplicated or arrive out of order, and UDP will not notice or care. In return it is fast, has minimal overhead, and adds no setup delay — ideal where speed matters more than perfection, or where the application handles reliability itself.
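To show how small the UDP header is, the sketch below packs and unpacks the four 16-bit fields with Python's `struct` module. The function names are invented for this example, and the checksum is left at 0 rather than computed (over IPv4, a zero checksum field signals that none was calculated).

```python
import struct

UDP_HEADER = struct.Struct("!HHHH")   # four unsigned 16-bit fields, network byte order


def build_udp_datagram(src_port: int, dst_port: int, payload: bytes) -> bytes:
    """Prepend an 8-byte UDP header (checksum left as 0) to the payload."""
    length = UDP_HEADER.size + len(payload)   # length covers header plus data
    header = UDP_HEADER.pack(src_port, dst_port, length, 0)
    return header + payload


def parse_udp_datagram(datagram: bytes):
    """Return (src_port, dst_port, length, checksum, payload)."""
    src_port, dst_port, length, checksum = UDP_HEADER.unpack(datagram[:UDP_HEADER.size])
    return src_port, dst_port, length, checksum, datagram[UDP_HEADER.size:length]


if __name__ == "__main__":
    datagram = build_udp_datagram(51322, 53, b"example query")
    print(len(datagram), datagram[:8].hex())
    print(parse_udp_datagram(datagram))
```

There is no sequence number or acknowledgement field anywhere in the header, which is why UDP cannot detect loss or reordering on its own.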

Comparison and when to use each

Feature	TCP	UDP
Connection	Connection-oriented (3-way handshake)	Connectionless (no handshake)
Reliability	Guaranteed — lost segments retransmitted	Best-effort — no retransmission
Ordering	Yes — sequence numbers reorder data	No ordering
Flow / congestion control	Yes	No
Header size	20 bytes minimum	8 bytes
Speed / overhead	Slower, more overhead	Faster, minimal overhead
Setup delay	Yes (handshake first)	None
Typical uses	Web (HTTP/HTTPS), email (SMTP), file transfer (FTP), SSH	DNS lookups, live voice/video (VoIP, streaming), online gaming
