Protocols in Detail

A protocol is a set of agreed rules governing how data is exchanged between devices. At A-Level you must know not merely the names of the key protocols but how they work, what messages they exchange, on which transport (TCP or UDP) they run, and — crucially — why standards matter so that equipment and software from different vendors interoperate. This lesson examines the principal application-layer protocols in depth and revisits the underlying TCP versus UDP choice and the request/response model that so many of them share.

Each protocol exists to solve one specific communication problem — fetching a web page, moving a file, sending mail, looking up a name, configuring a device — and each defines its own vocabulary of messages and its own expected responses. As you read, try to fix three facts in mind for every protocol: what job it does, which port it conventionally uses, and which transport (TCP or UDP) it runs over and why. Those three facts are exactly what short-answer questions tend to probe, and the "why" in the third is where the higher marks sit.

Spec Mapping

This lesson addresses the Communication and networking area of the AQA A-Level Computer Science (7517) specification, specifically network protocols and standards. It covers why standards exist and the benefits of adopting them; the application-layer protocols HTTP, HTTPS, FTP, SMTP, POP3 and IMAP; the lower-level transport choice between TCP and UDP; and the request/response model of client–server communication. It builds on the TCP/IP protocol stack content (locating each protocol at its layer) and links to encryption (HTTPS/TLS) and web technologies.

Why Standards Matter

A standard is a published, agreed specification that everyone implements the same way. Protocols are standards. Their importance is frequently examined, so be ready to explain it:

Interoperability: a web browser made by one company can talk to a web server made by another because both implement HTTP identically. Without an agreed standard, every vendor pair would need a custom adaptor.
Vendor independence: organisations are not locked into a single manufacturer; equipment can be mixed and replaced.
Innovation and competition: many companies can build products on a common foundation, driving down prices and improving quality.
Longevity and scale: the Internet works because billions of independently built devices all obey the same protocols.

Standards are maintained by bodies such as the IETF (which publishes protocol specifications as RFCs) and the W3C (web standards). You do not need to memorise the bodies, but you should grasp that protocols are open, published rule-sets — not the private property of one company.

A short thought experiment shows why this matters. Imagine each manufacturer invented its own secret web protocol: a browser from one company could only talk to web servers from the same company, the web would fragment into incompatible islands, and competition would collapse into a few walled gardens. Open standards prevent exactly this — they are the reason a thirty-year-old idea (HTTP) still works across billions of independently built devices, and why a new device manufacturer can join the internet simply by implementing the published protocols. Standards are, in a real sense, the social contract that makes a global network possible.

The Request/Response Model

Most application protocols here follow a request/response pattern: a client sends a request, a server processes it and returns a response.

sequenceDiagram
    participant C as Client
    participant S as Server
    C->>S: Request (e.g. GET /index.html)
    S->>S: Process the request
    S-->>C: Response (e.g. 200 OK + content)

This simple model underlies HTTP, FTP commands, and the query/response of DNS. Keeping it in mind makes each protocol below easier to reason about.

The model has important consequences. First, it is asymmetric: the client initiates and the server responds. A server does not normally push unsolicited data to a client (techniques like long-polling and WebSockets were invented precisely to work around this). Second, it pairs naturally with the client-server architecture studied elsewhere in this area: the protocols here are the application-layer realisation of that architecture, with the server holding the resource and many clients requesting it. Third, the request and response are usually formatted as text (in HTTP, SMTP and FTP commands), so they are human-readable and easy to debug: you can literally read an HTTP exchange. Holding these three observations in mind turns a list of protocols into a coherent family that all behave in recognisably similar ways.
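The pattern can be seen in a few lines of Python. This is a minimal sketch, not a real protocol: the one-line request format and the names serve_one and demo are invented for illustration, and socket.socketpair() stands in for a network connection so that the example runs on one machine.

```python
import socket


def serve_one(server_sock: socket.socket) -> None:
    """Server side: wait for one request, then send back one response."""
    request = server_sock.recv(1024).decode("ascii")
    if request.startswith("GET "):
        response = "200 OK\r\n\r\nhello"
    else:
        response = "400 Bad Request\r\n\r\n"
    server_sock.sendall(response.encode("ascii"))


def demo() -> str:
    """Client side: send a request first, then read the server's reply."""
    # A connected pair of sockets stands in for client and server machines.
    client, server = socket.socketpair()
    try:
        client.sendall(b"GET /index.html\r\n")  # the client initiates
        serve_one(server)                        # the server only responds
        return client.recv(1024).decode("ascii")
    finally:
        client.close()
        server.close()
```

Notice that serve_one never sends anything until a request has arrived, which is the asymmetry described above.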

HTTP and HTTPS

HTTP (HyperText Transfer Protocol)

HTTP requests and delivers web pages and resources (images, scripts, stylesheets). It runs over TCP on port 80. A web page must arrive complete and in order: a missing chunk of HTML or a corrupted image would break the page, so reliable TCP is the right transport and best-effort UDP would be unacceptable here.

A request/response trace:

// 1. Client opens a TCP connection to the server on port 80,
//    then sends an HTTP request:
GET /index.html HTTP/1.1
Host: www.example.com
Accept: text/html

// 2. Server processes the request and replies with an HTTP response:
HTTP/1.1 200 OK
Content-Type: text/html
Content-Length: 142

<html> ... page content ... </html>

// 3. Connection is reused for further resources (HTTP/1.1 keep-alive)
//    or closed.

Common HTTP methods:

Method	Purpose
GET	Retrieve a resource
POST	Submit data to the server (e.g. a form)
PUT	Update/replace a resource
DELETE	Remove a resource

Common HTTP status codes:

Code	Meaning
200	OK — request succeeded
301	Moved Permanently — resource has a new URL
403	Forbidden — access denied
404	Not Found — resource does not exist
500	Internal Server Error

The first digit groups the codes: 1xx informational, 2xx success, 3xx redirection, 4xx client error (the request was wrong), 5xx server error (the server failed). Knowing the family lets you reason about an unfamiliar code: a 418 must be a client-side issue, a 503 a server-side one. This is more useful in the exam than rote-memorising individual numbers.
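The "reason from the first digit" rule is simple enough to write down as code. The sketch below is illustrative only; status_family is a hypothetical helper name, not part of any library.

```python
def status_family(code: int) -> str:
    """Return the family of an HTTP status code from its first digit."""
    families = {
        1: "informational",
        2: "success",
        3: "redirection",
        4: "client error",
        5: "server error",
    }
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid HTTP status code: {code}")
    return families[code // 100]  # integer division keeps the first digit
```

For example, status_family(418) gives "client error" and status_family(503) gives "server error", without either code needing to be memorised.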

Anatomy of the exchange. An HTTP message has a start line, a set of headers (one per line, Name: value), a blank line, then an optional body. In the request above, GET /index.html HTTP/1.1 is the start line, Host: and Accept: are headers, and a GET has no body. In the response, HTTP/1.1 200 OK is the start line, Content-Type and Content-Length are headers, and the HTML is the body. A POST request does carry a body, typically the form data being submitted. That is why POST is used for sensitive or large submissions: the data stays out of the URL, where GET would put it and where it could end up in browser history and server logs. Note that a POST body is still sent in plaintext unless the connection uses HTTPS.
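Because the layout is so regular, splitting a message into its parts takes only a few lines. The following is a simplified sketch (parse_http_message is a hypothetical name, and real HTTP parsers handle many edge cases this one ignores); it assumes the standard CRLF line endings.

```python
def parse_http_message(raw: str):
    """Split an HTTP message into (start_line, headers, body)."""
    # The first blank line separates the head from the optional body.
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    start_line = lines[0]
    headers = {}
    for line in lines[1:]:
        # Each header is "Name: value"; split on the first colon only.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body


response = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html\r\n"
    "Content-Length: 13\r\n"
    "\r\n"
    "<html></html>"
)
start_line, headers, body = parse_http_message(response)
```

Here start_line holds the status line, headers is a dictionary keyed by lower-cased header name, and body holds the HTML.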

Stateless protocol. HTTP is stateless: each request is independent and the server remembers nothing about previous requests by default. This keeps servers simple and scalable, but means applications needing to "remember" a logged-in user must add state themselves — typically with cookies (a small token the browser stores and returns on each request) or session identifiers. This statelessness is a frequently examined property: it is why cookies exist.
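A toy model makes the role of the cookie concrete. In this sketch the server function is given nothing but the current request; the only way it can recognise a returning user is the session token the browser sends back. The names handle_request and SESSIONS, the /login and /inbox paths, and the use of the request body as a username are all assumptions made for illustration.

```python
import secrets

# Server-side session table: session id -> username.
SESSIONS = {}


def handle_request(method, path, cookies, body=""):
    """Handle one request using only what that request carries.

    Returns (response_text, cookies_to_set).
    """
    if method == "POST" and path == "/login":
        session_id = secrets.token_hex(16)  # unguessable random token
        SESSIONS[session_id] = body         # body holds the username here
        return "200 OK: logged in", {"session": session_id}

    # Any other request: identify the user from the cookie, if there is one.
    user = SESSIONS.get(cookies.get("session", ""))
    if user is None:
        return "403 Forbidden: not logged in", {}
    return f"200 OK: hello {user}", {}
```

A request for /inbox with no cookie is refused even immediately after a successful login, because HTTP itself carries no memory from one request to the next; the same request with the cookie attached succeeds.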

HTTPS (HTTP Secure)

HTTPS is HTTP carried over an encrypted TLS (Transport Layer Security, successor to SSL) connection, running on port 443.

All data between client and server is encrypted, defeating eavesdropping (packet sniffing).
The server's identity is verified by a digital certificate issued by a trusted Certificate Authority (CA).
It prevents man-in-the-middle tampering — an attacker can neither read nor silently alter the data.
The browser shows a padlock when HTTPS is active.

Feature	HTTP	HTTPS
Port	80	443
Encryption	None (plaintext)	TLS-encrypted
Server authentication	None	Digital certificate (CA)
Protects against	—	Eavesdropping, tampering, MITM

How the secure channel is set up (the TLS handshake, in outline):

1. Client connects and requests a secure session.
2. Server sends its DIGITAL CERTIFICATE, containing its public key,
   signed by a trusted Certificate Authority (CA).
3. Client verifies the certificate using the CA's public key (built
   into the browser). If invalid or expired -> security warning.
4. Using the server's public key, the two sides securely agree a shared
   SYMMETRIC session key (asymmetric crypto is used only for this step).
5. The rest of the session is encrypted with the fast symmetric key.

This is a neat real-world application of hybrid encryption: slow asymmetric cryptography is used briefly to agree a key and authenticate the server, then fast symmetric cryptography protects the bulk of the data. The certificate step is what defeats a man-in-the-middle: an attacker cannot forge a certificate the CA has signed, so the browser will not trust an impostor server. This directly links the protocol world to the encryption and digital-signatures content.
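In practice a programmer rarely implements the handshake by hand; a library does it. The sketch below uses Python's standard ssl module to show where the certificate check sits. The function names are hypothetical, and fetch_tls_version needs a live network connection, so it is defined here but not called.

```python
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    """Build client settings that verify the server's certificate."""
    # create_default_context() loads the system's trusted CA certificates
    # and switches on both certificate and hostname checking.
    return ssl.create_default_context()


def fetch_tls_version(host: str, port: int = 443) -> str:
    """Open a TCP connection, run the TLS handshake, report the version."""
    context = make_client_context()
    with socket.create_connection((host, port), timeout=5) as tcp_sock:
        # wrap_socket performs the handshake; it raises an ssl.SSLError
        # subclass if the certificate is invalid, expired, or for another host.
        with context.wrap_socket(tcp_sock, server_hostname=host) as tls_sock:
            return tls_sock.version()
```

The exception raised on a bad certificate is the programmatic equivalent of the browser's security warning.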

Putting It Together: What Really Happens When You Visit a Website

A single page load exercises several protocols in concert — a synoptic favourite. Here is the end-to-end sequence when you type https://www.example.com and press Enter:

1. DNS  : Browser resolves www.example.com to an IP address (UDP, port 53),
          checking caches first, then asking resolver/root/TLD/authoritative.
2. TCP  : Browser opens a TCP connection to that IP on port 443
          (three-way handshake: SYN, SYN-ACK, ACK).
3. TLS  : Client and server perform the TLS handshake -- certificate check,
          agree a symmetric session key. Channel is now encrypted.
4. HTTP : Browser sends "GET / HTTP/1.1" inside the encrypted channel.
5. HTTP : Server returns "200 OK" plus the HTML.
6. HTTP : Browser parses the HTML, then issues further requests (often over
          the same connection) for CSS, JavaScript and images.
7. RENDER: Browser renders the page once enough resources have arrived.

Notice how each protocol does exactly its own job: DNS turns the name into an address, TCP provides a reliable connection, TLS secures it, and HTTP carries the actual request and response. If any step fails you get a recognisable symptom — a DNS failure gives "server not found", a refused TCP connection gives "can't connect", a bad certificate gives a security warning, and a 404 from HTTP gives "page not found". Being able to narrate this sequence, naming the protocol and transport at each step, is precisely the kind of integrated understanding that scores at the top band.
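As a revision aid, the sequence can be generated from the URL itself. This sketch only describes the steps as text and performs no network activity; plan_page_load and DEFAULT_PORTS are hypothetical names introduced for illustration.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def plan_page_load(url: str) -> list:
    """List the protocol steps a browser would take to fetch a URL."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    path = parts.path or "/"  # an empty path means the site root
    steps = [
        f"DNS: resolve {parts.hostname} (UDP, port 53)",
        f"TCP: connect to port {port} (three-way handshake)",
    ]
    if parts.scheme == "https":
        steps.append("TLS: verify certificate, agree session key")
    steps.append(f"HTTP: GET {path} HTTP/1.1")
    return steps
```

Calling plan_page_load("https://www.example.com") yields four steps ending in a GET for "/", while a plain http:// URL yields three, because the TLS step is skipped and the port becomes 80.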

FTP (File Transfer Protocol)

FTP transfers files between a client and server over TCP, using two connections:

Connection	Port	Purpose
Control	21	Carries commands (login, directory listing, file operations)
Data	20 (or negotiated)	Carries the actual file contents

Key points:

Classic FTP sends credentials and data in plaintext — insecure on untrusted networks.
Secure alternatives are SFTP (file transfer over SSH) and FTPS (FTP over TLS).
FTP supports ASCII and binary transfer modes.

Why two connections? The separation lets commands flow on the control connection even while a large file is mid-transfer on the data connection — for instance, a client can send an "abort" command without waiting for the transfer to finish. The downside is that the dynamically negotiated data connection complicates firewalls and NAT, which is one reason single-connection secure alternatives like SFTP have largely supplanted classic FTP. The plaintext weakness is the headline exam point: anyone capturing the packets sees the username, password and file contents, which is unacceptable over the public internet and the direct motivation for SFTP/FTPS.
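Python's standard ftplib module hides the two connections behind a simple interface. The sketch below is illustrative: download_file and its parameters are hypothetical names, and the function needs a real FTP server, so it is defined here but not called.

```python
from ftplib import FTP, FTP_TLS


def download_file(host, user, password, remote_name, local_path, use_tls=True):
    """Fetch one file in binary mode, over FTPS by default."""
    ftp_class = FTP_TLS if use_tls else FTP
    # Creating the object opens the control connection (port 21 by default).
    with ftp_class(host) as ftp:
        ftp.login(user, password)  # plain FTP sends these in plaintext
        if use_tls:
            ftp.prot_p()           # ask for the data connection to be encrypted too
        with open(local_path, "wb") as out:
            # RETR is sent on the control connection; the file contents
            # arrive on a separate data connection.
            ftp.retrbinary(f"RETR {remote_name}", out.write)
```

With use_tls=False the same code runs as classic FTP, and anyone capturing the packets would see the username, password and file contents.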

Email Protocols: SMTP, POP3, IMAP

Email uses different protocols for sending and retrieving — a distinction examiners love to test.

SMTP (Simple Mail Transfer Protocol) — Sending

SMTP sends mail from a client to its mail server, and between mail servers. It runs over TCP on port 25 (or 587 for authenticated client submission).

1. User composes a message and the client submits it to its outgoing SMTP server.
2. The SMTP server reads the recipient's domain and looks up that domain's
   mail server via DNS (an MX record).
3. SMTP relays the message to the recipient's mail server.
4. The message waits in the recipient's mailbox on that server.
5. The recipient later RETRIEVES it using POP3 or IMAP (not SMTP).
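Step 1 of the sequence above can be sketched with Python's standard smtplib and email modules. The function names and the example addresses are assumptions for illustration; submit needs a real mail server and account, so it is defined but not called.

```python
import smtplib
from email.message import EmailMessage


def build_message(sender, recipient, subject, body):
    """Compose the message the client will hand to its SMTP server."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def recipient_domain(msg):
    """The part after '@': the domain whose MX record the server looks up."""
    return msg["To"].rsplit("@", 1)[1]


def submit(msg, server, username, password):
    """Submit the message to the outgoing server on port 587."""
    with smtplib.SMTP(server, 587) as smtp:
        smtp.starttls()                  # upgrade the connection to TLS
        smtp.login(username, password)   # authenticated client submission
        smtp.send_message(msg)


message = build_message(
    "alice@example.com", "bob@example.org", "Hello", "Hi Bob, see you at 3."
)
```

Nothing in this code retrieves mail: once submit returns, the client's part in sending is finished and the servers handle the relay.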

POP3 and IMAP — Retrieving

Protocol	Port	Behaviour
POP3	110	Downloads messages to one device and, by default, deletes them from the server. Simple; suits single-device access; offline-friendly.
IMAP	143	Messages stay on the server and are synchronised across many devices, with server-side folders and read/unread state. Suits users on phone + laptop + desktop.
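The difference in behaviour can be modelled without any networking. This is a deliberately simplified simulation rather than either real protocol; Mailbox, pop3_fetch and imap_fetch are hypothetical names, and pop3_fetch models the default download-and-delete behaviour.

```python
class Mailbox:
    """A user's mailbox as stored on the mail server."""

    def __init__(self, messages):
        self.messages = list(messages)


def pop3_fetch(server: Mailbox) -> list:
    """POP3-style: download everything, then delete it from the server."""
    downloaded = list(server.messages)
    server.messages.clear()
    return downloaded


def imap_fetch(server: Mailbox) -> list:
    """IMAP-style: view the messages; the server keeps the originals."""
    return list(server.messages)


# Two devices reading the same account under each model.
imap_box = Mailbox(["msg 1", "msg 2"])
phone_view = imap_fetch(imap_box)
laptop_view = imap_fetch(imap_box)   # sees the same two messages

pop_box = Mailbox(["msg 1", "msg 2"])
first_device = pop3_fetch(pop_box)
second_device = pop3_fetch(pop_box)  # nothing left to download
```

Under the IMAP model both devices see both messages; under the POP3 model the second device finds an empty mailbox.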

The Full Email Journey

It helps to see which protocol carries each leg of an email from Alice to Bob:

flowchart LR
    A["Alice's<br/>mail client"] -- "SMTP (send)" --> AS["Alice's<br/>mail server"]
    AS -- "SMTP (relay)" --> BS["Bob's<br/>mail server"]
    BS -- "POP3 / IMAP<br/>(retrieve)" --> B["Bob's<br/>mail client"]

The key insight: SMTP carries the message outbound and between servers, while POP3 or IMAP carries it the last leg from Bob's server to Bob's client. Email therefore deliberately uses different protocols for the push (sending) and pull (collecting) halves of the trip, because the requirements differ — relaying between always-on servers versus delivering to an intermittently-connected personal device.

Exam Tip: A classic question contrasts POP3 and IMAP. The discriminator is where the mail lives: POP3 typically pulls mail down and removes it from the server (one device, no sync); IMAP keeps mail on the server and synchronises state across devices. Note that SMTP sends; POP3/IMAP retrieve — confusing these is a common error.
