The World Wide Web is the most visible service running on the Internet, and it is built from a small set of cooperating technologies. This lesson explains the three languages that every web page is made from — HTML for structure, CSS for presentation and JavaScript for behaviour — and, crucially, the distinction between client-side processing (run in the user's browser) and server-side processing (run on the web server). It then turns to how search engines make the web usable: how they crawl and index billions of pages, and the PageRank idea that originally ranked results by treating links as votes, which we develop with a little maths.
Two themes recur. The first is separation of concerns: HTML, CSS and JavaScript each do one job, just as the network layers in earlier lessons each did one job — keeping structure, style and behaviour apart makes a site maintainable. The second is where work happens: deciding what runs on the client versus the server is one of the central design choices in web development, with real consequences for speed, security and capability. Hold those two themes and the rest follows.
This lesson addresses the OCR H446 1.3.4 material on web technologies.
Examiners reward candidates who can justify a client-versus-server decision from a scenario, and who can explain PageRank as more than "counts links" — capturing the idea that the importance of the linking page matters.
A modern web page is built from three technologies working together, each responsible for a different aspect of the page. The cleanest way to remember them is by what each one is responsible for:
| Technology | Responsibility | Analogy |
|---|---|---|
| HTML | Structure & content — the headings, paragraphs, lists, images, links | The skeleton and organs |
| CSS | Presentation — colours, fonts, spacing, layout, responsive design | The clothing and appearance |
| JavaScript | Behaviour — interactivity, dynamic updates, responding to the user | The muscles and reflexes |
Keeping these three separate is a deliberate discipline called separation of concerns: the same structural HTML can be restyled entirely by swapping the CSS, and behaviour can be added or changed in the JavaScript without disturbing either. This is the same modularity principle seen throughout computer science — each part has one job and a clean boundary with the others.
HTML (HyperText Markup Language) is a markup language: it uses tags in angle brackets to label the parts of a document so the browser knows what each piece is. An element is usually an opening tag, some content, and a closing tag; tags may carry attributes giving extra information, and elements nest inside one another to form a tree-like structure, which the browser represents in memory as the Document Object Model (DOM).
```html
<!DOCTYPE html>
<html>
  <head>
    <title>My Page</title>
    <link rel="stylesheet" href="styles.css">
  </head>
  <body>
    <h1>Welcome</h1>
    <p>This is a <a href="https://example.com">link</a>.</p>
    <img src="photo.jpg" alt="A description">
  </body>
</html>
```
Note that HTML says nothing about how things look — <h1> means "this is the top-level heading", not "make this big and bold". How it appears is the job of CSS. Modern HTML also uses semantic tags such as <header>, <nav>, <article> and <footer> that describe the meaning of a region, which helps both accessibility tools and — importantly for later — search engines understand the page.
CSS (Cascading Style Sheets) controls how the HTML looks. A CSS rule has a selector (which elements to target) and one or more declarations of a property and a value:
```css
h1 {
  color: navy;
  font-size: 32px;
}

.highlight {
  background-color: yellow;
}
```
Selectors can target elements by tag (h1), by class (.highlight) or by id (#main). The word cascading refers to how, when several rules could apply to one element, specificity and source order decide which wins. Styles can live inline (in a style attribute), internally (in a <style> block in the <head>) or — best practice — in an external .css file linked from the HTML. The external approach is preferred because one stylesheet can format an entire site, the browser caches it so subsequent pages load faster, and a site-wide redesign means editing one file rather than every page. These are the concrete pay-offs of separating presentation from structure.
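The way specificity and source order settle a conflict can be sketched in a few lines of JavaScript. This is a deliberately simplified model, not how a browser is implemented: it assumes every rule already matches the same element, it only understands simple selectors built from a tag name, classes and ids, and the function names (`specificity`, `compareSpecificity`, `winningRule`) are hypothetical.

```javascript
// Simplified model of the cascade: only simple selectors made of a tag name,
// .classes and #ids are handled (an assumption made for illustration).
function specificity(selector) {
  const ids = (selector.match(/#[\w-]+/g) || []).length;
  const classes = (selector.match(/\.[\w-]+/g) || []).length;
  // Whatever is left after stripping ids and classes is the tag name, if any.
  const tag = selector.replace(/[#.][\w-]+/g, "").trim();
  const tags = tag.length > 0 ? 1 : 0;
  return [ids, classes, tags];
}

// Compare two specificity triples left to right: ids, then classes, then tags.
function compareSpecificity(a, b) {
  for (let i = 0; i < 3; i++) {
    if (a[i] !== b[i]) return a[i] - b[i];
  }
  return 0;
}

// Rules are listed in source order; on a tie in specificity the later rule wins.
function winningRule(rules) {
  let winner = rules[0];
  for (const rule of rules.slice(1)) {
    const diff = compareSpecificity(
      specificity(rule.selector),
      specificity(winner.selector)
    );
    if (diff >= 0) winner = rule;
  }
  return winner;
}

// Four rules that are all assumed to match the same element.
const rules = [
  { selector: "h1", color: "navy" },
  { selector: ".highlight", color: "green" },
  { selector: "h1", color: "black" },
  { selector: "#main", color: "red" },
];
console.log(winningRule(rules).color); // the #main rule wins on specificity
```

An id outranks any number of classes, and a class outranks a tag, which is why the comparison goes left to right through the triple rather than adding the counts together.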
CSS is also where responsive design lives — the technique that lets the same HTML adapt its layout to screens of wildly different sizes, from a phone to a wide desktop monitor. Using media queries, a stylesheet can apply different rules depending on the viewport width, for example stacking navigation links vertically on a narrow phone but laying them out in a horizontal bar on a wide screen:
```css
nav { display: flex; flex-direction: row; }   /* wide screens: horizontal */

@media (max-width: 600px) {
  nav { flex-direction: column; }             /* narrow screens: stacked */
}
```
This is the separation principle paying off once more: because layout is expressed in CSS rather than baked into the HTML, one set of structural markup can serve every device, with the presentation re-flowing to suit each — there is no need for a separate "mobile site" with duplicated content to keep in sync.
JavaScript is a full programming language that runs in the browser to make pages interactive and dynamic. Where HTML is static structure and CSS is static style, JavaScript responds to events and changes the page after it has loaded — validating a form before it is sent, updating part of a page without a full reload, animating elements, or manipulating the DOM to add and remove content.
```javascript
// Validate an email field before the form is submitted
const form = document.querySelector("#signup");

form.addEventListener("submit", (event) => {
  const email = document.querySelector("#email").value;
  if (!email.includes("@")) {
    event.preventDefault(); // stop the form submitting
    alert("Please enter a valid email address.");
  }
});
```
The example above runs entirely on the user's machine: the moment they click submit, the check happens instantly with no trip to the server. That immediacy is the great strength of running code client-side — and it leads directly to the central distinction of this lesson.
It is worth seeing how HTML, CSS and JavaScript come together in the browser, because the order of events explains a great deal about how web pages behave. When the browser receives the HTML response, it parses the markup into a tree of objects called the Document Object Model (DOM): an in-memory representation of the page's structure that programs can read and change. As it parses, it requests the linked resources: the CSS file (so it knows how to paint each element) and any JavaScript files. The CSS is applied to the DOM to produce the rendered appearance, and the JavaScript is then free to manipulate the DOM (adding, removing or altering elements) and to respond to events such as clicks and key presses. This is why JavaScript can change a page after it has loaded without fetching a new one: it is editing the live DOM that the browser is already displaying.
The three technologies therefore operate on the same underlying document: HTML built it, CSS painted it, and JavaScript animates and updates it. That is precisely why keeping them as separate, well-defined layers (structure, presentation, behaviour) makes a site so much easier to reason about and maintain. A developer changing the look need touch only the CSS; one adding a feature need touch only the JavaScript; and the structural HTML can stay stable beneath both.
A short clarification on the World Wide Web versus the Internet, a distinction examiners like to test: the Internet is the global network of networks — the physical and logical infrastructure of cables, routers and the TCP/IP protocols studied in earlier lessons. The World Wide Web is just one service that runs on top of the Internet: the interlinked system of pages, identified by URLs (Uniform Resource Locators) and delivered by HTTP/HTTPS. Email, file transfer and video calls are other services on the same Internet; the Web is not the Internet itself but an application built on it. A URL such as https://www.example.com/products/123 bundles together the scheme (https, the protocol), the host (www.example.com, resolved to an IP address by DNS) and the path (/products/123, the specific resource on that server) — three pieces that, between them, name exactly one resource anywhere on the Web.
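The three parts of the example URL can be pulled apart with the `URL` class that is built into browsers and Node.js. This is a small sketch for illustration; the `parts` object and its property names are just labels chosen here to match the terms used above.

```javascript
// The built-in URL class splits a URL string into its components.
const url = new URL("https://www.example.com/products/123");

const parts = {
  scheme: url.protocol.replace(":", ""), // url.protocol includes the trailing colon
  host: url.hostname,                    // the name that DNS resolves to an IP address
  path: url.pathname,                    // the specific resource on that server
};

console.log(parts);
```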
A web application can run its processing in one of two places, and choosing correctly is a core skill. Client-side processing runs in the user's browser (JavaScript); server-side processing runs on the web server (in a language such as Python or PHP, or in JavaScript running on Node.js) before a response is sent back.
```mermaid
graph LR
    subgraph Client["Client (browser)"]
        JS["JavaScript:<br/>validation, animation,<br/>DOM updates, instant feedback"]
    end
    subgraph Server["Web server"]
        SS["Server-side script:<br/>database queries, authentication,<br/>payment processing, business logic"]
        DB[("Database")]
        SS --- DB
    end
    JS -->|HTTP request| SS
    SS -->|"HTTP response (HTML/JSON)"| JS
```
| Aspect | Client-side (JavaScript in browser) | Server-side (script on server) |
|---|---|---|
| Runs on | The user's device | The web server |
| Speed/latency | Instant — no network round-trip | Slower — needs a request and response |
| Server load | Offloads work to the client | Consumes server CPU/memory |
| Access to data | Cannot reach the server's database or files directly | Full access to databases, files, secrets |
| Security of code | Code is downloaded and visible to the user; can be tampered with | Code is hidden on the server; the user never sees it |
| Trust | Cannot be trusted — the user controls the environment | Trusted — runs in the organisation's control |
| Typical jobs | Form validation, animation, interactivity, immediate feedback | Authentication, database access, payments, anything sensitive |
The decision turns on what the work needs. Put it client-side when responsiveness matters and the task needs nothing secret: validating that a field is not empty, animating a menu, updating a counter. Put it server-side when the task needs trusted data or trusted logic: checking a password, reading or writing the database, taking a payment, or enforcing a business rule the user must not bypass.
The single most important exam point here is that client-side validation can never be trusted on its own. Because client-side JavaScript is downloaded to the user's machine, a malicious user can disable it, edit it, or simply send a crafted request straight to the server bypassing the page entirely. Client-side validation exists only to give honest users instant feedback and reduce needless server round-trips — it is a convenience, not a security control. Therefore any check that matters for security or data integrity must be repeated server-side, where the user cannot interfere. Failing to re-validate on the server is the classic vulnerability behind much of the SQL-injection material in the security lesson: trusting input merely because the page's JavaScript "checked" it.
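Repeating a check on the server might look something like the sketch below. It assumes a hypothetical `validateSignup` function that the server-side script calls with the parsed request body before touching the database; no particular web framework is implied, and the function and field names are illustrative.

```javascript
// Hypothetical server-side check. It runs on the server, so the user cannot
// edit it, disable it or skip it. `body` is whatever arrived in the request
// and is treated as untrusted.
function validateSignup(body) {
  const errors = [];

  // A crafted request may omit the field or send a non-string value,
  // so check the type before using it.
  const email = body && typeof body.email === "string" ? body.email.trim() : "";

  // Repeat the check the browser did: a request sent straight to the server
  // never ran the page's JavaScript.
  if (!email.includes("@")) {
    errors.push("Please enter a valid email address.");
  }

  return { ok: errors.length === 0, errors };
}

console.log(validateSignup({ email: "alice@example.com" })); // accepted
console.log(validateSignup({ email: 42 }));                  // rejected: not a string
```

The server applies the same rule as the client-side example, but here a failed check can stop the request from ever reaching the database.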
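When a request needs dynamic content, the server does work before replying. In a typical flow, the browser sends an HTTP request, the server-side script runs (querying the database if it needs to), builds the response from the results, and sends it back as HTML for the browser to render.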
The same machinery underlies web APIs, where instead of a whole HTML page the server returns structured data — usually JSON (JavaScript Object Notation), a lightweight text format of key–value pairs that JavaScript can parse directly:
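```javascript
// A client-side fetch of JSON from a server-side API endpoint.
// `await` needs an async function (or a module), so the calls are wrapped in one.
async function showUserName() {
  const response = await fetch("/api/users/123");
  if (!response.ok) throw new Error(`HTTP ${response.status}`);
  const user = await response.json();
  console.log(user.name); // e.g. "Alice"
}
showUserName().catch(console.error);
```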
Such APIs are commonly RESTful: each resource has a URL, and standard HTTP methods act on it (GET to read, POST to create, PUT/PATCH to update, DELETE to remove). Each request is also stateless: it carries everything the server needs, and the server keeps no session state between requests. The server signals the outcome with an HTTP status code, such as 200 OK, 201 Created, 301 Moved Permanently, 400 Bad Request, 401 Unauthorized, 403 Forbidden, 404 Not Found or 500 Internal Server Error. (HTTPS, the encrypted form of HTTP, was covered in the protocols lesson and is taken further in the network-security lesson.)
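The pairing of methods, URLs and status codes can be sketched as a single function. This is an illustration only: `handleRequest`, the `/api/users` paths and the in-memory `users` map are assumptions standing in for a real server and database. It also uses 405 Method Not Allowed, a standard status code not listed above.

```javascript
// In-memory stand-in for a database table (illustrative data only).
const users = new Map([["123", { name: "Alice" }]]);
let nextId = 124;

// Hypothetical handler: maps an HTTP method and path to a status code and a
// JSON-style body. Everything it needs arrives in its arguments, which
// mirrors the stateless nature of each request.
function handleRequest(method, path, body) {
  const hasName = Boolean(body && typeof body.name === "string");

  // The collection URL: POST creates a new resource.
  if (path === "/api/users") {
    if (method !== "POST") return { status: 405, body: { error: "method not allowed" } };
    if (!hasName) return { status: 400, body: { error: "name is required" } };
    const id = String(nextId++);
    users.set(id, { name: body.name });
    return { status: 201, body: { id, name: body.name } };
  }

  // A single-resource URL such as /api/users/123.
  const match = path.match(/^\/api\/users\/([^/]+)$/);
  if (!match || !users.has(match[1])) {
    return { status: 404, body: { error: "not found" } };
  }
  const id = match[1];

  if (method === "GET") return { status: 200, body: users.get(id) };
  if (method === "PUT") {
    if (!hasName) return { status: 400, body: { error: "name is required" } };
    users.set(id, { name: body.name });
    return { status: 200, body: users.get(id) };
  }
  if (method === "DELETE") {
    users.delete(id);
    return { status: 200, body: { deleted: id } };
  }
  return { status: 405, body: { error: "method not allowed" } };
}

console.log(handleRequest("GET", "/api/users/123")); // existing user: 200
console.log(handleRequest("GET", "/api/users/999")); // unknown user: 404
```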
The web has billions of pages; a search engine makes them findable. It does three jobs: crawling, indexing and ranking.
```mermaid
graph LR
    Crawl["1. CRAWLING<br/>bots ('spiders') follow links,<br/>fetching pages across the web"] --> Index["2. INDEXING<br/>build an inverted index:<br/>word → list of pages containing it"]
    Index --> Rank["3. RANKING<br/>order matching pages by<br/>relevance + importance (PageRank)"]
    Rank --> Results["Results page shown to user"]
```
A search engine runs automated programs called web crawlers or spiders. Starting from known pages, a crawler follows the hyperlinks it finds, fetching each new page and discovering yet more links, in what amounts to a breadth-first exploration of the web's link graph. Well-behaved crawlers respect a site's robots.txt file, which tells them which areas not to fetch; compliance is voluntary, so robots.txt is a request rather than an access control. Because the web changes constantly, crawling is continuous: pages are re-visited so the index stays reasonably fresh.
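The crawling loop can be sketched over a tiny made-up link graph. Nothing here touches the network: the `links` object stands in for fetching a page and extracting its hyperlinks, the `disallowed` list stands in for robots.txt rules, and all the names and addresses are illustrative.

```javascript
// A tiny in-memory "web": each page maps to the links found on it (made-up data).
const links = {
  "a.example/": ["a.example/about", "b.example/"],
  "a.example/about": ["a.example/"],
  "b.example/": ["b.example/private/admin", "c.example/"],
  "b.example/private/admin": [],
  "c.example/": ["a.example/"],
};

// Stand-in for robots.txt rules: URL prefixes the crawler should not fetch.
const disallowed = ["b.example/private/"];

function crawl(seed) {
  const visited = new Set([seed]); // pages already queued, so none is fetched twice
  const queue = [seed];            // first in, first out gives breadth-first order
  const fetched = [];

  while (queue.length > 0) {
    const page = queue.shift();
    fetched.push(page); // "fetch" the page
    for (const link of links[page] || []) {
      const blocked = disallowed.some((prefix) => link.startsWith(prefix));
      if (!blocked && !visited.has(link)) {
        visited.add(link);
        queue.push(link);
      }
    }
  }
  return fetched;
}

console.log(crawl("a.example/"));
```

The `visited` set matters as much as the queue: the web's link graph is full of cycles, and without it the crawler would loop between pages that link to each other.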
Storing whole pages and scanning them for every query would be hopelessly slow, so the engine builds an inverted index — a structure that maps each word to the list of pages containing it (and often where in the page the word appears). This is the same idea as a book's index: rather than reading the whole book to find a term, you look the term up and it points you straight to the relevant pages. When you search for a word, the engine consults the inverted index and instantly retrieves every page containing it, rather than searching the web live. Semantic HTML, page titles and headings help the indexer judge what a page is about.
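A minimal inverted index can be built in a few lines. This sketch uses three made-up pages and assumes the simplest possible rules: words are lowercased, punctuation is ignored, and word positions are not recorded. The names `buildInvertedIndex` and `search` are hypothetical.

```javascript
// Made-up pages standing in for crawled documents.
const pages = {
  page1: "HTML gives a page its structure",
  page2: "CSS controls how a page looks",
  page3: "JavaScript changes the page structure after loading",
};

// Build the index once: word -> set of page ids containing that word.
function buildInvertedIndex(docs) {
  const index = new Map();
  for (const [id, text] of Object.entries(docs)) {
    for (const word of text.toLowerCase().match(/[a-z0-9]+/g) || []) {
      if (!index.has(word)) index.set(word, new Set());
      index.get(word).add(id);
    }
  }
  return index;
}

// Answer a query with a single lookup instead of scanning every page.
function search(index, word) {
  return [...(index.get(word.toLowerCase()) || [])].sort();
}

const index = buildInvertedIndex(pages);
console.log(search(index, "structure")); // pages 1 and 3 contain the word
console.log(search(index, "python"));    // no page contains it
```

The expensive work of reading every page happens once, when the index is built; each query afterwards is a lookup in the map.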