SQL Joins

Normalisation deliberately split your data across many tables so each fact lives once; joins are how you put it back together to answer a real question. A join recombines rows from two or more tables by matching a foreign key in one against the primary key in another. This lesson works through the join types OCR examines — INNER, LEFT and RIGHT OUTER — against small sample tables, showing not just each query in a language-tagged sql code fence but the exact result set it produces and why each row is present or absent. Joins are where normalisation and SQL meet, and writing a correct multi-table query is one of the highest-value skills in the whole 1.3.2 topic.

Spec Mapping

This lesson addresses the H446 1.3.2 Databases content on combining tables:

Use an INNER JOIN to return only rows that have a matching key in both tables.
Use LEFT and RIGHT OUTER JOIN to keep all rows from one side, filling unmatched columns with NULL.
Write the join condition with ON, matching a foreign key to the primary key it references, and join three or more tables in one query.
Combine joins with WHERE, ORDER BY and aggregation, and recognise the Cartesian product that results from omitting the join condition.

(Phrasing here paraphrases the specification content; it is not a verbatim quote.)

Why Joins Exist

A 3NF design stores students in one table and the subjects they take in another, linked by key. So the question "which subjects does Alice take?" cannot be answered from either table alone — the student's name is in Students and the subject taken is in Enrolments. A join recombines them on the shared StudentID. Every join therefore has two ingredients: the tables to combine (in FROM and JOIN) and the condition that says which rows correspond (in ON), almost always foreign key = primary key.

This is the direct pay-off of normalisation, and it is worth seeing the connection plainly. Normalisation deliberately removed redundancy by separating entities into different tables — but in doing so it scattered the pieces of any real-world answer across several tables. Joins are the operation that gathers those pieces back together at query time, without ever re-introducing the stored redundancy that normalisation eliminated. In other words, a relational database lets you have it both ways: store each fact once (no anomalies), yet still produce any combined view you need on demand. Without joins, a normalised database would be tidy but almost unusable; with them, it is both tidy and fully queryable. That is why joins are not an advanced extra but the everyday tool for reading a normalised database.

The sample tables for this lesson

Students (StudentID is the primary key):

StudentID	Name	Age
1	Alice	17
2	Bob	18
3	Charlie	17

Enrolments (a junction table; StudentID and SubjectID are foreign keys):

StudentID	SubjectID
1	CS
1	Maths
2	CS

Subjects (SubjectID is the primary key):

SubjectID	SubjectName	TeacherName
CS	Computer Science	Dr Smith
Maths	Mathematics	Ms Patel
English	English Literature	Mr Brown

Three facts about this data drive every result below: Charlie (3) has no enrolments, English has no enrolments, and Alice has two. Watch how each join type treats these unmatched rows.

The Anatomy of a Join

Before the specific join types, it helps to picture mechanically what a join does. Conceptually, to join Students with Enrolments on StudentID, the database considers each row of the left table against each row of the right table and keeps the pairs where the ON condition holds. For the two tables above, the matching pairs are:

Students row	Enrolments row	Match on StudentID?
Alice (1)	(1, CS)	yes → kept
Alice (1)	(1, Maths)	yes → kept
Bob (2)	(2, CS)	yes → kept
Charlie (3)	(no row has StudentID 3)	no match

So the combined intermediate result of "Students joined to Enrolments" looks like this — one row per matching pair, carrying columns from both tables:

StudentID	Name	Age	SubjectID
1	Alice	17	CS
1	Alice	17	Maths
2	Bob	18	CS

This intermediate table is what the rest of the query (SELECT, WHERE, ORDER BY) then operates on. The only difference between the join types is what happens to a row that finds no partner — like Charlie above. An inner join discards it; a left/right outer join keeps it and pads the missing side with NULL. Hold that single distinction in mind and every join result below becomes predictable.

INNER JOIN

An INNER JOIN returns only rows that have a match on the join condition in both tables. Any row on either side with no partner is dropped. Joining Students to Enrolments to Subjects to list "who takes what":

SELECT   s.Name, sub.SubjectName
FROM     Students s
INNER JOIN Enrolments e   ON s.StudentID  = e.StudentID
INNER JOIN Subjects   sub ON e.SubjectID  = sub.SubjectID;

Name	SubjectName
Alice	Computer Science
Alice	Mathematics
Bob	Computer Science

Reading the matching row by row: Alice (1) has enrolments in CS and Maths, both of which exist in Subjects, so she produces two output rows; Bob (2) has one enrolment (CS), giving one row. Charlie is absent because he has no row in Enrolments to match, and English is absent because no enrolment points at it. An inner join keeps only the data that lines up on both sides — the intersection.

Two details are worth internalising. First, an inner join can return more rows than the left table has, fewer, or the same, depending on the matching: Alice (one student) produced two rows because she matched two enrolments, so the output has three rows from two students. The row count of a join is governed by the matches, not by either table's size alone. Second, the order in which you write INNER JOIN clauses does not change the result of an inner join (it is associative and commutative in effect), though a good optimiser may still choose to evaluate them in whichever order is fastest — another instance of the declarative what versus the optimiser's how.

Exam Tip: INNER JOIN is the default expectation in most join questions. Picture a Venn diagram: an inner join returns only the intersection — rows present in both tables. If a row has no matching key on the other side, it does not appear.

LEFT OUTER JOIN

A LEFT JOIN (short for LEFT OUTER JOIN) returns every row from the left table, plus matching data from the right; where there is no match, the right table's columns come back as NULL. This is how you find "who takes nothing":

SELECT   s.Name, e.SubjectID
FROM     Students s
LEFT JOIN Enrolments e ON s.StudentID = e.StudentID;

Name	SubjectID
Alice	CS
Alice	Maths
Bob	CS
Charlie	NULL

Now Charlie appears even though he has no enrolment — the left table (Students) is preserved in full, and his missing right-hand columns are filled with NULL. This is precisely the query you would extend with WHERE e.SubjectID IS NULL to list the students with no enrolments at all — a classic use of an outer join that an inner join simply cannot express, because the inner join would have dropped Charlie.

Worked example: finding the unmatched rows

The "students who take nothing" question is so common it is worth seeing in full. Take the left join above and filter for the padded NULL:

SELECT   s.Name
FROM     Students s
LEFT JOIN Enrolments e ON s.StudentID = e.StudentID
WHERE    e.StudentID IS NULL;

Name
Charlie

The logic is subtle and worth spelling out: the LEFT JOIN keeps every student, putting NULL in the Enrolments columns for anyone with no match; the WHERE e.StudentID IS NULL then keeps only those padded rows — i.e. exactly the students who failed to match, which is the students with no enrolment. You could not write this with an INNER JOIN, because the inner join would already have removed Charlie before the WHERE ever ran. This "outer join, then test the key IS NULL" pattern is the standard way to answer any "find the records in A with no corresponding record in B" question — orphaned customers with no orders, products never sold, and so on.

RIGHT OUTER JOIN

A RIGHT JOIN (RIGHT OUTER JOIN) is the mirror image: it keeps every row from the right table, filling unmatched left-hand columns with NULL. To list every subject and any enrolled student:

SELECT   e.StudentID, sub.SubjectName
FROM     Enrolments e
RIGHT JOIN Subjects sub ON e.SubjectID = sub.SubjectID;

StudentID	SubjectName
1	Computer Science
2	Computer Science
1	Mathematics
NULL	English Literature

English appears with a NULL student, because the right table (Subjects) is preserved in full and English has no enrolment. Notice that CS produces two rows (students 1 and 2 are both enrolled) while Maths produces one and English produces a single padded row — the right table's rows are all present, repeated once per match and padded with NULL where there is none. This is exactly the query a curriculum manager would run to spot subjects with no take-up: extend it with WHERE e.StudentID IS NULL and only English remains.

A right join is exactly a left join with the tables written the other way round; A RIGHT JOIN B returns the same rows as B LEFT JOIN A. Because of this equivalence, any RIGHT JOIN can be rewritten as a LEFT JOIN simply by swapping the order of the two tables, which is why many practitioners — and many style guides — use LEFT JOIN throughout for consistency and read every outer join "the same way round". The exam expects you to recognise and be able to use a right join, but in your own queries you are free to express the same logic as a left join if you find it clearer.

Comparing the Join Types

Join	Keeps	Unmatched rows become
`INNER JOIN`	Only rows matching in both tables	(dropped)
`LEFT JOIN`	All left-table rows + matches	`NULL` on the right
`RIGHT JOIN`	All right-table rows + matches	`NULL` on the left
`FULL OUTER JOIN`	All rows from both tables	`NULL` on whichever side is missing

Thinking of two overlapping circles A (left) and B (right): an inner join returns the overlap only; a left join returns all of A plus the overlap; a right join returns all of B plus the overlap; and a full outer join returns everything in either circle. (FULL OUTER JOIN is beyond the core of most join questions but completes the family, and is useful for "show me every student and every subject, matched where possible".) The Venn-diagram picture is the single most useful mental model for this topic: when you read a question, decide which region of the two circles it is asking for, and the correct join follows immediately — overlap means inner, "all of one side" means a left or right outer, and "everything from both" means full outer.

Choosing the right join from the question

In the exam the wording of the question tells you which join to reach for. The deciding question is always: must unmatched rows be kept, and if so from which side?

The question asks for…	Use	Why
"students and the subjects they take" (only those with subjects)	`INNER JOIN`	Only matched rows are wanted; unmatched students/subjects are irrelevant.
"every student, with their subjects if any"	`LEFT JOIN` (Students on the left)	All students must appear even with no enrolment.
"every subject, with enrolled students if any"	`RIGHT JOIN` (or `LEFT` with Subjects first)	All subjects must appear even with no student.
"students who take no subject"	`LEFT JOIN` + `WHERE … IS NULL`	Keep all students, then pick the unmatched ones.
"everyone and everything, matched where possible"	`FULL OUTER JOIN`	Keep unmatched rows from both sides.

The give-away words are "every", "all", "including those with none", and "with no…". Their presence means an outer join; their absence usually means an ordinary inner join. Reading for these phrases first, before writing any SQL, prevents the most common conceptual mistake — using an inner join when the question clearly wants the unmatched rows kept.

The Cartesian Product — What Happens Without ON

The join condition is not optional decoration. If you join two tables with no ON clause (or a condition that is always true), the database produces a Cartesian product (also called a cross join): every row of the first table paired with every row of the second. Joining 3 students with 3 subjects this way yields $3 \times 3 = 9$ meaningless rows — Alice paired with English, Bob paired with Maths, and so on, regardless of who actually takes what.

-- DANGER: no ON condition -> Cartesian product (9 rows of nonsense)
SELECT s.Name, sub.SubjectName
FROM   Students s, Subjects sub;

SQL Joins

SQL Joins

Spec Mapping

Why Joins Exist

The sample tables for this lesson

The Anatomy of a Join

INNER JOIN

LEFT OUTER JOIN

Worked example: finding the unmatched rows

RIGHT OUTER JOIN

Comparing the Join Types

Choosing the right join from the question

The Cartesian Product — What Happens Without ON

More in Computer Science