The Genetic Code — Triplet, Degenerate, Non-Overlapping, Universal

The genetic code is the set of rules by which information encoded in the sequence of DNA bases is translated into sequences of amino acids in proteins. It is one of the most elegant and universal features of biology. This lesson covers the OCR A-Level Biology A specification point 2.1.3 (f) — the nature of the genetic code, including the key descriptors: triplet, degenerate, non-overlapping and (nearly) universal.

1. Why a Triplet Code?

There are four bases in DNA (A, T, C, G) and 20 amino acids commonly found in proteins. The minimum number of bases required to specify one amino acid — assuming a fixed number per amino acid — can be worked out using powers of 4:

Bases per codon	Possible combinations	Enough for 20 amino acids?
1	4¹ = 4	No — only 4 amino acids
2	4² = 16	No — only 16 amino acids
3	4³ = 64	Yes — with room to spare
4	4⁴ = 256	Yes, but wastefully so

A triplet code (three bases per amino acid) is therefore the shortest fixed-length code that can specify all 20 amino acids. All known organisms use three-base codons.
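The powers-of-4 argument can be checked with a few lines of Python. This is only an illustrative sketch; the function name min_codon_length is invented for this example.

```python
# Smallest fixed codon length that gives every amino acid its own codon.
N_BASES = 4          # A, U (or T), C, G
N_AMINO_ACIDS = 20   # amino acids commonly found in proteins


def min_codon_length(n_bases: int, n_targets: int) -> int:
    """Return the smallest k for which n_bases ** k >= n_targets."""
    k = 1
    while n_bases ** k < n_targets:
        k += 1
    return k


# Reproduce the rows of the table above.
for k in range(1, 5):
    combos = N_BASES ** k
    verdict = "enough" if combos >= N_AMINO_ACIDS else "too few"
    print(f"{k} base(s) per codon: {combos} combinations ({verdict})")

print("Minimum codon length:", min_codon_length(N_BASES, N_AMINO_ACIDS))
```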

Key Definition — Codon: A triplet of bases in mRNA (or a gene) that codes for a single amino acid (or for starting/stopping translation).

2. The Four Properties of the Genetic Code

The OCR specification requires you to understand four specific descriptors of the genetic code:

2.1 Triplet

Each amino acid is specified by a sequence of three consecutive bases — a codon on mRNA (or a triplet on the coding strand of DNA). There are 64 possible codons (4 × 4 × 4).

5' → 3' mRNA codon	AUG	GCA	UUU	UGG	UAA
Amino acid	Met	Ala	Phe	Trp	STOP

2.2 Degenerate

With 64 codons but only 20 amino acids, there are more codons than amino acids. This means:

Most amino acids are coded by more than one codon.
For example, leucine is coded by six different codons (CUU, CUC, CUA, CUG, UUA, UUG).
Only two amino acids (methionine and tryptophan) are coded by just one codon each.
Three codons (UAA, UAG, UGA) do not code for an amino acid but act as stop codons, signalling the end of translation.
AUG acts as the start codon and also codes for methionine.

Key Definition — Degenerate code: A code in which more than one codon can specify the same amino acid.

Biological importance of degeneracy: Because several codons code for the same amino acid, some point mutations (changes in a single base) are "silent" — they produce a different codon that still codes for the same amino acid. The degenerate code therefore provides some resistance to the harmful effects of mutations.
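The figures quoted above (six codons for leucine, one each for methionine and tryptophan, three stop codons) can be recovered by counting entries in the standard codon table. The sketch below builds the table from a compact 64-letter string of one-letter amino acid codes, with * marking a stop; the variable names and the helper is_silent are illustrative choices, not part of the specification.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ...
# One-letter amino acid codes; "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# How many codons specify each amino acid (or stop)?
codons_per_aa = Counter(CODON_TABLE.values())
single_codon = sorted(aa for aa, n in codons_per_aa.items() if n == 1)

print("Total codons:", len(CODON_TABLE))
print("Codons for leucine (L):", codons_per_aa["L"])
print("Stop codons:", codons_per_aa["*"])
print("Amino acids with only one codon:", single_codon)


def is_silent(codon: str, mutant: str) -> bool:
    """True if two different codons specify the same amino acid."""
    return codon != mutant and CODON_TABLE[codon] == CODON_TABLE[mutant]


print("CUU -> CUC silent?", is_silent("CUU", "CUC"))
print("AUG -> AUA silent?", is_silent("AUG", "AUA"))
```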

2.3 Non-Overlapping

The bases of the genetic code are read in sequence, one codon after another, without any overlap between codons. Each base belongs to only one codon; each codon is read separately.

Correct reading (non-overlapping, as used in life): the mRNA is read as a sequence of adjacent triplets.

Codon 1	Codon 2	Codon 3	Codon 4
AUG	UUC	GCA	UGA
Met	Phe	Ala	STOP

Hypothetical overlapping code (NOT used): if codons could share bases, the same nucleotide sequence would generate a different (interleaved) set of codons. Biology does not use this scheme. In a fully overlapping code, a single base substitution would alter up to three adjacent amino acids, and each codon would restrict which amino acids could follow it; neither effect is observed in real proteins.

Codon 1	Codon 2	Codon 3	Codon 4
AUG	UGU	GUU	UUC
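The two reading schemes in the tables above differ only in how far the reading window moves each time: three bases for the real code, one base for the hypothetical overlapping code. A minimal sketch, with function names chosen purely for illustration:

```python
def non_overlapping_codons(mrna: str) -> list[str]:
    """Read adjacent triplets: the window moves on 3 bases each time."""
    return [mrna[i:i + 3] for i in range(0, len(mrna) - 2, 3)]


def overlapping_codons(mrna: str) -> list[str]:
    """Hypothetical scheme (not used in life): the window moves on 1 base."""
    return [mrna[i:i + 3] for i in range(len(mrna) - 2)]


mrna = "AUGUUCGCAUGA"  # the sequence used in the tables above

print(non_overlapping_codons(mrna))
print(overlapping_codons(mrna)[:4])  # first four overlapping "codons"
```

In the non-overlapping scheme each base appears in exactly one codon; in the overlapping scheme most bases appear in three.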

Consequence: the reading frame matters enormously. If a base is inserted or deleted (a frameshift mutation), every codon downstream of the change is read in a different frame, so the amino acid sequence from that point onwards is altered and a premature stop codon often appears. This is usually much more damaging than a single-base substitution. An insertion or deletion of three bases (or a multiple of three) adds or removes whole codons and leaves the downstream reading frame intact.

Exam Tip: Questions often ask about the effects of different types of mutation. A substitution changes only one codon (and may be silent because the code is degenerate). An insertion or deletion that is not a multiple of three bases causes a frameshift, changing every subsequent codon, which is nearly always catastrophic for the protein.
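The contrast between substitution and frameshift can be made concrete by translating mutated versions of the short mRNA from section 2.3. This is a simplified sketch: translate reads from the first base, stops at the first stop codon, and ignores any incomplete final codon. The particular mutations are invented for illustration.

```python
from itertools import product

# Standard genetic code; one-letter amino acid codes, "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate from the first base until a stop codon or the end."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


original = "AUGUUCGCAUGA"                        # AUG UUC GCA UGA
silent = original[:5] + "U" + original[6:]       # UUC -> UUU (still Phe)
missense = original[:7] + "A" + original[8:]     # GCA -> GAA (Ala -> Glu)
insertion = original[:3] + "C" + original[3:]    # one extra base after AUG

for label, seq in [("original", original), ("silent", silent),
                   ("missense", missense), ("insertion", insertion)]:
    print(f"{label:10s} {seq:14s} {translate(seq)}")
```

The substitutions change at most one amino acid, whereas the single-base insertion changes every codon after the start and also removes the original stop codon from the reading frame.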

2.4 Universal

The same codons specify the same amino acids in (almost) all organisms — bacteria, archaea, plants, fungi and animals.

AUG codes for methionine in humans, in E. coli, in Arabidopsis thaliana and in yeast.
UAA is a stop codon in all of these.

Biological significance:

Evidence for evolution — all life shares a common ancestor that used this code.
Genetic engineering — a human gene (e.g. for insulin) can be inserted into bacteria and transcribed and translated correctly, because the bacteria interpret the codons in exactly the same way. This underpins the entire biotechnology industry.

Exam Tip: Be careful with "universal". There are a few very minor exceptions — for example, mitochondria and some unicellular organisms have small variations. At A-Level you can describe the code as "nearly universal" or "universal with a few rare exceptions" for an extra mark of precision.

3. Summary Table of the Four Properties

Property	Meaning	Consequence
Triplet	Three bases per amino acid	64 codons, enough to code for all 20 amino acids
Degenerate	Most amino acids coded by more than one codon	Some point mutations are silent
Non-overlapping	Each base belongs to only one codon	Insertions/deletions cause damaging frameshifts
Universal	The same codons mean the same amino acids in nearly all organisms	Evidence for evolution; enables genetic engineering

graph LR
  A[Genetic code] --> B[Triplet]
  A --> C[Degenerate]
  A --> D[Non-overlapping]
  A --> E[Universal]
  B --> F["64 codons<br/>code for 20 amino acids"]
  C --> G[Most AAs have multiple codons]
  D --> H[Reading frame critical]
  E --> I[Enables genetic engineering]

4. Start and Stop Codons

Start codon: AUG, which also codes for the amino acid methionine (Met). Translation almost always begins at an AUG. Not every AUG is a start codon, however; an AUG inside the reading frame simply codes for an internal methionine.
Stop codons: UAA, UAG, UGA — these do not code for amino acids and instead signal the ribosome to release the completed polypeptide.
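As a rough illustration of how start and stop codons bracket the coding region, the sketch below scans an mRNA for the first AUG and then reads triplets until it meets a stop codon. Treating the first AUG as the start is a deliberate simplification (real start-site selection depends on sequence context), and the example sequence, with a few extra bases either side of the coding region, is hypothetical.

```python
START_CODON = "AUG"
STOP_CODONS = {"UAA", "UAG", "UGA"}


def find_open_reading_frame(mrna: str) -> list[str]:
    """Return codons from the first AUG up to and including the first
    in-frame stop codon (simplified: the first AUG is assumed to be the start).
    """
    start = mrna.find(START_CODON)
    if start == -1:
        return []  # no start codon present
    codons = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        codons.append(codon)
        if codon in STOP_CODONS:
            break
    return codons


# Hypothetical mRNA: 3 untranslated bases, a coding region, 2 trailing bases.
example = "GGCAUGGCAUUUUGGUAACC"
print(find_open_reading_frame(example))
```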

5. Reading the Genetic Code

Given a short mRNA sequence, you should be able to identify the amino acids it codes for using a codon table.

Example: 5' — AUGGCAUUUUGGUAA — 3'

Split into codons starting at the AUG:

AUG → Met (start)
GCA → Ala
UUU → Phe
UGG → Trp
UAA → STOP

Polypeptide: Met–Ala–Phe–Trp.
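The worked example above follows a fixed procedure: split into triplets, look each one up, and stop at a stop codon. A minimal sketch of that procedure is given below; the table is the standard genetic code, while the function name and the hyphen-separated output format are choices made for this example.

```python
from itertools import product

# Standard genetic code; one-letter amino acid codes, "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# One-letter to three-letter amino acid abbreviations.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def translate(mrna: str) -> str:
    """Translate an mRNA that starts at its AUG; stop at the first stop codon."""
    residues = []
    for i in range(0, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":
            break
        residues.append(THREE_LETTER[aa])
    return "-".join(residues)


print(translate("AUGGCAUUUUGGUAA"))
```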

6. Common Exam Mistakes

Forgetting that the code is "triplet" — thinking each base codes for an amino acid.
Confusing "degenerate" with "ambiguous" or "random". The code is not random, and it is not ambiguous: each codon specifies only one amino acid. It is degenerate in the sense that more than one codon can code for the same amino acid.
Saying that "every organism has the same DNA". It is the code (which codon means which amino acid) that is universal; the base sequences of genes differ between organisms.
Writing that the genetic code is on DNA only. The code is expressed in both DNA (template) and mRNA (codons); the codons used in the table are usually written in mRNA form (with U, not T).
Mixing up the number of codons (64) and amino acids (20).
Writing "degenerate" as a criticism. It is a strength, providing tolerance to mutation.

7. Exam-Style Questions

State what is meant by the terms "triplet", "degenerate", "non-overlapping" and "universal" when applied to the genetic code. (4)
Explain why the genetic code must be at least triplet. (2)
A point mutation in a gene changes a single base but does not change the amino acid sequence of the protein. Explain how this is possible. (2)
Explain why a frameshift mutation is usually more harmful than a substitution mutation. (3)

Model answer for (3): "The genetic code is degenerate, meaning that most amino acids are coded for by more than one codon. A point mutation may change one codon to another codon that codes for the same amino acid, so the amino acid sequence of the protein is unchanged. This is called a silent mutation."

Summary

The genetic code specifies how the sequence of DNA bases is translated into the sequence of amino acids in a protein.
It is triplet: three bases = one codon = one amino acid. There are 64 codons in total.
It is degenerate: most amino acids are specified by more than one codon, so some point mutations are silent and the code has some resistance to the harmful effects of mutation.
It is non-overlapping: each base belongs to only one codon, so the reading frame is critical.
It is (nearly) universal: the same codons specify the same amino acids in almost all organisms — evidence of common ancestry and the basis of genetic engineering.
Special codons: AUG (start/Met) and UAA, UAG, UGA (stop).

A-Level Deep Dive

Spec mapping

Spec Mapping: This lesson is mapped to OCR H420 Module 2.1.3 — Nucleotides and nucleic acids, covering the nature of the genetic code (triplet, degenerate, non-overlapping, universal) and the relationship between gene base sequence and polypeptide amino-acid sequence (refer to the official OCR H420 specification document for exact wording).

The genetic code links DNA structure (Lesson 2) to protein synthesis (Lessons 6–7) and is the conceptual bridge between molecular biology and the genetics topics of Module 6.1. The four code properties — triplet, degenerate, non-overlapping, (nearly) universal — are essential AO1 mark points. Questions about mutations (Module 6.1) hinge on understanding the consequences of code properties: degeneracy allows silent mutations; non-overlap means insertions and deletions cause frameshifts; universality is why genetic engineering between species is possible.

Scientists and paradigms

The genetic code was deciphered through a coordinated experimental programme between 1961 and 1966.

Marshall Nirenberg and Heinrich Matthaei (1961, NIH) created a cell-free protein-synthesis system and added synthetic poly-U RNA, obtaining a polypeptide composed exclusively of phenylalanine. This established that the codon UUU codes for Phe — the first codon assignment. The school of thought to take into the exam: "synthetic templates of known sequence let you read out the code one codon at a time".

Har Gobind Khorana (1960s) used chemically synthesised RNAs of defined repeating sequence (e.g. poly-UC, alternating U and C) to assign codons whose sequence Nirenberg's random copolymers could not isolate. By 1966 the full codon table was complete: 61 sense codons (including AUG, which also serves as the start codon) and 3 stop codons, making 64 in total.

Francis Crick, Sydney Brenner, Leslie Barnett and Richard Watts-Tobin (1961) used acridine-induced frameshift mutations in T4 bacteriophage to demonstrate that the code is read in non-overlapping triplets. Their experimental logic — that three nearby insertions (or three nearby deletions) restored function while one or two did not — is one of the canonical demonstrations of triplet code.

Crick's "wobble hypothesis" (1966): proposed that the third base of a codon pairs less strictly with its tRNA anticodon, explaining the structural basis of degeneracy. Most synonymous codons differ only in the third position.
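The claim that synonymous codons mostly differ in the third position can be checked by trying every single-base substitution in every sense codon and counting, for each codon position, how many leave the amino acid unchanged. The sketch below does this on the standard codon table; stop codons are left out of the count, which is a choice made for this example.

```python
from itertools import product

# Standard genetic code; one-letter amino acid codes, "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# silent_by_position[p] counts substitutions at codon position p
# (0 = first base, 2 = third base) that keep the same amino acid.
silent_by_position = [0, 0, 0]
for codon, aa in CODON_TABLE.items():
    if aa == "*":
        continue  # only sense codons are counted here
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            if CODON_TABLE[mutant] == aa:
                silent_by_position[pos] += 1

for pos, count in enumerate(silent_by_position, start=1):
    print(f"Silent substitutions at codon position {pos}: {count}")
```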

Nirenberg, Khorana and Robert Holley shared the 1968 Nobel Prize in Physiology or Medicine for cracking the genetic code.

Synoptic links

This lesson connects forward to:

ocr-alevel-biology-nucleic-acids-enzymes — Transcription (Lesson 6): RNA polymerase reads the template strand of DNA to build the mRNA, and the codons of that mRNA are what you interpret with the codon table learned here.
ocr-alevel-biology-nucleic-acids-enzymes — Translation (Lesson 7): tRNA anticodons read mRNA codons; the genetic code is the rule-book that links the two.
ocr-alevel-biology-genetics-inheritance — Mutations: silent, missense, nonsense and frameshift mutations are defined relative to the code properties. Sickle-cell anaemia (E6V missense in β-globin) and cystic fibrosis (ΔF508, a three-base in-frame deletion in CFTR that removes one phenylalanine) are canonical examples.
ocr-alevel-biology-genetics-inheritance — Genetic engineering: universality is the load-bearing reason why human insulin can be produced in E. coli — the bacterium reads human DNA codons identically.
ocr-alevel-biology-biological-molecules — Protein structure: primary structure (the amino-acid sequence specified by the codons) determines tertiary structure (Anfinsen's principle), which determines function.

Specimen question modelled on the OCR H420 paper format

Question (6 marks): Describe what is meant by the genetic code being triplet, degenerate, non-overlapping and (nearly) universal, and explain why each property is biologically important.

Mark scheme decomposition (AO breakdown):
