A gene is, in its everyday sense, a unit of inheritance. At the molecular level the definition is more demanding: a gene is a length of DNA that specifies a functional product — typically a polypeptide, occasionally a functional non-coding RNA. The translation of that nucleotide sequence into amino-acid sequence relies on the genetic code: a near-universal cipher whose properties (triplet, degenerate, non-overlapping) shape both how protein synthesis works and how mutations propagate. This lesson dissects the molecular definition of a gene, the architecture of the genetic code, the structure of eukaryotic genes (introns and exons), and the basic logic of gene expression and regulation.
Spec mapping: This lesson spans AQA 7402 Sections 3.4.1 and 3.4.2 — the structure of DNA / genes, and the relationship between genes and protein synthesis. It anchors the conceptual framework that the transcription and translation lessons (3.4.2) develop mechanistically. (Refer to the official AQA specification document for exact wording.)
Key Definition: A gene is a sequence of DNA nucleotides that codes for a functional polypeptide or a functional RNA molecule (such as transfer RNA, ribosomal RNA, or a regulatory non-coding RNA).
Three clarifications are essential at A-Level:
The definition of "gene" has changed substantially over 150 years of biological research. Each refinement reflected new experimental capabilities and a deepening of the molecular framework:
The historical claims can be paraphrased rather than quoted verbatim, but the lineage matters: exam questions on "evidence that DNA is the genetic material" expect Avery–MacLeod–McCarty and Hershey–Chase by name.
The sequence of bases in DNA is read in groups of three called triplets (when referring to the DNA sequence) or codons (when referring to the mRNA sequence). Each codon specifies one amino acid, or one of three stop signals.
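There are only four bases in DNA, but the genetic code must specify 20 standard amino acids and a stop signal. One base per codon would give only 4 possible codons and two bases only 4² = 16, both too few; three bases give 4³ = 64, which is more than enough.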
This argument was made by George Gamow in 1954, before the code was experimentally cracked. The actual decoding was completed in the 1960s, chiefly by Marshall Nirenberg and Har Gobind Khorana using synthetic mRNAs of known sequence; they shared the 1968 Nobel Prize with Robert Holley, who determined the structure of a tRNA.
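The counting argument can be checked in a few lines of Python. This is a minimal sketch; the names `codons_available` and `MEANINGS_NEEDED` are illustrative and do not come from the lesson.

```python
# How many distinct codons can a four-letter alphabet spell at each codon length?
BASES = "ACGU"
MEANINGS_NEEDED = 20 + 1  # 20 standard amino acids plus at least one stop signal


def codons_available(codon_length: int) -> int:
    """Return the number of different codons of the given length."""
    return len(BASES) ** codon_length


for n in (1, 2, 3):
    total = codons_available(n)
    verdict = "enough" if total >= MEANINGS_NEEDED else "too few"
    print(f"{n} base(s) per codon: {total} codons ({verdict})")
```

A triplet is the shortest codon length that clears the threshold, and the surplus of 64 over 21 is what makes the code degenerate.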
| Property | Meaning | Significance |
|---|---|---|
| Triplet | Three bases code for one amino acid | 4³ = 64 codons, sufficient for 20 amino acids + stop signals |
| Degenerate (redundant) | Most amino acids are coded for by more than one codon | A single-base change at the wobble position often leaves the encoded amino acid unchanged — silent mutations |
| Non-overlapping | Each base is part of only one codon | A change in one base affects only one codon, not several adjacent codons |
| Has a defined reading frame | Reading must begin at a specific start codon | The codon boundaries are fixed by the start codon; insertions or deletions shift the frame |
| Universal (with minor exceptions) | The same codons specify the same amino acids in nearly all organisms | Evidence for common ancestry; underpins genetic engineering (human genes can be expressed in E. coli) |
| Has start and stop signals | AUG starts translation; UAA, UAG, UGA terminate it | Defines the limits of the translated region |
Exam Tip: The universality of the genetic code is a key piece of evidence for evolution from a common ancestor — every organism on Earth, from archaea to oak trees to humans, decodes the same mRNA codons into the same amino acids (with rare exceptions in mitochondrial genomes and ciliates). This universality is also what allows a human gene placed into E. coli to produce human insulin.
| Codon | Amino acid | Notes |
|---|---|---|
| AUG | Methionine | Start codon — establishes reading frame |
| UAA, UAG, UGA | (none) | Stop codons — release factor binds the A site |
| UUU, UUC | Phenylalanine | UUU was the first codon ever decoded (Nirenberg, 1961: poly-U mRNA gave poly-Phe) |
| GCU, GCC, GCA, GCG | Alanine | Fourfold-degenerate wobble — third base can be anything |
| UGG | Tryptophan | One of only two single-codon amino acids (the other is methionine, AUG) |
| AUG | Methionine | Also internal Met — the codon does double duty |
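
The degeneracy shown in the table can be explored by grouping all 64 codons by the amino acid they specify. The sketch below assumes the standard genetic code, written as a one-letter-per-codon string in U, C, A, G order; the helper name `synonyms_by_amino_acid` is hypothetical.

```python
from collections import defaultdict
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonyms_by_amino_acid(table):
    """Group codons by the one-letter amino acid (or '*' for stop) they encode."""
    groups = defaultdict(list)
    for codon, aa in table.items():
        groups[aa].append(codon)
    return dict(groups)


SYNONYMS = synonyms_by_amino_acid(CODON_TABLE)

for aa in ("A", "F", "M", "W", "*"):
    print(aa, sorted(SYNONYMS[aa]))

# Amino acids with no synonymous codons at all
single_codon = sorted(aa for aa, codons in SYNONYMS.items() if len(codons) == 1)
print("Single-codon amino acids:", single_codon)
```

Alanine (A) comes back with four codons, phenylalanine (F) with two, and only methionine (M) and tryptophan (W) with one each.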
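The same mRNA sequence can in principle be read in three different reading frames depending on where reading begins. For example, the sequence AUGCGAUUC can be parsed as AUG CGA UUC (frame 1), A UGC GAU UC (frame 2) or AU GCG AUU C (frame 3). Only frame 1 gives the intended Met-Arg-Phe.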
The start codon (AUG) sets the correct reading frame. The ribosome scans the mRNA from the 5′ cap looking for the first AUG in a favourable context (Kozak consensus); once found, translation locks in and the codons are read in non-overlapping triplets thereafter. An insertion or deletion mutation shifts the reading frame downstream of the lesion — a frameshift mutation — and typically generates a nonsensical amino-acid sequence followed by an early stop codon.
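The three reading frames of AUGCGAUUC, and the effect of a single-base insertion, can be sketched as follows. This is a simplified illustration that assumes the standard genetic code and ignores start-codon scanning. The functions `split_codons` and `translate` and the two example sequences are invented for the demonstration.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def split_codons(mrna, frame=0):
    """Split mRNA into complete, non-overlapping codons starting at offset `frame`."""
    return [mrna[i:i + 3] for i in range(frame, len(mrna) - 2, 3)]


def translate(mrna, frame=0):
    """Translate codon by codon, stopping at the first stop codon."""
    protein = []
    for codon in split_codons(mrna, frame):
        aa = CODON_TABLE[codon]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Same sequence, three frames, three different readings
for frame in range(3):
    print(frame, split_codons("AUGCGAUUC", frame), translate("AUGCGAUUC", frame))

# Hypothetical example gene: Met-Ala-Asn-Gly-Phe-stop
ORIGINAL = "AUGGCUAACGGAUUCUAA"
# Insert a single U straight after the start codon to shift the frame
MUTANT = ORIGINAL[:3] + "U" + ORIGINAL[3:]

print(translate(ORIGINAL))  # MANGF
print(translate(MUTANT))    # MC (a premature UAA appears in the shifted frame)
```

Every codon downstream of the insertion is misread, and here the shifted frame reaches a stop codon after only two amino acids.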
Francis Crick proposed that the third base of a codon (the 3′ position) makes a less stringent geometric pairing with the corresponding first base of the tRNA anticodon (the 5′ position of the anticodon). This "wobble" allows a single tRNA to recognise multiple codons differing only in the third base — pairing rules such as G:U and inosine:U/C/A become permissible at this position.
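The pairing logic can be written as a small lookup. The sketch below uses a simplified version of Crick's wobble rules (strict Watson-Crick pairing at the first two codon positions, relaxed pairing at the third). Real tRNAs carry additional modified bases that are not modelled here, and the function name `codons_read_by` is hypothetical.

```python
# Strict Watson-Crick pairs, used for codon positions 1 and 2
STRICT_PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Simplified wobble rules: anticodon 5' base -> codon 3' bases it can pair with
# ("I" is inosine)
WOBBLE_PAIRS = {"C": "G", "A": "U", "U": "AG", "G": "CU", "I": "UCA"}


def codons_read_by(anticodon):
    """Return the codons (5'->3') recognised by an anticodon written 5'->3'.

    Pairing is antiparallel: the anticodon's 5' base meets the codon's 3' base.
    Inosine is only supported at the wobble (5') position of the anticodon.
    """
    wobble_base, middle, last = anticodon
    first = STRICT_PAIRS[last]     # codon position 1 pairs with anticodon position 3
    second = STRICT_PAIRS[middle]  # codon position 2 pairs with anticodon position 2
    return [first + second + third for third in WOBBLE_PAIRS[wobble_base]]


print(codons_read_by("IGC"))  # ['GCU', 'GCC', 'GCA'] (three alanine codons)
print(codons_read_by("GAA"))  # ['UUC', 'UUU'] (both phenylalanine codons)
print(codons_read_by("CAU"))  # ['AUG'] (methionine only)
```

One inosine-containing anticodon covers three of the four alanine codons, which is why a cell needs far fewer tRNA species than there are sense codons.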
Eukaryotic genes are split into alternating coding and non-coding regions:
flowchart TD
A["Eukaryotic gene (DNA)"] --> B["Exon 1 — Intron 1 — Exon 2 — Intron 2 — Exon 3"]
B --> C["Transcription"]
C --> D["Pre-mRNA: all exons + all introns"]
D --> E["Splicing (spliceosome): introns excised"]
E --> F["Mature mRNA: Exon 1 — Exon 2 — Exon 3"]
F --> G["Translation → polypeptide"]
Key features:
Key Definition: Alternative splicing is the process by which different combinations of exons from the same gene are joined during mRNA processing, producing different mRNA molecules and therefore different polypeptides from a single gene.
The human tropomyosin gene contains 11 exons. Different tissue types express different splice variants of the same pre-mRNA:
A single gene therefore yields tissue-specific protein products. This expands the proteome (~100,000 proteins) well beyond the gene count (~20,000), explaining how complex organisms can be built from a relatively modest number of genes. About 95% of human multi-exon genes undergo some form of alternative splicing, and many of the resulting variants differ functionally in ways that contribute to tissue specialisation.
A gene with five exons, of which the three internal exons can each be independently included or excluded, has 2³ = 8 possible mature mRNA variants. Genes with mutually exclusive exon clusters (like Drosophila Dscam) can theoretically generate tens of thousands of variants from a single locus. Alternative splicing therefore acts as a "combinatorial amplifier" on the gene count, and is a major reason that the modest 20,000-gene human genome can give rise to such phenotypic complexity.
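The combinatorics can be made concrete by enumerating the variants. In this sketch the exon labels E1 to E5 and the function `splice_variants` are hypothetical, and the Dscam cluster sizes are the commonly cited figures rather than values taken from this lesson.

```python
from itertools import product
from math import prod


def splice_variants(exons, optional):
    """Enumerate mature mRNAs when each exon in `optional` is independently kept or skipped."""
    optional_list = [e for e in exons if e in optional]
    variants = []
    for choices in product([True, False], repeat=len(optional_list)):
        keep = dict(zip(optional_list, choices))
        # Constitutive exons are always kept; optional ones follow this combination
        variants.append(tuple(e for e in exons if keep.get(e, True)))
    return variants


EXONS = ["E1", "E2", "E3", "E4", "E5"]
VARIANTS = splice_variants(EXONS, optional={"E2", "E3", "E4"})

print(len(VARIANTS))  # 8, i.e. 2 ** 3
for variant in VARIANTS:
    print("-".join(variant))

# Mutually exclusive clusters multiply instead: one exon is chosen from each cluster.
# Commonly cited Drosophila Dscam cluster sizes (an assumption, not from the lesson).
DSCAM_CLUSTER_SIZES = [12, 48, 33, 2]
print(prod(DSCAM_CLUSTER_SIZES))  # 38016
```

Independent optional exons double the variant count with each exon added, whereas mutually exclusive clusters multiply the cluster sizes together.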
The Human Genome Project (completed 2003) and subsequent annotation revealed: