The ability to read, compare and analyse DNA has transformed biology, medicine and forensic science. OCR A-Level Biology A specification 6.1.3 (a)–(e) requires you to understand how scientists manipulate genomes — beginning with the techniques used to profile individuals, amplify minute samples, and sequence entire genomes. This opening lesson of Module 6.1.3 sets out the molecular toolkit underlying the whole chapter, from the polymerase chain reaction that copies DNA exponentially to the next-generation sequencers that can read a human genome in under a day.
Key Definitions:
- DNA profiling — the process of producing an image of the patterns in a person's DNA (their "genetic fingerprint").
- VNTR (Variable Number Tandem Repeat) — a non-coding DNA sequence repeated in tandem, with the number of repeats varying between individuals.
- STR (Short Tandem Repeat) — a shorter VNTR (2–6 bp repeat units), used routinely in modern forensic DNA profiling.
- PCR (Polymerase Chain Reaction) — an in vitro technique for amplifying a specific DNA sequence into millions of copies.
- Gel electrophoresis — a technique that separates DNA fragments by size using an electric field.
- Sequencing — determining the order of nucleotide bases in a DNA molecule.
- Bioinformatics — the use of computing to store, retrieve and analyse biological data, particularly genome and protein sequences.
Over 99% of the human genome is identical between individuals, yet the remaining <1% contains enough variation to identify a person uniquely (apart from monozygotic twins). Much of this variation lies in non-coding DNA, particularly in regions of repeated sequences. OCR wants you to understand that these non-coding regions — once dismissed as "junk DNA" — are precisely what make profiling possible, because the number of repeats at each locus varies so widely between people.
DNA profiling is used for:
DNA must first be extracted from a cell sample (blood, saliva, hair root, semen, bone). Cells are lysed with detergent to dissolve membranes, proteins are digested with protease, and DNA is precipitated with cold ethanol. Modern forensic kits automate this using silica columns that bind DNA while contaminants wash through.
Very small samples (a single hair, a speck of dried blood) contain too little DNA to analyse directly. PCR solves this by copying a target sequence exponentially. Each cycle doubles the amount of DNA, so 30 cycles produce over a billion copies from a single molecule.
flowchart TD
A[Sample DNA + primers + nucleotides + Taq polymerase] --> B[Denaturation 95 degrees C]
B --> C[Annealing 55 degrees C primers bind]
C --> D[Extension 72 degrees C Taq synthesises new strand]
D --> E{Repeat 25-35 cycles}
E -->|yes| B
E -->|no| F[Millions of copies]
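A PCR reaction contains the sample DNA (the template), primers, free nucleotides and Taq polymerase. Each cycle passes through three temperature stages: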
| Stage | Temperature | What happens |
|---|---|---|
| Denaturation | 95 °C | Hydrogen bonds break; DNA strands separate |
| Annealing | 50–65 °C | Primers bind (hybridise) to complementary sequences flanking the target |
| Extension | 72 °C | Taq polymerase extends the primer, synthesising a new strand 5' → 3' |
Exam Tip: OCR often asks why Taq polymerase is used rather than human DNA polymerase. Give two reasons: (1) Taq is not denatured at 95 °C, so it survives repeated heating cycles, and (2) its optimum is about 72 °C, matching the extension temperature.
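After 30 cycles, a single starting molecule theoretically becomes 2³⁰ ≈ 10⁹ copies. In practice each cycle falls slightly short of perfect doubling, and the reaction levels off as primers and nucleotides are used up. Because a real sample usually contains many template molecules rather than one, billions of copies are still routine.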
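The doubling arithmetic can be checked with a short Python sketch. The function name `pcr_copies` and the `efficiency` parameter (the fraction of templates copied in each cycle) are illustrative assumptions, not part of the OCR specification, and the model ignores the plateau that real reactions reach.

```python
def pcr_copies(start_molecules: int, cycles: int, efficiency: float = 1.0) -> float:
    """Expected number of copies after a given number of PCR cycles.

    efficiency is the fraction of templates copied per cycle:
    1.0 means perfect doubling, 0.9 means 90% of strands are copied.
    This is a simplified model that ignores the late-cycle plateau.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return start_molecules * (1.0 + efficiency) ** cycles


# Ideal doubling from one molecule over 30 cycles: 2**30 = 1073741824 copies.
print(pcr_copies(1, 30))

# The same run at an assumed 90% efficiency gives noticeably fewer copies.
print(pcr_copies(1, 30, efficiency=0.9))
```

With perfect doubling the result is 2³⁰. Lowering the assumed efficiency shows why the real yield from a single molecule falls below the theoretical figure.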
Amplified DNA fragments are separated by size using gel electrophoresis. The gel is a mesh of agarose (for large fragments) or polyacrylamide (for finer resolution). DNA is negatively charged because of its phosphate backbone, so when an electric field is applied, fragments migrate towards the anode (positive electrode). Smaller fragments move faster through the gel matrix, so after a set time fragments are separated by size — small at the far end, large near the wells.
The DNA is visualised by staining (e.g. with ethidium bromide or SYBR green) and viewing under UV light, or by using fluorescently labelled primers that appear in different colours.
| Component | Function |
|---|---|
| Agarose gel | Porous matrix that separates fragments by size |
| Buffer (TAE or TBE) | Maintains pH and conducts current |
| Loading dye | Weighs down sample, tracks migration |
| DNA ladder | Fragments of known size for comparison |
| Power supply | Creates the electric field |
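Fragment sizes are read off a gel by comparing each band with the DNA ladder. The sketch below assumes the common approximation that the logarithm of fragment size falls roughly linearly with migration distance, and interpolates between the two nearest ladder bands. The ladder values and the function name `estimate_size_bp` are hypothetical, chosen only for illustration.

```python
import math


def estimate_size_bp(distance_mm, ladder):
    """Estimate a fragment's size from how far it migrated.

    ladder is a list of (size_bp, distance_mm) pairs for bands of known size.
    Assumes log10(size) changes linearly with distance between neighbouring
    ladder bands, which is only an approximation for real gels.
    """
    bands = sorted(ladder, key=lambda band: band[1])  # nearest the wells first
    for (size_a, dist_a), (size_b, dist_b) in zip(bands, bands[1:]):
        if dist_a <= distance_mm <= dist_b:
            fraction = (distance_mm - dist_a) / (dist_b - dist_a)
            log_a, log_b = math.log10(size_a), math.log10(size_b)
            return 10 ** (log_a + fraction * (log_b - log_a))
    raise ValueError("distance lies outside the range covered by the ladder")


# Hypothetical ladder: larger fragments stay closer to the wells.
example_ladder = [(1000, 10.0), (500, 20.0), (250, 30.0), (125, 40.0)]

# A band halfway between the 500 bp and 250 bp markers.
print(round(estimate_size_bp(25.0, example_ladder)))  # 354
```

The estimate is not the simple average of 500 and 250 because migration depends on the logarithm of size, not on size itself.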
Early DNA profiling (developed by Sir Alec Jeffreys at Leicester in 1984) used VNTRs: long tandem repeats cut out by restriction enzymes and separated on a gel, producing a pattern of bands resembling a barcode. Modern forensic profiling uses STRs: shorter repeats (e.g. the sequence GATA repeated 6–15 times). The UK National DNA Database uses the DNA-17 system, which examines 16 STR loci plus amelogenin (for sex determination). The chance that two unrelated people share an identical profile at all of these loci is less than 1 in a billion.
STRs are amplified by PCR using fluorescent primers, and the products are analysed by capillary electrophoresis — a high-resolution form of gel electrophoresis. Each locus produces one or two peaks (homozygous or heterozygous) on a readout called an electropherogram.
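An STR allele is named by its number of repeats, so "allele 8" means eight copies of the repeat unit in a row. As a minimal sketch, the function below counts the longest uninterrupted run of a repeat unit in a sequence. The example sequence and its flanking bases are made up, and real profiling measures fragment length by capillary electrophoresis rather than reading the sequence like this.

```python
import re


def count_tandem_repeats(sequence: str, motif: str = "GATA") -> int:
    """Return the number of repeat units in the longest unbroken run of motif."""
    pattern = re.compile(f"(?:{re.escape(motif)})+")
    runs = [match.group(0) for match in pattern.finditer(sequence.upper())]
    if not runs:
        return 0
    return max(len(run) for run in runs) // len(motif)


def peaks_at_locus(allele_a: int, allele_b: int) -> int:
    """One peak for a homozygous genotype, two for a heterozygous one."""
    return 1 if allele_a == allele_b else 2


# Hypothetical allele: made-up flanking bases around 8 copies of GATA.
example_allele = "ACGT" + "GATA" * 8 + "TTCC"
print(count_tandem_repeats(example_allele))  # 8
print(peaks_at_locus(8, 10))                 # 2 (heterozygous)
```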
Sequencing determines the order of bases in a DNA molecule. The original technique, developed by Fred Sanger in 1977, is still used for short reads (up to about 900 bp).
Principle: a mixture of normal dNTPs and a small proportion of fluorescently labelled ddNTPs (dideoxynucleotides) is added to a PCR-like reaction. Whenever a ddNTP is incorporated, chain extension stops because it lacks the 3' OH needed for the next bond. Over millions of molecules, every possible stopping point is represented by fragments of different lengths, each labelled by colour according to the terminating base.
The fragments are separated by capillary electrophoresis and a laser reads the colour of each peak as it passes. The resulting chromatogram gives the sequence directly.
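The logic of reading a Sanger result can be sketched in a few lines: sort the terminated fragments from shortest to longest and read off the terminating base of each. The fragment list below is invented for illustration, and lengths are counted from the end of the primer (an assumption made to keep the example small).

```python
def read_sanger_fragments(fragments):
    """Reconstruct the newly synthesised strand from terminated fragments.

    fragments is an iterable of (length_in_bases, terminating_base) pairs,
    one for each chain-terminated product. Shorter fragments pass the
    detector first, so sorting by length gives the bases in 5' to 3' order.
    """
    ordered = sorted(fragments, key=lambda fragment: fragment[0])
    return "".join(base for _, base in ordered)


# Hypothetical fragments, listed in no particular order.
example_fragments = [(3, "T"), (1, "G"), (5, "C"), (2, "A"), (4, "T")]
print(read_sanger_fragments(example_fragments))  # GATTC
```

The sequence obtained is that of the newly synthesised strand, which is complementary to the template strand being sequenced.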
Sanger sequencing is slow and expensive at the genome scale. Next-generation sequencing — also called high-throughput sequencing — reads millions of short fragments in parallel. A typical Illumina run produces hundreds of billions of bases in a single day. The Human Genome Project took 13 years and $3 billion (1990–2003) using Sanger methods; today an entire human genome can be sequenced for under £500.
Key advantages of NGS over Sanger:
The flood of sequence data from NGS could not be analysed without bioinformatics — the computational handling of biological data. OCR specifically asks you to understand why bioinformatics is needed.
Bioinformatics is used to:
Genomics is the study of whole genomes; proteomics is the study of the complete set of proteins (the proteome). The proteome is much more complex than the genome: gene expression varies between tissues and over time, and a single gene can give rise to more than one protein through alternative splicing and post-translational modification. Understanding the proteome is therefore an important goal for personalised medicine.
At a single STR locus, a mother has alleles 8, 10 (meaning 8 and 10 repeats), a child has alleles 10, 13, and the alleged father has alleles 13, 14. Could he be the father?
The child inherited allele 10 from the mother (consistent with her 8, 10 genotype), so allele 13 must have come from the biological father. This is the obligate paternal allele. The alleged father carries allele 13, so he is consistent with paternity at this locus. One locus alone proves little, because many unrelated men also carry allele 13. In a real test, 15 or more loci would be examined; a man is excluded if he lacks the obligate paternal allele at any one locus, and paternity is strongly supported if he matches at all of them.
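The reasoning in this worked example can be written as a small Python check. The function names are hypothetical, genotypes are given as pairs of repeat numbers, and the sketch covers a single locus only. It ignores mutation and does not calculate any probability of paternity.

```python
def obligate_paternal_alleles(mother, child):
    """Alleles at one locus that the child must have received from the father.

    mother and child are (allele, allele) pairs of repeat numbers.
    """
    child_a, child_b = child
    candidates = set()
    if child_a in mother:
        candidates.add(child_b)  # child_a can be maternal, so child_b is paternal
    if child_b in mother:
        candidates.add(child_a)  # child_b can be maternal, so child_a is paternal
    if not candidates:
        raise ValueError("child shares no allele with the mother at this locus")
    return candidates


def father_not_excluded(mother, child, alleged_father):
    """True if the alleged father carries a possible paternal allele."""
    paternal = obligate_paternal_alleles(mother, child)
    return any(allele in alleged_father for allele in paternal)


# The genotypes from the worked example above.
print(obligate_paternal_alleles((8, 10), (10, 13)))      # {13}
print(father_not_excluded((8, 10), (10, 13), (13, 14)))  # True
```

In a full test the same check would be repeated at every locus, and a single `False` would exclude the man.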
Reference: OCR A-Level Biology A (H420) specification 6.1.3 (a)–(e).