Genetic Fingerprinting and Forensics

Genetic fingerprinting (also called DNA profiling) is a technique used to identify individuals on the basis of variation at highly polymorphic regions of the genome. Except for monozygotic twins (whose genome sequences are essentially identical), no two people are expected to share the same DNA profile at the resolution of modern STR-based systems. The technique has become indispensable in forensic science, paternity testing, immigration casework, disaster victim identification, conservation genetics and archaeology. The molecular basis is variation in repeat-DNA copy number; the practical technology depends almost entirely on the PCR and gel electrophoresis methods introduced in Lesson 5.

Spec mapping: This lesson sits in AQA 7402 Section 3.8.4 — Gene technologies allow the study and alteration of gene function. The specification expects candidates to describe the principle of genetic fingerprinting (DNA probes hybridising with VNTR/STR loci, separation by gel electrophoresis, generation of a unique banding pattern) and to evaluate applications and ethical considerations. (Refer to the official AQA specification document for exact wording.)

By the end of this lesson you should be able to: explain why non-coding tandem-repeat loci are so much more variable between individuals than coding sequences; describe the modern multiplex-PCR-plus-capillary-electrophoresis profiling workflow step by step; contrast it with the original Southern-blot VNTR method; calculate a multilocus random-match probability from per-locus frequencies; apply allele inheritance to a paternity problem; and evaluate the ethical, legal and statistical limits of DNA evidence.

Worked Example — Why the Match Probability Is So Small

Students often quote a figure such as "one in a billion" for a DNA match without understanding where it comes from. The number is not a property of any single locus; it is the product of the frequencies at many independent loci, and working one through makes the statistical logic concrete.

The reasoning rests on two population-genetics ideas that are examined synoptically with inheritance and populations. First, at a single locus, if an allele has frequency $p$ in the population, then under Hardy–Weinberg expectations the frequency of a homozygous genotype is $p^{2}$, and the frequency of a heterozygous genotype carrying two alleles of frequencies $p$ and $q$ is $2pq$. Second, because the STR loci used in profiling lie on different chromosomes (or far apart on the same one), the genotypes at different loci are inherited essentially independently, so their frequencies multiply; this is the assumption of linkage equilibrium.
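The single-locus rule can be written as a few lines of Python. This is a minimal sketch: the function name and the allele frequencies are made up for illustration and are not taken from any real population database.

```python
from typing import Optional


def genotype_frequency(p: float, q: Optional[float] = None) -> float:
    """Expected genotype frequency under Hardy-Weinberg assumptions.

    p: population frequency of the first allele.
    q: frequency of the second allele; leave it out for a homozygote.
    """
    if q is None:
        return p * p      # homozygote: p squared
    return 2 * p * q      # heterozygote: 2pq


# Illustrative allele frequencies only (assumed values, not real data).
homozygote = genotype_frequency(0.2)
heterozygote = genotype_frequency(0.2, 0.25)
print(f"homozygote:   {homozygote:.3f}")
print(f"heterozygote: {heterozygote:.3f}")
```

Note that the heterozygote is more common than either homozygote would be at the same allele frequencies, because there are two ways to inherit the pair (either allele from either parent).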

Take a simplified profile at four loci, and suppose the individual's genotype frequency works out at each locus as follows: locus one, a heterozygote at frequency $0.10$; locus two, a heterozygote at $0.08$; locus three, a homozygote at $0.04$; locus four, a heterozygote at $0.05$. The probability that a random unrelated person shares this exact four-locus genotype is the product:

$0.10 \times 0.08 \times 0.04 \times 0.05 = 1.6 \times 10^{-5}$

That is roughly a one-in-sixty-thousand chance (1 in 62,500): already small, but not yet compelling on its own, because a modest-sized city holds more people than that, so coincidental matches would be expected in any large population. The power of a real forensic panel comes from adding loci. The UK DNA-17 system multiplies genotype frequencies across all of its STR loci; because each locus multiplies the running product by a small fraction, the combined random-match probability typically falls below $10^{-15}$. That is far smaller than the reciprocal of the world's population (about $10^{-10}$), so that in practice only a monozygotic twin would be expected to share the full profile.
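The multiplication in the worked example can be checked with a short Python sketch. The function name is an illustrative choice; the four frequencies are the ones used in the example above.

```python
import math


def random_match_probability(genotype_freqs):
    """Multiply per-locus genotype frequencies.

    Only valid if the loci are independent (linkage equilibrium).
    """
    return math.prod(genotype_freqs)


four_loci = [0.10, 0.08, 0.04, 0.05]   # values from the worked example
rmp = random_match_probability(four_loci)
print(f"RMP = {rmp:.2e}, about 1 in {round(1 / rmp):,}")
```

Appending further frequencies to the list shows how quickly the product shrinks as loci are added.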

Two cautions belong in any top-band answer that quotes such a figure. First, the multiplication is only valid if the loci really are independent and the population frequencies are correct for the relevant sub-population; close relatives share alleles, so the random-match probability for a sibling is much higher than for an unrelated person. Second, the tiny random-match probability is not the same as the probability that the suspect is innocent (confusing the two is known as the prosecutor's fallacy): laboratory contamination, sample mix-ups and mixture-interpretation errors occur at rates that can far exceed $10^{-15}$, so even in a well-run case the dominant source of doubt is procedural error, not coincidence. Communicating both numbers honestly is part of the ethical use of DNA evidence.
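The second caution can be made concrete with a rough calculation. In the sketch below the error rate is a hypothetical placeholder chosen only to show the effect, not a measured laboratory figure, and the two sources of error are assumed to be independent.

```python
def chance_of_false_inclusion(rmp, error_rate):
    """Chance that an innocent person is reported as a match, either by
    coincidence (rmp) or through a procedural error (error_rate),
    treating the two as independent events."""
    return rmp + error_rate - rmp * error_rate


RMP = 1e-15          # order of magnitude quoted in the text
ERROR_RATE = 1e-4    # hypothetical placeholder, NOT a real laboratory figure

combined = chance_of_false_inclusion(RMP, ERROR_RATE)
print(f"combined chance of a false inclusion: {combined:.3e}")
print(f"fraction attributable to coincidence: {RMP / combined:.1e}")
```

Whatever plausible error rate is substituted, the combined figure is essentially the error rate itself, which is why both numbers need to be reported.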

The Molecular Basis of Genetic Fingerprinting

Variable Regions in the Genome

Although approximately 99.9% of human DNA sequence is identical between unrelated individuals, the remaining 0.1% contains highly variable regions. Much of this variation falls in repetitive DNA — regions where a short sequence of bases is repeated multiple times in tandem (head-to-tail).

These tandem-repeat regions are far more variable between individuals than coding sequences because mutations that add or remove repeat units (by slippage during replication or by unequal crossing over during meiosis) accumulate freely there — they typically have no functional consequence, so selection does not eliminate them. The result is that the number of repeats at any given locus differs widely between individuals.

Short Tandem Repeats (STRs)

Key Definition: Short tandem repeats (STRs) are sequences of 2–6 base pairs that are repeated a variable number of times at specific loci in the genome. The number of repeats at each locus varies between individuals.

Also known as microsatellites.
Example: the STR sequence "AGAT" might be repeated 8 times in one individual, 12 times in another, and 15 times in a third — producing alleles of different lengths (32 bp, 48 bp, 60 bp respectively at this locus).
STRs are the basis of modern DNA profiling systems. The UK DNA-17 system analyses 16 STR loci plus a sex-determining marker (amelogenin, which distinguishes the X and Y chromosomes), giving 17 markers in total.
Each person has two alleles at each locus (one inherited from each parent). The number of repeats at each locus varies, so the combined profile across multiple loci is effectively unique to each individual (the probability that two unrelated individuals share the full multilocus profile is less than 1 in 10⁹).
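The arithmetic behind the AGAT example, and the idea of counting repeats in a sequence, can be sketched in Python. The flanking bases in the test sequence are invented for illustration, and real allele sizes also include the flanking DNA between the primers.

```python
import re

REPEAT_UNIT = "AGAT"   # the tetranucleotide unit from the example


def repeat_region_length(n_repeats, unit=REPEAT_UNIT):
    """Length in bp of the repeat block alone (flanking DNA not included)."""
    return n_repeats * len(unit)


def longest_tandem_run(sequence, unit=REPEAT_UNIT):
    """Number of repeat units in the longest uninterrupted tandem run."""
    runs = re.findall(f"(?:{re.escape(unit)})+", sequence)
    return max((len(run) // len(unit) for run in runs), default=0)


for n in (8, 12, 15):
    print(n, "repeats ->", repeat_region_length(n), "bp")   # 32, 48, 60 bp

# Invented flanking bases around an 8-repeat allele.
allele = "CCG" + REPEAT_UNIT * 8 + "TTC"
print(longest_tandem_run(allele))   # 8
```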

Variable Number Tandem Repeats (VNTRs)

VNTRs are longer repeat sequences (10–60 base pairs per repeat unit), also known as minisatellites.
VNTRs were used in the original DNA fingerprinting technique developed by Sir Alec Jeffreys in 1984 at the University of Leicester. Jeffreys's discovery — that VNTR-rich probes detect a pattern of bands on a Southern blot that is individual-specific — gave birth to the field. (His framework is paraphrased here — no verbatim quotation from his 1984/85 Nature papers is reproduced.)
VNTRs have largely been replaced by STRs in modern forensic analysis because STRs are smaller (typically <500 bp products vs ~1–20 kb for VNTRs) and therefore more amenable to PCR amplification from degraded or trace samples, and to automated capillary electrophoresis.

The Modern DNA Profiling Process (Using STRs)

Step 1 — DNA extraction

DNA is extracted from the biological sample (blood, saliva, semen, hair root, epithelial cells from a fingernail scraping, bone, etc.).
Even very small samples (single cells) and partially degraded samples can yield a profile, because PCR can amplify even single-template molecules. Forensic protocols routinely process samples containing only picograms of DNA.

Step 2 — PCR amplification

Specific STR loci are amplified using PCR (Lesson 5) with primers flanking each STR region.
Modern systems use multiplex PCR, which amplifies multiple STR loci simultaneously in a single reaction — different primer pairs target different loci, all working in parallel under the same temperature programme.
Each primer pair is labelled with a fluorescent dye of a specific colour, so the products from different loci can be distinguished after electrophoresis (a single capillary run can produce data from 17+ loci using 4–5 different dye colours).

Step 3 — Separation by capillary electrophoresis

The amplified STR fragments are separated by capillary electrophoresis — a modern, high-resolution variant of gel electrophoresis.
Fragments migrate through a thin capillary (~50 µm internal diameter, ~50 cm long) filled with a polymer at high voltage; small fragments migrate faster, the same principle as agarose gel electrophoresis but with single-base resolution.
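A laser near the end of the capillary excites the fluorescent dyes; a detector reads the fluorescence and records the size of each fragment as it passes (which gives the number of repeats) and its colour. Because several loci share each dye, the locus is identified from the dye colour together with the size range in which the fragment falls.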
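Converting a measured fragment size into a repeat number is simple arithmetic once the length of the flanking DNA in the amplicon is known. The sketch below uses an entirely hypothetical two-locus panel (the locus names, dyes and flank lengths are assumptions for illustration) and ignores partial-repeat alleles, which real systems have to handle.

```python
# Hypothetical panel: locus -> (dye colour, flanking bp in the amplicon, repeat unit bp)
PANEL = {
    "LocusA": ("blue", 100, 4),
    "LocusB": ("green", 150, 4),
}


def repeats_from_fragment(locus, fragment_bp):
    """Repeat number implied by a fragment size at a given locus."""
    _dye, flank_bp, unit_bp = PANEL[locus]
    repeat_bp = fragment_bp - flank_bp
    if repeat_bp < 0 or repeat_bp % unit_bp != 0:
        raise ValueError("size is not a whole number of repeats at this locus")
    return repeat_bp // unit_bp


print(repeats_from_fragment("LocusA", 132))   # (132 - 100) / 4 = 8 repeats
print(repeats_from_fragment("LocusB", 198))   # (198 - 150) / 4 = 12 repeats
```

This also shows why single-base resolution matters: alleles at a tetranucleotide locus differ by only 4 bp.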

Step 4 — Generating the DNA profile

The results are displayed as an electropherogram — a graph showing peaks at specific positions corresponding to the size of each STR allele at each locus.
Each locus shows either one peak (if the individual is homozygous — same number of repeats on both chromosomes) or two peaks (if heterozygous — different numbers of repeats on each chromosome).
The pattern of peaks across all loci constitutes the individual's DNA profile.
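The one-peak or two-peak rule can be expressed as a small Python function. This is a simplified sketch: the function name is illustrative, and it ignores stutter peaks and peak-height thresholds, which real interpretation software must take into account.

```python
def call_genotype(peak_alleles):
    """Turn the allele peaks seen at one locus into a genotype call.

    peak_alleles: repeat numbers of the peaks detected at the locus.
    Returns (genotype, zygosity).
    """
    alleles = sorted(set(peak_alleles))
    if len(alleles) == 1:
        return (alleles[0], alleles[0]), "homozygous"
    if len(alleles) == 2:
        return (alleles[0], alleles[1]), "heterozygous"
    # Zero peaks, or three or more, cannot come from one clean single-source sample.
    raise ValueError("expected one or two allele peaks (possible dropout or mixture)")


print(call_genotype([10]))      # one peak
print(call_genotype([12, 8]))   # two peaks
```

Three or more peaks at a locus are a common sign that the sample contains DNA from more than one person.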

Mermaid: the modern DNA profiling workflow

flowchart TD
  A["Biological sample (blood, saliva, hair, skin cells)"] --> B["DNA extraction"]
  B --> C["Multiplex PCR: DNA-17 panel (16 STR loci + amelogenin), primers fluorescently labelled"]
  C --> D["Capillary electrophoresis: fragments separated by size"]
  D --> E["Laser detection of fluorescent peaks"]
  E --> F["Electropherogram: alleles at each STR locus"]
  F --> G["Compare profile with reference / suspect / database / parent"]
  G --> H["Match probability calculated → identity / paternity / exclusion"]

Calculating match probability

If the allele frequencies at each STR locus are known from population databases, the probability of a random unrelated individual sharing the full multilocus profile can be calculated by multiplying the per-locus genotype frequencies (assuming linkage equilibrium and Hardy–Weinberg equilibrium).

With a full DNA-17 panel, typical match probabilities are below 10⁻¹⁵: vanishingly small, and far smaller than the reciprocal of the world's population (roughly 10⁻¹⁰). This is the statistical foundation of forensic DNA evidence.
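To see why a panel needs well over a dozen loci, the sketch below counts how many loci are required before the running product drops below a target value. The average per-locus genotype frequency of 0.08 is an assumed illustrative figure, not a measured one.

```python
def loci_needed(avg_genotype_freq, target_rmp):
    """Smallest number of independent loci whose combined genotype
    frequency falls below target_rmp, assuming every locus has the
    same (average) genotype frequency."""
    if not 0 < avg_genotype_freq < 1:
        raise ValueError("frequency must lie strictly between 0 and 1")
    n, product = 0, 1.0
    while product >= target_rmp:
        product *= avg_genotype_freq
        n += 1
    return n


AVG_FREQ = 0.08   # assumed illustrative per-locus genotype frequency
print(loci_needed(AVG_FREQ, 1e-15))
print(f"{AVG_FREQ ** 16:.1e}")   # product after 16 loci at this frequency
```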

Southern Blotting (the Original Jeffreys Method)

The original DNA fingerprinting method, devised by Alec Jeffreys and his colleagues at Leicester, used Southern blotting to detect VNTRs. Although largely superseded by PCR-based methods, Southern blotting remains an important benchmark for understanding the principles and is still examined.

Procedure

Restriction enzyme digestion: genomic DNA is cut with a restriction enzyme (e.g. HaeIII) that has frequent recognition sites flanking the VNTR repeat blocks but does not cut within them. The length of each fragment containing a VNTR therefore depends on the number of repeat units — an individual with more repeats has a longer fragment.
Gel electrophoresis: the DNA fragments are separated by size on an agarose gel.
Transfer (blotting): the DNA in the gel is first denatured to single strands (typically with alkali) so that it can later base-pair with a probe. The separated fragments are then transferred from the gel onto a nylon (or nitrocellulose) membrane by capillary action (the original method of Edwin Southern, after whom the Southern blot is named) or by electroblotting. This creates a permanent copy of the fragment pattern on a solid support.
Hybridisation with a probe: the membrane is incubated with a labelled DNA probe — a single-stranded DNA sequence complementary to the repetitive sequence of interest. The probe is labelled with a radioactive isotope (³²P) or a fluorescent marker. The probe binds (hybridises) to complementary fragments on the membrane by base pairing.
Detection: excess probe is washed away. The position of the bound probe is detected by exposing the membrane to X-ray film (autoradiography) or using a fluorescence detector. The result is a pattern of bands — the DNA fingerprint.
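The link between repeat number and band position can be sketched numerically. The repeat-unit length, flank length and allele repeat counts below are hypothetical values chosen for illustration; the flank stands for the DNA between the restriction sites and the repeat block.

```python
def vntr_fragment_length(n_repeats, unit_bp, flank_bp):
    """Restriction fragment length: fixed flanks plus the repeat block."""
    return flank_bp + n_repeats * unit_bp


def band_pattern(alleles, unit_bp, flank_bp):
    """Fragment lengths for one locus, largest first.

    Larger fragments migrate more slowly, so the first entry is the
    band that stays nearest the well.
    """
    lengths = (vntr_fragment_length(n, unit_bp, flank_bp) for n in alleles)
    return sorted(lengths, reverse=True)


# Hypothetical heterozygote: 40 and 75 copies of a 30 bp unit, 500 bp of flank.
print(band_pattern((40, 75), unit_bp=30, flank_bp=500))   # [2750, 1700]
```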

Comparison: Southern blotting vs modern STR profiling

Feature	Southern blot (Jeffreys, 1984)	Modern STR / capillary electrophoresis
Repeat type detected	VNTRs (minisatellites, 10-60 bp units)	STRs (microsatellites, 2-6 bp units)
Starting DNA required	Microgram quantities	Picogram quantities (single cells)
Sensitivity to degraded DNA	Poor (large fragments needed)	Good (small fragments amplify easily)
Time to result	1-3 weeks	1-2 days
Automation	Manual	Fully automated, multiplex
Throughput	Few samples per week	Thousands of samples per day
Detection sensitivity	Limited by autoradiography	Sub-femtomolar by fluorescence

Applications of Genetic Fingerprinting

Forensic identification

DNA profiles from crime scene evidence (blood, semen, saliva, hair root, skin cells under fingernails, "touch DNA" from contact surfaces) are compared with profiles from suspects and with the national DNA database.
In the UK, the National DNA Database (NDNAD) — established 1995 — holds profiles of individuals arrested for recordable offences. It is one of the largest forensic DNA databases in the world. Retention policy has evolved following European Court of Human Rights rulings; profiles of arrested but unconvicted individuals are typically deleted after a defined retention period.
DNA evidence can both incriminate and exonerate suspects. Indeed the first case in which Jeffreys's technique was applied (the murders of Lynda Mann and Dawn Ashworth in Leicestershire) initially excluded a man who had falsely confessed, before subsequently identifying the actual perpetrator.
The Innocence Project (USA) has used post-conviction DNA testing to exonerate large numbers of wrongly convicted individuals — particularly those convicted on the basis of mistaken eyewitness testimony.

Paternity and maternity testing

A child inherits one allele at each STR locus from each parent.
By comparing the child's profile with the alleged parent's profile, paternity or maternity can be confirmed or excluded.
At each locus, one of the child's alleles must match a maternal allele and the other must match a paternal allele. Mismatches at multiple loci exclude parentage (a single mismatch could reflect a new mutation), whereas a match at every locus tested yields a paternity index, calculated from population allele frequencies.
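One common simplified form of the per-locus paternity index divides the chance that the alleged father transmits the required (obligate) paternal allele by that allele's population frequency. The sketch below assumes the obligate allele is unambiguous at each locus, and the allele frequencies are invented for illustration.

```python
import math


def paternity_index(father_alleles, obligate_allele, allele_freq):
    """Simplified per-locus paternity index.

    Numerator: chance the alleged father passes on the obligate allele
    (0, 0.5 or 1 depending on how many copies he carries).
    Denominator: chance a random man passes it on (its population frequency).
    """
    copies = father_alleles.count(obligate_allele)
    return (copies / 2) / allele_freq


# (alleged father's alleles, obligate paternal allele, assumed allele frequency)
loci = [
    ((11, 12), 12, 0.10),
    ((9, 9), 9, 0.20),
    ((14, 16), 16, 0.25),
]
combined_index = math.prod(paternity_index(*locus) for locus in loci)
print(f"combined paternity index = {combined_index:.1f}")
```

An index of zero at a locus means the alleged father does not carry the allele the child must have inherited from the biological father.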

Worked Example — Paternity Analysis:

At a particular STR locus, a child has alleles with 8 and 12 repeats. The mother has alleles with 8 and 10 repeats. A potential father has alleles with 11 and 12 repeats. Is he likely to be the biological father?
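Working the example through: the child's 8-repeat allele is present in the mother, so the 12-repeat allele must have come from the biological father. The potential father carries a 12-repeat allele, so he is not excluded at this locus. One locus cannot confirm paternity on its own; the same check has to hold across the full panel. The logic can be sketched in Python (the function name is an illustrative choice):

```python
def consistent_with_parentage(child, mother, alleged_father):
    """True if one child allele can come from the mother and the other
    from the alleged father at this locus."""
    first, second = child
    return ((first in mother and second in alleged_father) or
            (second in mother and first in alleged_father))


child = (8, 12)
mother = (8, 10)
potential_father = (11, 12)

print(consistent_with_parentage(child, mother, potential_father))   # True

# A man with alleles 9 and 11 could not have supplied the 12-repeat allele.
print(consistent_with_parentage(child, mother, (9, 11)))            # False
```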
