α-Amino acids are the twenty molecular monomers from which every protein in every living organism is constructed — a chemical alphabet that, in different sequences, encodes the catalytic machinery of enzymes, the structural fabric of collagen and keratin, the oxygen-carrying capacity of haemoglobin, and the immune recognition of antibodies. In this lesson we develop the chemistry that underpins biology. We first establish the general α-amino acid structure H₂N–CH(R)–COOH and the consequence of carrying both an acidic and a basic group: the zwitterion. We then quantify acid-base behaviour through the isoelectric point pI, examine peptide-bond formation as a condensation reaction whose product is the amide linkage already familiar from lesson 2, and build the hierarchy of protein structure — primary, secondary, tertiary, quaternary — from sequence through hydrogen-bonded helices and sheets to multi-subunit assemblies. We close with the molecular structure of DNA: a pentose-phosphate backbone, four bases, and the Watson-Crick complementary base-pairing scheme (1953) that allows the molecule to copy itself.
Spec mapping (AQA 7405): This lesson maps to §3.3.13 (amino acids, proteins and DNA). It draws explicitly on §3.3.11 (lesson 5 of this course, amines) for the nucleophilic and basic chemistry of the –NH₂ group, on §3.3.9 (lesson 2 of this course, amides and acyl chlorides) for the chemistry of the peptide bond, and on §3.3.12 (lesson 7, condensation polymers), since proteins are biopolyamides built by the same condensation chemistry as nylon. Secondary and tertiary protein structure rely on hydrogen bonding, the principles of which are developed in §3.1.3 lesson 4 of the AQA AS bonding course. Refer to the official AQA specification document for the exact wording of each section.
Assessment objectives: AO1 recall items include the general α-amino acid structure, the definitions of zwitterion and isoelectric point, the four levels of protein structure, and the Watson-Crick complementary base-pairing scheme A=T and G≡C. AO2 application questions require students to draw the zwitterion form at a stated pH, to identify the peptide bond in a given dipeptide or tripeptide, and to assign the complementary base sequence of a short DNA strand. AO3 reasoning questions ask students to predict protein folding from a tabulated set of R-group properties, to explain why DNA forms a stable double helix in terms of the number and geometry of inter-strand hydrogen bonds, and to rationalise the difference in pI between an acidic, neutral, and basic amino acid using the side-chain pKa values.
A 2-amino acid (or α-amino acid) has the general structure H₂N–CH(R)–COOH: a central carbon (the α-carbon) bonded to a primary amine –NH₂, a carboxylic acid –COOH, a hydrogen atom, and a variable side chain R that defines the identity of the amino acid. Twenty α-amino acids are encoded by the standard genetic code; all share the same α-carbon framework and differ only in R.
The α-carbon carries four different substituents in nineteen of the twenty proteinogenic amino acids, making it a stereocentre. The exception is glycine (R = H), which is achiral. The remaining nineteen amino acids exist as enantiomers L and D; proteins are built exclusively from the L-enantiomer. In CIP nomenclature, L-amino acids correspond to (S)-configuration for eighteen of the nineteen; L-cysteine is formally (R) because the sulfur side-chain alters priority order.
| Side chain class | Examples (3-letter code) | R-group character |
|---|---|---|
| Aliphatic, non-polar | Gly, Ala, Val, Leu, Ile | Hydrophobic |
| Aromatic | Phe, Tyr, Trp | Hydrophobic (Tyr/Trp can H-bond) |
| Polar uncharged | Ser, Thr, Asn, Gln | H-bond donor/acceptor |
| Acidic (negative at pH 7) | Asp, Glu | –COOH side chain |
| Basic (positive at pH 7) | Lys, Arg, His | –NH₂, guanidinium or imidazole side chain |
| Sulfur-containing | Cys, Met | Cys forms disulfide bridges |
AQA do not require recall of all twenty structures, but students should be able to interpret data tables and recognise the side-chain category from a given structure.
The amine group is basic (its conjugate acid –NH₃⁺ has pKa ≈ 9-10 in an α-amino acid) and the carboxylic acid group is acidic (pKa ≈ 2-3, more acidic than a typical aliphatic carboxylic acid because the protonated amine cation withdraws electron density inductively). At any pH between roughly 3 and 9, a range that includes most biologically relevant pH values, an internal proton transfer occurs:
H₂N–CH(R)–COOH → ⁺H₃N–CH(R)–COO⁻
The product is a zwitterion: a dipolar ion with both a positive and a negative formal charge but a net charge of zero. The zwitterion is the dominant species in water and in cellular fluids. Amino acids exist as zwitterions in the solid state too, which is why crystalline amino acids have surprisingly high melting points (200-300 °C with decomposition) for their modest molecular mass: the solid is an ionic lattice held together by electrostatic attraction between –NH₃⁺ and –COO⁻ centres. Amino acids are also highly soluble in water and almost insoluble in non-polar solvents — both consequences of the zwitterion.
Adding strong acid (lowering pH) protonates the carboxylate; adding strong base (raising pH) deprotonates the ammonium. For a simple neutral amino acid such as alanine:
| pH region | Dominant species | Net charge |
|---|---|---|
| Very low (pH < 2) | ⁺H₃N–CH(R)–COOH | +1 |
| Intermediate (~pH 5-6) | ⁺H₃N–CH(R)–COO⁻ (zwitterion) | 0 |
| Very high (pH > 10) | H₂N–CH(R)–COO⁻ | −1 |
The two pKa values bracket the zwitterion region. For alanine, pKa1 = 2.34 (–COOH ⇌ –COO⁻) and pKa2 = 9.69 (–NH₃⁺ ⇌ –NH₂).
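As a rough illustration of the table above, the sketch below estimates the mole fraction of each form of alanine at a given pH from the two pKa values, using the standard equilibrium expressions for a diprotic acid. The function name `species_fractions` is an illustrative choice, and the model assumes ideal behaviour and counts all of the net-neutral form as the zwitterion.

```python
def species_fractions(pH, pKa1, pKa2):
    """Return (cation, zwitterion, anion) mole fractions at the given pH.

    Assumes a simple diprotic system with no ionisable side chain.
    """
    h = 10.0 ** -pH      # hydrogen ion concentration
    k1 = 10.0 ** -pKa1   # Ka of the -COOH group
    k2 = 10.0 ** -pKa2   # Ka of the -NH3+ group
    denom = h * h + k1 * h + k1 * k2
    return h * h / denom, k1 * h / denom, k1 * k2 / denom


# Alanine, using the pKa values quoted in the lesson
for pH in (1.0, 6.0, 12.0):
    cation, zwitterion, anion = species_fractions(pH, 2.34, 9.69)
    print(f"pH {pH:4.1f}: cation {cation:.3f}, "
          f"zwitterion {zwitterion:.3f}, anion {anion:.3f}")
```

Running it at pH 1, 6 and 12 should show the cation, zwitterion and anion dominating in turn, in line with the three rows of the table.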
The isoelectric point pI is the pH at which the net charge of the amino acid is zero — that is, the pH at which the concentration of the cationic form equals the concentration of the anionic form and the zwitterion is at maximum concentration.
For a neutral amino acid (no ionisable side chain — Gly, Ala, Val, Leu, Ile, Phe, etc.) the isoelectric point is simply the arithmetic mean of the two backbone pKa values:
pI = (pKa1 + pKa2) / 2
Glycine has pKa1 = 2.34 (–COOH) and pKa2 = 9.60 (–NH₃⁺).
pI = (2.34 + 9.60) / 2 = 5.97
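At pH 5.97, glycine exists almost entirely as the zwitterion. Below pH 5.97 the cation outnumbers the anion and the net charge is slightly positive; above pH 5.97 the anion outnumbers the cation and the net charge is slightly negative. The zwitterion nevertheless remains the major species until the pH approaches pKa1 or pKa2.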
For amino acids with an ionisable side chain, three pKa values exist and pI is the mean of the two pKa values flanking the zwitterion form.
The pattern: acidic amino acids have low pI (around 3); neutral amino acids have pI around 5-6; basic amino acids have high pI (around 10 for Lys and Arg, though His is lower at about 7.6). This is the foundation of separating proteins by isoelectric focusing (a technique used in proteomics, beyond the A-Level syllabus but a natural extension).
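The averaging rule can be sketched in a few lines of Python. The function name `isoelectric_point` and the `side_chain` labels are illustrative choices. The glycine values are those quoted in the lesson; the three pKa values used for aspartic acid and lysine are commonly tabulated figures supplied here as assumptions, not taken from the lesson.

```python
def isoelectric_point(pkas, side_chain="neutral"):
    """Estimate pI as the mean of the two pKa values flanking the net-neutral form.

    side_chain: "neutral" (two pKa values), "acidic" or "basic" (three pKa values).
    """
    ordered = sorted(pkas)
    if side_chain == "neutral":
        if len(ordered) != 2:
            raise ValueError("a neutral amino acid needs exactly two pKa values")
        low, high = ordered
    elif side_chain == "acidic":
        # The neutral form sits between the two acidic (lowest) pKa values
        low, high = ordered[0], ordered[1]
    elif side_chain == "basic":
        # The neutral form sits between the two basic (highest) pKa values
        low, high = ordered[-2], ordered[-1]
    else:
        raise ValueError("side_chain must be 'neutral', 'acidic' or 'basic'")
    return (low + high) / 2


print(f"Gly: {isoelectric_point([2.34, 9.60]):.2f}")
# Assumed literature pKa values for Asp and Lys (not from the lesson)
print(f"Asp: {isoelectric_point([1.88, 3.65, 9.60], 'acidic'):.3f}")
print(f"Lys: {isoelectric_point([2.18, 8.95, 10.53], 'basic'):.2f}")
```

With these inputs the results fall in the low, middle and high pI ranges described above.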
A titration of a neutral amino acid (such as glycine) starts at low pH with the fully protonated cation and ends at high pH with the fully deprotonated anion. The curve shows two buffering regions: one around pKa1 (where –COOH is half-deprotonated) and one around pKa2 (where –NH₃⁺ is half-deprotonated). Between them the pH rises steeply, and the midpoint of this steep section (the first equivalence point) is the isoelectric point: the pH at which net charge is zero.
Two amino acids condense by losing one water molecule between the –COOH of the first and the –NH₂ of the second:
H₂N–CH(R₁)–COOH + H₂N–CH(R₂)–COOH → H₂N–CH(R₁)–CO–NH–CH(R₂)–COOH + H₂O
The new –CO–NH– linkage is the peptide bond. Chemically, it is identical to the amide bond developed in lesson 2 of this course: a carbonyl carbon bonded to a nitrogen, with partial double-bond character distributed across the C–N bond through delocalisation of the nitrogen lone pair into the C=O π-system. The peptide bond is therefore planar (six atoms — Cα, C, O, N, H, Cα — lie in one plane) and rotation about the C–N axis is hindered. This planarity is essential for the regular geometry of secondary structures.
A chain of two amino acids is a dipeptide; three is a tripeptide; up to ~50 is an oligopeptide; longer than ~50 is a polypeptide or protein. Peptides are written from the N-terminus (free –NH₂, left) to the C-terminus (free –COOH, right). Order matters: Ala–Gly and Gly–Ala are different molecules.
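Because one water molecule is lost for every peptide bond formed, the relative molecular mass of a peptide can be estimated from the masses of its constituent amino acids. The sketch below shows that arithmetic. The function name `peptide_mr` and the rounded Mr values (Ala 89.0, Gly 75.0, water 18.0) are supplied for illustration and are not taken from the lesson.

```python
WATER_MR = 18.0  # rounded Mr of H2O (assumed value)


def peptide_mr(amino_acid_mrs):
    """Mr of a linear peptide: sum of the amino acid Mr values minus one water per peptide bond."""
    if not amino_acid_mrs:
        raise ValueError("at least one amino acid is required")
    peptide_bonds = len(amino_acid_mrs) - 1
    return sum(amino_acid_mrs) - peptide_bonds * WATER_MR


# Ala-Gly and Gly-Ala have the same Mr even though they are different molecules
print(peptide_mr([89.0, 75.0]))  # Ala-Gly
print(peptide_mr([75.0, 89.0]))  # Gly-Ala
```

The identical result for both orderings is a reminder that mass alone cannot distinguish sequence isomers.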
Practical note: Direct condensation of free amino acids in solution gives a mixture of all possible dipeptides plus higher oligomers and is useless synthetically. Real peptide synthesis uses protecting groups (Merrifield's solid-phase method, beyond A-Level). In cells, the ribosome assembles peptides with very high sequence fidelity using messenger RNA as the template.
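To see why uncontrolled condensation gives a mixture, the sketch below lists every dipeptide that could form from a given set of amino acids, written N-terminus first. The function name `possible_dipeptides` is an illustrative choice, and the sketch ignores the higher oligomers that would also form.

```python
from itertools import product


def possible_dipeptides(amino_acids):
    """List every ordered pair (N-terminus first) that could form by condensation."""
    return ["-".join(pair) for pair in product(amino_acids, repeat=2)]


print(possible_dipeptides(["Ala", "Gly"]))
# ['Ala-Ala', 'Ala-Gly', 'Gly-Ala', 'Gly-Gly']

# With all twenty proteinogenic amino acids there are 20 * 20 ordered pairs
print(20 ** 2)
```

Even two starting amino acids give four products, which is why practical syntheses control which groups are free to react.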
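Peptide bonds are kinetically stable in neutral solution at room temperature (the half-life is of the order of centuries), but hydrolyse under harsh acid or base. The standard procedure to break a protein down to its constituent amino acids is prolonged heating under reflux with hydrochloric acid (typically about 6 mol dm⁻³, for around 24 hours).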
This is the inverse of peptide-bond formation: water is added across the –CO–NH– bond, regenerating the –COOH of one residue and the –NH₂ of the next. The end products are a mixture of free amino acids — though tryptophan is destroyed under these conditions and asparagine/glutamine are converted to their parent acids (Asp/Glu). The amino acid mixture can then be identified by paper chromatography or thin-layer chromatography (developed with ninhydrin), or by electrophoresis at a chosen pH.
The primary structure of a protein is the sequence of amino acid residues along the polypeptide chain, read from the N-terminus to the C-terminus. The sequence is genetically encoded: a triplet of DNA bases (a codon) specifies each amino acid, and the ribosome translates the messenger RNA codon-by-codon to assemble the chain. Primary structure determines everything else — secondary, tertiary, and quaternary structure all fold spontaneously from the primary sequence under physiological conditions. The classic demonstration was Christian Anfinsen's experiment on ribonuclease (Nobel 1972), where denatured enzyme refolded to its native shape simply on removal of the denaturant.
Secondary structure describes the local folding of the polypeptide backbone into regular, repeating geometries stabilised by hydrogen bonds between backbone –C=O and –N–H groups (not between R-groups, which belong to tertiary structure). Two motifs dominate: the α-helix and the β-pleated sheet.
Both motifs are stabilised by many weak hydrogen bonds acting cooperatively: any one hydrogen bond is worth only ~20 kJ mol⁻¹, but a typical α-helix of 20 residues has 16 such bonds, and the cumulative effect is substantial. The principle — that secondary structure arises from regular, repetitive backbone hydrogen bonding — links directly to the broader treatment of hydrogen bonding in §3.1.3 lesson 4 of the AQA bonding course.
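The cumulative effect can be put into numbers with a short sketch. It assumes an idealised α-helix in which each backbone C=O bonds to the N–H four residues further along, so a helix of n residues has n − 4 hydrogen bonds (consistent with the 16 bonds quoted for 20 residues), and it uses the approximate 20 kJ mol⁻¹ figure from the text. The function name `helix_hbond_estimate` is illustrative.

```python
def helix_hbond_estimate(residues, energy_per_bond=20.0):
    """Return (number of backbone H-bonds, total energy in kJ/mol) for an idealised alpha-helix."""
    bonds = max(residues - 4, 0)  # i -> i+4 pattern leaves four residues unpaired
    return bonds, bonds * energy_per_bond


bonds, total = helix_hbond_estimate(20)
print(bonds, total)  # 16 bonds, about 320 kJ/mol in total
```

The total is only a rough estimate, since real hydrogen-bond strengths vary with geometry and environment.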
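Tertiary structure is the overall three-dimensional folded shape of a single polypeptide chain. Unlike secondary structure (which involves backbone-to-backbone hydrogen bonds), tertiary structure is determined by interactions between R-groups that may be far apart in the primary sequence. Four kinds of R-group interaction stabilise the tertiary fold: hydrogen bonds between polar side chains, ionic interactions (salt bridges) between oppositely charged side chains, hydrophobic interactions between non-polar side chains, and covalent disulfide bridges between cysteine residues.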
Tertiary structure is destroyed by denaturation: heat (breaks H-bonds and disrupts hydrophobic clustering), strong acid or base (protonates/deprotonates side chains and breaks salt bridges), heavy metal ions (bind to –SH groups), reducing agents (cleave disulfide bridges), and detergents (disrupt hydrophobic packing). Once denatured, most proteins lose their biological function.
Quaternary structure is the assembly of two or more polypeptide chains (subunits) into a functional multi-subunit protein. The subunits are held together by the same kinds of R-group interactions that stabilise tertiary structure — hydrogen bonds, ionic interactions, hydrophobic contacts, and occasionally disulfide bridges between chains. The classic example is haemoglobin: a tetramer of two α-chains and two β-chains, each carrying a haem prosthetic group and binding one O₂. Cooperative binding between the four subunits gives haemoglobin its characteristic sigmoidal oxygen-binding curve, central to oxygen delivery from lungs to tissues. Insulin, antibodies (IgG: two heavy and two light chains held together by disulfide bridges), and the photosynthetic reaction centre are all quaternary-structure assemblies.
Not all proteins have quaternary structure — many enzymes are monomeric (a single chain) and require no subunit assembly. Quaternary structure is therefore an optional level, present only in oligomeric proteins.
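Electrophoresis separates charged molecules by their migration in an electric field. A small spot of an amino acid mixture is applied to a buffered gel (typically agarose or polyacrylamide) saturated with buffer at a chosen pH. An electric field is applied across the gel: species with a net positive charge migrate towards the negative electrode (cathode), species with a net negative charge migrate towards the positive electrode (anode), and species with no net charge remain close to the origin.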
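If a buffer at pH 6 is used on a mixture of Asp (pI 2.77), Gly (pI 5.97) and Lys (pI 9.74), then at pH 6 Asp (pH above its pI) carries a net negative charge and moves towards the positive electrode, Gly (pH close to its pI) has almost no net charge and stays near the origin, and Lys (pH below its pI) carries a net positive charge and moves towards the negative electrode.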
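The reasoning for the pH 6 example can be sketched as a comparison of buffer pH with pI: a molecule is net negative when the pH is above its pI and net positive when the pH is below it. The function name `migration_direction` and the `tolerance` window (how close to pI counts as "no net charge") are hypothetical choices made for illustration.

```python
def migration_direction(pI, buffer_pH, tolerance=0.5):
    """Predict where an amino acid moves in an electric field at buffer_pH."""
    if abs(buffer_pH - pI) <= tolerance:
        return "stays near origin"           # net charge close to zero
    if buffer_pH > pI:
        return "towards positive electrode"  # net negative charge
    return "towards negative electrode"      # net positive charge


mixture = {"Asp": 2.77, "Gly": 5.97, "Lys": 9.74}  # pI values quoted in the lesson
for name, pI in mixture.items():
    print(name, migration_direction(pI, buffer_pH=6.0))
```

Changing `buffer_pH` shows how the choice of buffer controls which components separate.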
Spots are visualised by spraying with ninhydrin and heating: primary amines react with ninhydrin to give a purple-violet condensation product (Ruhemann's purple). Proline, with its secondary amine, gives a yellow product instead.
Practical-skills box — paper chromatography of amino acids. Spot the amino acid mixture and reference samples onto chromatography paper near the bottom edge. Develop in a polar solvent (butan-1-ol / ethanoic acid / water, typically 4:1:5 by volume) until the solvent front nears the top. Dry, spray with 0.2% ninhydrin in propan-2-ol, and warm in an oven at 100 °C for 5 minutes. Purple-violet spots appear at characteristic Rf values that can be compared with reference standards. The technique is mentioned by AQA as an example of a chromatographic separation; quantitative analysis of Rf values is more often examined in the spectroscopy lesson.
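An Rf value is the distance moved by a spot divided by the distance moved by the solvent front, both measured from the origin line. The sketch below shows the calculation. The function name `rf_value` and the example distances are hypothetical, not measured data.

```python
def rf_value(spot_distance, solvent_front_distance):
    """Rf = distance moved by the spot / distance moved by the solvent front."""
    if solvent_front_distance <= 0:
        raise ValueError("solvent front distance must be positive")
    if not 0 <= spot_distance <= solvent_front_distance:
        raise ValueError("a spot cannot travel further than the solvent front")
    return spot_distance / solvent_front_distance


# Hypothetical measurements in cm, for illustration only
print(round(rf_value(3.2, 8.0), 2))
```

Because Rf is a ratio it has no units, and it is only comparable between runs that use the same paper, solvent and temperature.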
Deoxyribonucleic acid (DNA) is the polymeric biomolecule that stores the genetic information of every cell. Its structure was deduced in 1953 by James Watson and Francis Crick at Cambridge, building on the X-ray fibre-diffraction patterns of Rosalind Franklin and Maurice Wilkins at King's College London. AQA require only that the names be recognised — not biographical detail — but the model is one of the great triumphs of structural chemistry.
A nucleotide has three parts: a phosphate group, a pentose sugar (2-deoxyribose in DNA), and a nitrogenous base.
The four bases are: adenine (A) and guanine (G) — two-ring purines — and thymine (T) and cytosine (C) — single-ring pyrimidines. (RNA, beyond this lesson, uses uracil in place of thymine.)
Nucleotides polymerise by phosphodiester bonds between the 3'-OH of one sugar and the 5'-phosphate of the next. The polymer therefore has a directional sugar-phosphate backbone with a 5'-end (free phosphate) and a 3'-end (free hydroxyl). The bases project sideways from the backbone.
Two antiparallel strands wind around a common axis to form a right-handed double helix with one full turn every ~3.4 nm and ten base pairs per turn. The backbones run on the outside; the bases stack on the inside. Antiparallel means one strand runs 5' → 3' while the partner runs 3' → 5'.
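The helix dimensions quoted above imply a rise of about 0.34 nm per base pair, which gives a quick estimate of the length of a stretch of double helix. The function name `helix_length_nm` is illustrative, and the sketch assumes the idealised figures from the text (3.4 nm and ten base pairs per turn).

```python
def helix_length_nm(base_pairs, turn_length_nm=3.4, bp_per_turn=10):
    """Approximate length of a double helix from its number of base pairs."""
    if base_pairs < 0:
        raise ValueError("base_pairs must be non-negative")
    turns = base_pairs / bp_per_turn
    return turns * turn_length_nm


print(helix_length_nm(100))  # 100 base pairs span ten turns, about 34 nm
```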
The two strands are held together by complementary base pairing:
| Base pair | H-bonds | Geometry |
|---|---|---|
| A = T | 2 hydrogen bonds | A donates one N–H to T's C=O; T donates one N–H to A's N |
| G ≡ C | 3 hydrogen bonds | G donates two N–H bonds (to C's C=O and C's N) and accepts one from C's N–H |
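The pairing rules in the table can be applied mechanically, as in the sketch below, which builds the complementary strand for a short sequence and counts the inter-strand hydrogen bonds. The function names are illustrative. Because the strands are antiparallel, the partner is reversed so that it is also written 5' → 3'.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}
HBONDS = {"A": 2, "T": 2, "G": 3, "C": 3}  # hydrogen bonds formed by each base in its pair


def complementary_strand(strand_5_to_3):
    """Return the partner strand, also written 5' -> 3' (hence the reversal)."""
    return "".join(PAIR[base] for base in reversed(strand_5_to_3.upper()))


def hydrogen_bonds(strand):
    """Total inter-strand hydrogen bonds when the strand is fully base-paired."""
    return sum(HBONDS[base] for base in strand.upper())


print(complementary_strand("ATGC"))  # GCAT
print(hydrogen_bonds("ATGC"))        # 10
```

A sequence richer in G and C forms more hydrogen bonds per base pair, which is one reason GC-rich DNA is harder to separate into single strands.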