Research Methods: Experimental Design

Research methods form the foundation of psychology as a science. To test hypotheses, establish cause-and-effect relationships, and generate reliable findings, psychologists must design studies with great care. The experimental method is the most powerful tool for establishing causation, because it is the only method in which the researcher actively manipulates one variable and measures its effect on another while controlling everything else. If a difference in the dependent variable appears when nothing but the independent variable has been changed, we can reasonably infer that the independent variable caused it. This is why the experiment occupies such a central place in the discipline: every other research method described in the next lesson can reveal relationships, but only the experiment can establish causes, and it is causal claims that allow psychology to test theories and underpin interventions such as therapies and educational programmes.

Key Definition: An experiment is a research method in which the researcher manipulates an independent variable (IV), measures a dependent variable (DV), and controls extraneous variables in order to establish a cause-and-effect relationship.

Spec Mapping

This lesson addresses the following points in AQA A-Level Psychology (7182), Section 4.2 (Research methods), assessed across all three papers:

Experimental method: laboratory, field, natural and quasi-experiments.
Aims; stating hypotheses (directional and non-directional); the null hypothesis.
Variables: manipulation and control of variables, the IV and DV, extraneous and confounding variables, operationalisation.
Experimental designs: independent groups, repeated measures and matched pairs.
Control of variables: random allocation, standardisation, counterbalancing, randomisation, demand characteristics and investigator effects.
Pilot studies and their aims.

Assessment objectives engaged: AO1 (definitions and features of each experiment type, design and control), AO2 (applying these to novel research scenarios — identifying the IV/DV, naming a design, writing an operationalised hypothesis), and AO3 (evaluating the strengths and weaknesses of each type and design). Research-methods questions are typically AO2/AO3-heavy, requiring you to apply and analyse rather than merely describe.

Key Terms in Experimental Design

Term	Definition
Independent variable (IV)	The variable deliberately manipulated by the researcher to observe its effect
Dependent variable (DV)	The variable measured by the researcher to determine the effect of the IV
Extraneous variable (EV)	Any variable, other than the IV, that could affect the DV if not controlled (a "nuisance" variable)
Confounding variable	An extraneous variable that has actually varied systematically with the IV, so it is impossible to know which one caused the change in the DV
Hypothesis	A testable, precise prediction about the relationship between the IV and DV
Operationalisation	Defining variables precisely in terms of how they will be manipulated or measured

Why the Experiment Establishes Cause and Effect

The unique power of the experiment comes from the combination of manipulation, measurement and control. By deliberately changing only the IV while holding everything else constant, the researcher creates a situation in which, logically, any resulting change in the DV can have only one source — the IV. This is what no other method achieves: an observation or correlation can show that two things go together, but only an experiment, by actively intervening, can show that one produces the other. The price of this causal power is artificiality, because the control required often means studying behaviour in conditions unlike everyday life — the central trade-off discussed in the evaluation below.

Operationalisation

To operationalise a variable is to state exactly how it will be manipulated (for the IV) or measured (for the DV). "Memory" is not measurable as it stands; "number of words correctly recalled from a 20-word list after a two-minute delay" is. Operationalisation is essential for replication: another researcher can only repeat a study, and check its findings, if the variables are defined in concrete, measurable terms. It also reduces researcher subjectivity, because a precisely defined measure leaves less room for the experimenter's judgement to creep in. A poorly operationalised variable threatens validity (we may not be measuring what we think) and reliability (different researchers may measure it differently), which is why examiners place such weight on it.

Aims and Hypotheses

An aim is a general statement of the purpose of the study (e.g. "to investigate the effect of sleep deprivation on memory").
A directional (one-tailed) hypothesis predicts the direction of the effect (e.g. "Participants who sleep for 8 hours will recall significantly more words than those who sleep for 4 hours"). Used when previous research suggests a particular direction.
A non-directional (two-tailed) hypothesis predicts that there will be a difference but does not specify the direction (e.g. "There will be a significant difference in word recall between participants who sleep for 8 hours and those who sleep for 4 hours"). Used when there is no previous research, or when previous findings are contradictory.
The null hypothesis states there is no significant difference (or relationship); any difference observed is due to chance. The null is the hypothesis we statistically test and either reject or retain.

It can seem odd that we test the null rather than the experimental hypothesis directly, but there is a logic to it. We can never prove an experimental hypothesis true — there might always be an exception we have not yet seen — but we can gather enough evidence to make the "no effect" explanation implausible. So the strategy is to assume the null (no effect) and ask: if that were true, how likely is the result we obtained? If that probability is very low (at or below the significance level), we reject the null and, by elimination, accept the experimental hypothesis. The directional or non-directional hypothesis is sometimes called the alternative hypothesis for this reason — it is the alternative to the null.
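The "assume the null, then ask how likely the result is" logic can be sketched in a few lines of Python. This is a minimal illustration using a permutation test, which is a swapped-in technique chosen because it mirrors the reasoning directly; it is not one of the named statistical tests on the specification. The recall scores, the function name `permutation_p_value` and the 0.05 significance level are all assumptions made for the example.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_shuffles=10_000, seed=0):
    """Estimate a two-tailed p-value for the difference in group means.

    Under the null hypothesis the group labels are arbitrary, so we
    reshuffle them many times and count how often a difference at least
    as large as the observed one arises by chance alone.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    return extreme / n_shuffles


# Hypothetical recall scores (words out of 20), invented for illustration only.
eight_hours = [14, 15, 13, 16, 15, 14, 17, 15]
four_hours = [10, 11, 9, 12, 10, 11, 10, 12]

ALPHA = 0.05  # assumed significance level
p = permutation_p_value(eight_hours, four_hours)
decision = "reject the null" if p <= ALPHA else "retain the null"
print(f"p = {p:.4f}: {decision}")
```

With these invented scores the two groups do not overlap at all, so a chance reshuffle almost never produces a difference as large as the observed one and the null is rejected.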

graph TD
    A[Does previous research<br/>predict a direction?] -->|Yes| B[Directional<br/>one-tailed hypothesis]
    A -->|No / contradictory| C[Non-directional<br/>two-tailed hypothesis]
    B --> D[Plus a NULL hypothesis<br/>no difference / due to chance]
    C --> D

Worked example: turning an aim into a hypothesis. Suppose the aim is "to investigate whether caffeine affects reaction time".

Vague (no marks): "Caffeine affects reaction time." — neither variable is operationalised.
Operationalised non-directional: "There will be a significant difference in mean reaction time (ms) on a simple visual reaction-time task between participants who consume 200 mg of caffeine and those who consume a placebo."
Operationalised directional: "Participants who consume 200 mg of caffeine will have a significantly faster mean reaction time (ms) on a simple visual reaction-time task than those who consume a placebo."
Null: "There will be no significant difference in mean reaction time (ms) on a simple visual reaction-time task between participants who consume 200 mg of caffeine and those who consume a placebo; any difference is due to chance."

Notice that each acceptable version specifies how the IV is manipulated (200 mg caffeine vs placebo) and how the DV is measured (mean reaction time in ms on a stated task), and that only the directional version contains a direction word ("faster").

Exam Tip: Always operationalise both the IV and the DV inside the hypothesis. A vague prediction such as "sleep affects memory" gains no marks. State the precise conditions of the IV and the exact measure of the DV. If asked for a directional hypothesis, you must include a direction word such as more, fewer, faster or higher; the null hypothesis should always include the idea of "no significant difference" and "due to chance".

Types of Experiment

The four types differ along two dimensions: where the study takes place (setting) and, crucially, who controls the IV (the researcher, or nature).

Laboratory Experiment

Feature	Description
Setting	Controlled laboratory environment
IV	Manipulated by the researcher
Control	High — extraneous variables can be tightly controlled

Strengths: High internal validity, because tight control means changes in the DV can confidently be attributed to the IV; highly replicable, because standardised procedures can be repeated to test reliability; precise, calibrated measurement.

Limitations: Low ecological validity — the artificial setting and tasks may not reflect everyday behaviour; demand characteristics — participants who know they are being studied may guess the aim and alter their behaviour; investigator effects — the researcher's expectations or manner may unconsciously bias responses.

Field Experiment

Feature	Description
Setting	Natural, real-world environment
IV	Still manipulated by the researcher
Control	Lower than lab — harder to control extraneous variables

Strengths: Higher ecological validity — behaviour is observed in its natural context; reduced demand characteristics, because participants are often unaware they are in a study.

Limitations: Lower internal validity — extraneous variables are harder to control; ethical concerns — covert manipulation makes informed consent impossible; harder to replicate precisely.

Natural Experiment

Feature	Description
Setting	Varies — lab or field
IV	NOT manipulated by the researcher — it occurs naturally (e.g. a natural disaster, a policy change, institutionalisation)
Control	Variable

Strengths: Allows the study of variables that cannot ethically or practically be manipulated (e.g. the effect of institutional deprivation on children — Rutter et al., 1998); high ecological validity in real-world settings.

Limitations: The researcher cannot infer causation with confidence, because the IV was not directly manipulated and confounds abound; random allocation to conditions is impossible; difficult to replicate because the natural event cannot be reproduced on demand.

Quasi-Experiment

Feature	Description
Setting	Varies
IV	A pre-existing characteristic of the participants (e.g. age, gender, a clinical diagnosis) — it cannot be assigned
Control	Variable

Strengths: Allows comparison of groups that differ on an inherent characteristic (e.g. people with vs without a schizophrenia diagnosis); often otherwise well-controlled.

Limitations: No random allocation, because participants already belong to their groups; cannot establish causation, as group differences may be due to confounds linked to the characteristic; participant variables are built in and cannot be eliminated.

Exam Tip: To classify an experiment in a scenario, ask "who controls the IV?" Researcher manipulates it in a lab → laboratory; researcher manipulates it in a natural setting → field; the IV is a naturally occurring event the researcher merely takes advantage of → natural; the IV is a pre-existing participant characteristic → quasi. Natural and quasi-experiments cannot demonstrate cause and effect with confidence, because the researcher does not manipulate the IV and cannot randomly allocate participants to conditions.

Worked Examples: Classifying Experiments and Identifying Variables

Applying the "who controls the IV?" question to familiar studies makes the distinctions concrete:

Study scenario	Type	IV	DV
Participants given lists of words to learn under loud vs quiet conditions in a soundproofed lab	Laboratory	Noise level (loud/quiet)	Number of words recalled
Researchers vary how a confederate is dressed (smart/scruffy) while asking strangers for directions in a street	Field	Confederate's dress	Whether help is given
Comparing the development of children adopted from Romanian orphanages at younger vs older ages	Natural	Age at adoption (naturally occurring)	Developmental/IQ scores
Comparing memory test scores of participants with vs without a dyslexia diagnosis	Quasi	Presence of dyslexia (pre-existing)	Memory test score

Notice that in the first two cases the researcher manipulates the IV, so causation can in principle be inferred; in the last two the IV is not manipulated, so only an association can be claimed. A common exam task is to take a scenario like these, identify and operationalise the IV and DV, and name the experiment type with justification — always tie your justification to who controls the IV and where the study takes place.
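The "who controls the IV?" decision rule can be written out as a small Python function. This is only a sketch of the rule as described in this lesson; the function names and the string labels (`"researcher"`, `"natural_event"`, `"participant_characteristic"`, `"lab"`, `"real_world"`) are hypothetical choices made for the example.

```python
def classify_experiment(iv_source: str, setting: str) -> str:
    """Classify an experiment from who controls the IV and where it is run.

    iv_source: "researcher", "natural_event" or "participant_characteristic"
    setting:   "lab" or "real_world" (only matters when the researcher
               manipulates the IV)
    """
    if iv_source == "researcher":
        return "laboratory" if setting == "lab" else "field"
    if iv_source == "natural_event":
        return "natural"
    if iv_source == "participant_characteristic":
        return "quasi"
    raise ValueError(f"unknown IV source: {iv_source!r}")


def iv_is_manipulated(experiment_type: str) -> bool:
    """True when the researcher manipulates the IV, so causation can in principle be inferred."""
    return experiment_type in {"laboratory", "field"}


# The four scenarios from the table above.
print(classify_experiment("researcher", "lab"))                  # noise study
print(classify_experiment("researcher", "real_world"))           # confederate's dress
print(classify_experiment("natural_event", "real_world"))        # adoption study
print(classify_experiment("participant_characteristic", "lab"))  # dyslexia study
```

Note that the setting is only consulted when the researcher manipulates the IV, which matches the point that natural and quasi-experiments can take place in either kind of setting.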

Experimental Designs

The experimental design determines how participants are allocated to the conditions of the IV.

Independent Groups Design (IGD)

Different participants are used in each condition.

Strengths	Limitations
No order effects — each participant does only one condition	Participant variables — individual differences between the groups may confound results
Demand characteristics reduced — participants see only one condition	Needs roughly twice as many participants for the same power

Control technique — random allocation: every participant has an equal chance of being placed in each condition (e.g. drawing names, using a random number generator). This does not eliminate participant variables but should distribute them evenly between conditions. With $n$ participants split equally between two conditions, the number of possible allocations is the binomial coefficient

$\binom{n}{n/2} = \frac{n!}{\left(\tfrac{n}{2}\right)!\,\left(\tfrac{n}{2}\right)!}$

so even a modest group yields a very large number of possible random splits, which is why random allocation is the standard defence against systematic bias.
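As a rough illustration, random allocation and the binomial count above can both be reproduced in Python. The participant labels, the function name `randomly_allocate` and the group size of 20 are assumptions made for the example, and the sketch assumes an even number of participants.

```python
import math
import random


def randomly_allocate(participants, seed=None):
    """Shuffle the participants and split them into two equal-sized conditions."""
    rng = random.Random(seed)
    shuffled = list(participants)  # copy so the original list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical participant labels P01 to P20.
participants = [f"P{i:02d}" for i in range(1, 21)]
condition_a, condition_b = randomly_allocate(participants, seed=42)

print(len(condition_a), len(condition_b))  # 10 10
print(math.comb(20, 10))                   # 184756 possible ways to choose condition A
```

So with only 20 participants there are already 184,756 ways of choosing who goes into the first condition.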

Repeated Measures Design (RMD)

The same participants take part in all conditions.

Strengths	Limitations
Eliminates participant variables — each person acts as their own control	Order effects — performance may improve (practice) or worsen (fatigue/boredom) across conditions
Needs fewer participants	Demand characteristics — doing both conditions makes the aim easier to guess

Control technique — counterbalancing: half the participants complete condition A then B, the other half B then A (an "ABBA" arrangement), so order effects are balanced across conditions rather than acting on one condition only. For an experiment with $k$ conditions, the number of possible orders is

$k! = k \times (k-1) \times (k-2) \times \cdots \times 2 \times 1$

so two conditions have $2! = 2$ orders, three conditions have $3! = 6$ orders, and four conditions have $4! = 24$ — which is why full counterbalancing becomes impractical as the number of conditions grows.
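A short Python sketch shows how the possible orders can be listed and shared out among participants. The function name `assign_orders` and the participant labels are hypothetical; the sketch simply cycles through every order in turn, so the orders are used equally often only when the number of participants is a multiple of $k!$.

```python
from itertools import cycle, permutations
from math import factorial


def assign_orders(participants, conditions):
    """Give each participant one of the k! possible condition orders, in rotation."""
    orders = list(permutations(conditions))
    return dict(zip(participants, cycle(orders)))


# Two conditions: half the participants do A then B, the other half B then A.
schedule = assign_orders(["P1", "P2", "P3", "P4"], ["A", "B"])
for participant, order in schedule.items():
    print(participant, " then ".join(order))

# Four conditions already need 24 different orders.
print(len(list(permutations("ABCD"))), factorial(4))  # 24 24
```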

Order effects in detail. When the same participants do every condition, doing one condition can change performance on the next, in two opposing ways. A practice effect improves later performance (participants become familiar with the task, more relaxed, or better at it), while a fatigue (or boredom) effect worsens it (participants tire or lose motivation). Either way, the order in which conditions are completed becomes confounded with the IV. Counterbalancing does not remove order effects — they still occur — but it balances them so that any practice or fatigue advantage is shared equally across both conditions, cancelling out at the group level. Where even this is unsatisfactory (for example, if simply repeating the task is impossible), a matched pairs or independent groups design avoids order effects altogether.
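The claim that counterbalancing balances order effects rather than removing them can be checked with a small worked example in Python. All of the numbers here are invented: a practice bonus of 2 points for whichever task is done second, a true advantage of 3 points for condition B, and four hypothetical baseline scores.

```python
from statistics import mean

PRACTICE_BONUS = 2    # hypothetical gain from doing a task second
TRUE_EFFECT_OF_B = 3  # hypothetical real effect of the IV


def score(baseline, condition, position):
    """Score for one participant in one condition at a given position (1 or 2)."""
    total = baseline
    if condition == "B":
        total += TRUE_EFFECT_OF_B
    if position == 2:
        total += PRACTICE_BONUS
    return total


def mean_difference(baselines, orders):
    """Mean of condition B minus mean of condition A across all participants."""
    a_scores, b_scores = [], []
    for baseline, order in zip(baselines, orders):
        for position, condition in enumerate(order, start=1):
            result = score(baseline, condition, position)
            if condition == "A":
                a_scores.append(result)
            else:
                b_scores.append(result)
    return mean(b_scores) - mean(a_scores)


baselines = [10, 12, 11, 13]  # hypothetical participants
everyone_a_then_b = [("A", "B")] * 4
counterbalanced = [("A", "B"), ("B", "A"), ("A", "B"), ("B", "A")]

print(mean_difference(baselines, everyone_a_then_b))  # 5.0 (true effect inflated by practice)
print(mean_difference(baselines, counterbalanced))    # 3.0 (practice shared equally, true effect recovered)
```

Every participant still benefits from practice on their second task, but once half of them meet each condition second, the bonus is added equally to both condition means and drops out of the comparison.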

Matched Pairs Design (MPD)

Different participants are used in each condition, but they are paired on key variables (e.g. age, IQ, sex) likely to affect the DV.

Strengths	Limitations
Reduces participant variables: each pair is matched on characteristics likely to affect the DV	Time-consuming and costly: participants must be pre-tested and matched
No order effects — each person does one condition	Impossible to match on every variable — unmatched differences may still confound
Fewer demand characteristics than RMD	Needs a large pool of participants to find good matches
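One common way of forming matched pairs is to rank participants on the matching variable, pair off neighbours, and then randomly allocate one member of each pair to each condition. The Python sketch below assumes a single matching variable and uses invented IQ scores; the function name `matched_pairs` and the dictionary keys are hypothetical, and with an odd number of participants the last person is simply left unpaired.

```python
import random


def matched_pairs(participants, key, seed=None):
    """Pair participants with adjacent scores on `key`, then split each pair at random."""
    rng = random.Random(seed)
    ranked = sorted(participants, key=lambda person: person[key])
    condition_a, condition_b = [], []
    for i in range(0, len(ranked) - 1, 2):
        pair = [ranked[i], ranked[i + 1]]
        rng.shuffle(pair)  # random allocation within the pair
        condition_a.append(pair[0])
        condition_b.append(pair[1])
    return condition_a, condition_b


# Hypothetical pre-test IQ scores, invented for illustration only.
pool = [
    {"id": "P1", "iq": 95}, {"id": "P2", "iq": 120}, {"id": "P3", "iq": 101},
    {"id": "P4", "iq": 118}, {"id": "P5", "iq": 97}, {"id": "P6", "iq": 103},
]
group_a, group_b = matched_pairs(pool, key="iq", seed=1)
for a, b in zip(group_a, group_b):
    print(a["id"], a["iq"], "matched with", b["id"], b["iq"])
```

Matching on several variables at once is harder, which is the practical limitation noted in the table: the more variables are matched, the larger the pool needed to find close pairs.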
