Reliability and Validity

Two of the most important concepts in research methods are reliability and validity. These determine the quality of a study and whether its findings can be trusted.

Reliability

Reliability refers to the consistency of a measurement or finding. A reliable study produces the same results when repeated under the same conditions. If a study is not reliable, its findings cannot be trusted.

Types of Reliability

Type	Description	How to Assess
Test-retest reliability	The same test produces consistent results when given to the same people on different occasions	Give the test twice to the same participants and compare results
Inter-rater reliability	Different observers or raters produce consistent results when measuring the same behaviour	Two or more observers independently record behaviour and their records are compared
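Both checks in the table come down to comparing two sets of records. The sketch below is one way to put a number on that comparison, using invented data. It uses Pearson's correlation for test-retest and Cohen's kappa for inter-rater agreement. These are common choices but are assumptions here, since the notes above only say that the results "are compared". The function names and all scores are hypothetical.

```python
from collections import Counter
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def cohens_kappa(rater_a, rater_b):
    """Agreement between two raters, corrected for agreement expected by chance."""
    n = len(rater_a)
    # Proportion of items on which the two raters gave the same category.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected if each rater assigned categories at their own base rates.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical data: five participants sit the same test on two occasions.
first_sitting = [12, 15, 9, 20, 17]
second_sitting = [13, 14, 10, 19, 18]
test_retest_r = pearson_r(first_sitting, second_sitting)

# Hypothetical data: two observers independently code the same eight behaviours.
observer_1 = ["play", "play", "fight", "play", "idle", "fight", "idle", "play"]
observer_2 = ["play", "play", "fight", "idle", "idle", "fight", "idle", "play"]
kappa = cohens_kappa(observer_1, observer_2)

print(f"test-retest r = {test_retest_r:.2f}")
print(f"inter-rater kappa = {kappa:.2f}")
```

A value close to 1 on either statistic suggests consistent measurement. Values near 0 suggest that the two sittings, or the two observers, are not telling the same story.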

Improving Reliability

Standardise procedures — ensure the study is conducted in exactly the same way each time
Operationalise variables clearly — so that measurements are objective and unambiguous
Train observers — in observational studies, ensure all observers use the same criteria
Use structured formats — structured interviews and questionnaires with closed questions are more reliable than unstructured methods

Validity

Validity refers to whether a study measures what it claims to measure and whether the findings are genuine and meaningful. A study can be reliable but not valid. For example, a bathroom scale that consistently reads 5 kg too heavy is reliable (it gives the same reading every time) but not valid (the reading is wrong).
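The bathroom scale example can be made concrete with a few lines of arithmetic. This is a minimal sketch with invented readings. The true weight and the readings are assumptions chosen for illustration, not data from any study.

```python
from statistics import mean, stdev

TRUE_WEIGHT_KG = 70.0  # assumed true weight of the person being weighed

# Hypothetical repeated readings from a scale that reads about 5 kg too heavy.
readings_kg = [75.1, 74.9, 75.0, 75.0, 75.1, 74.9]

# Small spread between readings: the scale is consistent (reliable).
spread_kg = stdev(readings_kg)

# Large average error: the scale is not measuring the true weight (not valid).
bias_kg = mean(readings_kg) - TRUE_WEIGHT_KG

print(f"spread between readings: {spread_kg:.2f} kg")
print(f"average error: {bias_kg:+.2f} kg")
```

The spread describes reliability and the average error describes validity. The two are calculated separately, which is why a measure can score well on one and badly on the other.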

Types of Validity

Type	Description	Example
Internal validity	The extent to which the observed effect is genuinely due to the independent variable (IV) rather than to other factors: did the IV really cause the change in the dependent variable (DV)?	Were confounding variables controlled?
External validity	The extent to which findings can be generalised beyond the specific study
— Ecological validity	Can findings be generalised to real-life settings?	Lab experiments may lack ecological validity
— Population validity	Can findings be generalised to other groups of people?	Studies using only psychology students may not generalise to the wider population
Face validity	Does the measurement appear to measure what it claims?	Does an intelligence test look like it tests intelligence?

Threats to Validity

Threat	Description
Demand characteristics	Participants guess the purpose of the study and change their behaviour accordingly
Social desirability bias	Participants give answers they think are socially acceptable rather than truthful
Observer bias	The researcher's expectations influence how they interpret behaviour
Confounding variables	Uncontrolled variables that provide alternative explanations for the results
Experimenter effects	The researcher's behaviour unintentionally influences participants

Improving Validity

Control extraneous variables — reduces alternative explanations for results
Use double-blind procedures — neither researcher nor participant knows the condition
Conduct research in natural settings — increases ecological validity
Use diverse samples — increases population validity
Use objective measures — reduces observer bias
Ensure anonymity — reduces social desirability bias
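As an illustration of the double-blind idea in the list above, the sketch below allocates participants at random to conditions hidden behind neutral codes. The function name `blind_allocation`, the participant IDs, and the drug/placebo conditions are all hypothetical. Real studies follow their own allocation protocols, so treat this as a rough model only.

```python
import random


def blind_allocation(participant_ids, conditions, seed=None):
    """Randomly allocate participants to conditions behind opaque codes.

    Returns (blinded, key):
      blinded maps participant -> neutral code (all the researcher sees)
      key maps neutral code -> real condition (held by a third party)
    """
    rng = random.Random(seed)

    # Attach a neutral code ("A", "B", ...) to each condition in random order.
    codes = [chr(ord("A") + i) for i in range(len(conditions))]
    shuffled_conditions = list(conditions)
    rng.shuffle(shuffled_conditions)
    key = dict(zip(codes, shuffled_conditions))

    # Shuffle participants, then deal them out to the codes in turn so that
    # group sizes differ by at most one.
    order = list(participant_ids)
    rng.shuffle(order)
    blinded = {pid: codes[i % len(codes)] for i, pid in enumerate(order)}
    return blinded, key


# Hypothetical study: six participants, drug versus placebo.
blinded, key = blind_allocation(
    ["P1", "P2", "P3", "P4", "P5", "P6"], ["drug", "placebo"], seed=1
)
print(blinded)  # codes only: neither researcher nor participant sees the condition
```

The `key` is kept away from the researcher running the sessions and is only used to decode the groups once data collection is finished. This is what limits experimenter effects and demand characteristics.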

The Relationship Between Reliability and Validity

	Reliable	Not Reliable
Valid	Ideal — consistent AND measures what it claims	Impossible — cannot be valid if not reliable
Not Valid	Possible — consistently wrong	Poor quality research

Exam Tip: Remember that a study must be reliable before it can be valid. A measurement that gives different results each time (unreliable) cannot possibly be measuring what it claims to (valid). However, a study can be reliable but not valid — like the broken scale that is consistently wrong.

quadrantChart
    title Reliability vs Validity
    x-axis Low Reliability --> High Reliability
    y-axis Low Validity --> High Validity
    quadrant-1 Reliable and Valid (ideal)
    quadrant-2 Valid but inconsistent (not possible)
    quadrant-3 Neither (poor research)
    quadrant-4 Reliable but inaccurate
    Ideal study: [0.85, 0.85]
    Broken scale: [0.85, 0.2]
    Scattered darts: [0.2, 0.2]
    Biased IQ test: [0.8, 0.25]

Peer Review

Peer review is the process by which research is evaluated by other experts in the same field before it is published in a scientific journal.

The Process

The researcher submits their paper to a journal
The editor sends it to independent experts (peers) for evaluation
The peers assess the methodology, analysis, conclusions, and ethical standards of the research
The peers recommend that the paper be accepted, revised, or rejected
The researcher may need to make changes before publication

Why Peer Review Matters

Helps ensure published research meets high standards of quality
Identifies errors, flaws, or biases in methodology or analysis
Helps prevent fraudulent or misleading research from being published
Maintains public trust in scientific findings

Limitations of Peer Review

Reviewers may have bias — they may be more favourable to research that supports their own views
The process can be slow — delaying the publication of important findings
Reviewers may not detect all errors or fraud
There is a publication bias — journals tend to publish positive findings, meaning studies with null results may not be published

Key Points

Reliability = consistency of measurement; validity = accuracy of measurement.
Types of reliability: test-retest, inter-rater.
Types of validity: internal, external (ecological and population), face.
Threats to validity include demand characteristics, social desirability, and confounding variables.
A study must be reliable to be valid, but reliability does not guarantee validity.
Peer review helps maintain research quality but has limitations.

Why Reliability and Validity Matter

Reliability and validity are the twin tests of research quality. A study that is both reliable and valid provides findings that can be trusted and built upon. A study that fails either test provides results that must be treated with caution — and a study that fails both has little value at all.

For this reason, AQA examiners frequently ask students to evaluate studies in terms of reliability and validity. Use the two concepts as a scaffold for your evaluations: ask whether the methods used would produce consistent results if repeated (reliability) and whether the measurements truly capture what they claim to capture (validity). Identify specific threats (e.g. demand characteristics, observer bias) and explain how the researcher could address them. This structured approach is the signature of top-grade answers.
