Not all research in psychology uses the experimental method. Many important questions cannot be answered through experiments: the variables cannot ethically or practically be manipulated, the behaviour of interest occurs in natural settings, or the researcher wants depth and meaning rather than a single measured outcome. Non-experimental approaches include observations, self-report methods (questionnaires and interviews), correlations, content and thematic analysis, and case studies. Each has distinctive strengths and limitations, and each must be evaluated for reliability and validity and conducted within the BPS ethical framework. This lesson also addresses sampling and what it means for psychology to be a science.
Key Definition: Non-experimental methods are research approaches that do not involve the direct manipulation of an independent variable. They describe, measure, and explore relationships between variables without, in themselves, establishing causation.
This lesson addresses the following points in AQA A-Level Psychology (7182), Section 4.2 (Research methods):
Assessment objectives engaged: AO1 (definitions and features), AO2 (applying methods, sampling and ethics to novel scenarios) and AO3 (evaluating reliability, validity and ethics). Exam questions on this content are typically AO2/AO3-heavy.
Observational research involves watching and recording behaviour as it occurs. The categories below are not mutually exclusive — a single study could be, for example, a covert, non-participant, naturalistic, structured observation.
```mermaid
graph TD
    A[Observational techniques] --> B[Setting]
    A --> C[Awareness]
    A --> D[Researcher role]
    A --> E[Structure]
    B --> B1[Naturalistic]
    B --> B2[Controlled]
    C --> C1[Covert]
    C --> C2[Overt]
    D --> D1[Participant]
    D --> D2[Non-participant]
    E --> E1[Structured: behavioural categories]
    E --> E2[Unstructured]
```
| Type | Description | Strength | Limitation |
|---|---|---|---|
| Naturalistic | Behaviour observed in its natural setting, no intervention | High ecological validity; genuine behaviour | Low control; extraneous variables; hard to replicate |
| Controlled | Behaviour observed in a structured, set-up environment | Greater control; replicable | Lower ecological validity; behaviour may be artificial |
| Covert | Participants unaware they are observed | Reduces demand characteristics | No informed consent; privacy concerns |
| Overt | Participants know they are observed | Ethically preferable; consent possible | Behaviour may change (the Hawthorne effect) |
| Participant | Researcher joins the group | Rich, insider data | Loss of objectivity; "going native"; ethical issues |
| Non-participant | Researcher observes from outside | More objective; less influence on the group | May miss subtle, context-dependent behaviour |
| Structured | Uses pre-set behavioural categories and sampling | Systematic, quantitative; better inter-observer reliability | May miss unanticipated behaviours |
| Unstructured | Records all relevant behaviour, no framework | Captures richness and complexity | Hard to analyse; observer bias; lower reliability |
To make observation systematic, researchers operationalise behaviour into behavioural categories: a checklist of clearly defined, observable actions (e.g. for "aggression": hits, kicks, pushes, shouts). Good categories are objective (they require no inference: "hits another child" rather than "is being mean"), mutually exclusive (a single act falls into only one category), and exhaustive enough to capture the behaviour of interest. Researchers then decide how to sample behaviour over time, typically by event sampling (recording every occurrence of a target behaviour) or time sampling (recording behaviour at fixed intervals).
For example, an observer studying playground aggression might use event sampling during a 20-minute break, tallying each instance of hitting, kicking, pushing or shouting against the agreed categories — a concrete procedure that also makes inter-observer reliability checkable.
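The tallying procedure in this example can be sketched in a few lines of Python. This is only an illustration: the category names come from the aggression example, while the function name `tally_events` and the observed record are invented for the sketch.

```python
from collections import Counter

# Behavioural categories from the playground aggression example.
CATEGORIES = ("hits", "kicks", "pushes", "shouts")

def tally_events(events):
    """Event sampling: count every observed act against the agreed categories."""
    tally = Counter({category: 0 for category in CATEGORIES})
    for act in events:
        if act not in CATEGORIES:
            # An act outside the checklist signals that the categories need revising.
            raise ValueError(f"'{act}' is not an agreed behavioural category")
        tally[act] += 1
    return dict(tally)

# Invented record of one 20-minute break, in the order the acts occurred.
observed = ["pushes", "shouts", "hits", "shouts", "pushes", "shouts", "kicks"]
print(tally_events(observed))  # {'hits': 1, 'kicks': 1, 'pushes': 2, 'shouts': 3}
```

Rejecting any act that is not on the checklist mirrors the requirement that categories be agreed and clearly defined before observation begins.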
The categories of observation interact in practice. A study might, for instance, be covert and participant and naturalistic — as in a researcher who secretly joins a religious group to observe it in its everyday setting. Each choice carries its own trade-off: covert observation gains natural behaviour but loses informed consent; participant observation gains insider depth but risks the researcher "going native" and losing objectivity; naturalistic observation gains ecological validity but loses control. Strong evaluation discusses these combined implications rather than treating each label in isolation.
When two or more observers record the same behaviour, they should agree. Inter-observer reliability is assessed by comparing their records. A simple percentage-agreement measure is
$$\text{Agreement \%} = \frac{\text{number of agreements}}{\text{total number of observations}} \times 100$$
A figure of 80% or above is generally considered acceptable. More precisely, agreement can be expressed as a correlation between the two observers' tallies, with a coefficient of $r \geq +0.80$ taken as good reliability. If agreement is low, behavioural categories must be redefined and observers retrained.
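Both checks can be sketched in Python as follows. The observer records and tallies are invented for illustration, and the function names `percentage_agreement` and `pearson_r` are hypothetical helpers rather than part of any standard procedure.

```python
from math import sqrt

def percentage_agreement(codes_a, codes_b):
    """Percentage of observations on which two observers recorded the same category."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("Observers must code the same, non-empty set of observations")
    agreements = sum(a == b for a, b in zip(codes_a, codes_b))
    return 100 * agreements / len(codes_a)

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Invented records: the category each observer assigned to the same ten acts.
observer_a = ["hit", "push", "shout", "push", "kick", "shout", "hit", "push", "shout", "kick"]
observer_b = ["hit", "push", "shout", "push", "kick", "shout", "hit", "shout", "shout", "kick"]
print(percentage_agreement(observer_a, observer_b))  # 90.0

# Invented tallies per category (hits, kicks, pushes, shouts) for a whole session.
tallies_a = [4, 2, 7, 5]
tallies_b = [5, 2, 6, 5]
print(round(pearson_r(tallies_a, tallies_b), 2))  # 0.92
```

On these made-up figures both measures clear the 80% and +0.80 thresholds, so the categories would be judged reliable enough to use.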
Exam Tip: When evaluating any observation, discuss inter-observer reliability and how to improve it — operationalise categories clearly, train observers together, and use video so records can be re-checked.
Self-report involves asking participants to report their own thoughts, feelings, attitudes or behaviour, via questionnaires or interviews.
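A questionnaire is a pre-set list of written items. Items may be closed (fixed response options, producing quantitative data) or open (respondents answer in their own words, producing qualitative data).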
The choice of question type is therefore a trade-off between breadth and standardisation (closed) and depth and authenticity (open), and many questionnaires deliberately mix the two: closed items for the variables of central interest, plus a few open items to capture nuance.
| Strengths | Limitations |
|---|---|
| Reach large samples quickly and cheaply | Social desirability bias — answers people think are acceptable rather than true |
| Easy to replicate (standardised format) | Acquiescence bias — tendency to agree regardless of content |
| Closed-question quantitative data allow statistical analysis | Respondents may rush or misinterpret items with no one to clarify |
| Anonymity may improve honesty | Closed questions restrict depth |
| Type | Description | Strength | Limitation |
|---|---|---|---|
| Structured | Fixed questions in a set order | Standardised; replicable; easy to compare | Inflexible; may miss important points |
| Unstructured | Conversation develops freely | Rich, detailed, flexible | Hard to replicate/analyse; interviewer bias |
| Semi-structured | Set questions plus follow-ups | Balances structure and flexibility | Some loss of comparability; skill-dependent |
Key Definition: Social desirability bias occurs when participants give answers that present them in a favourable or socially acceptable light, rather than truthful ones.
The AQA specification expects you to be able to design self-report materials, so it is worth knowing what separates a good item from a poor one. Good questionnaire and interview questions are:
A Likert scale (e.g. strongly agree → strongly disagree) is the most common closed format; note that the data it yields are ordinal, because the psychological distance between "agree" and "strongly agree" is not guaranteed to equal that between "neutral" and "agree".
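One practical consequence of ordinal data is that the median or mode is usually a safer summary than the mean, since the mean assumes equal intervals between scale points. A minimal sketch, assuming a five-point scale coded 1 to 5; the coding, the function name `summarise_item` and the responses are all invented for illustration.

```python
from statistics import median, mode

# Assumed coding of a five-point Likert item (the numbers are ranks, not equal intervals).
LIKERT_CODES = {
    "strongly disagree": 1,
    "disagree": 2,
    "neutral": 3,
    "agree": 4,
    "strongly agree": 5,
}

def summarise_item(responses):
    """Summarise one Likert item with statistics suited to ordinal data."""
    codes = [LIKERT_CODES[response] for response in responses]
    return {"median": median(codes), "mode": mode(codes)}

# Invented responses from seven participants to a single item.
responses = ["agree", "strongly agree", "neutral", "agree",
             "disagree", "agree", "strongly agree"]
print(summarise_item(responses))  # {'median': 4, 'mode': 4}
```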
Exam Tip: Good self-report items are clear (no jargon, no double-barrelled questions), unbiased (no leading questions), and use filler questions to disguise the aim. When evaluating, weigh social desirability bias, the data type (quantitative vs qualitative), depth, and whether a researcher is present.
A correlation measures the strength and direction of the relationship between two co-variables. It is a technique of analysis, not strictly a method — the data can be gathered by questionnaire, observation or archival record.
Key Definition: A correlation is a measure of the relationship between two co-variables. A positive correlation means both rise together; a negative correlation means one rises as the other falls; a zero correlation means there is no systematic relationship.
| Type | Description | Scattergram pattern |
|---|---|---|
| Positive | Both co-variables increase together (e.g. study hours and exam marks) | Points slope up, left to right |
| Negative | One rises as the other falls (e.g. stress and immune function) | Points slope down, left to right |
| Zero | No systematic relationship | Points scattered randomly |
Strength is expressed as a correlation coefficient, a number on the scale
$$-1.0 \leq r \leq +1.0$$
where $+1.0$ is a perfect positive relationship, $0$ is no relationship, and $-1.0$ is a perfect negative relationship. The closer to $\pm 1$, the stronger the association: roughly, $|r| > 0.7$ is strong, $0.3$ to $0.7$ moderate, and $|r| < 0.3$ weak. Crucially, a negative coefficient is not a weak one: $-0.85$ is a strong (negative) relationship.
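The rule of thumb can be written as a small Python function that reads direction from the sign and strength from the absolute value. The function name `describe_correlation` is hypothetical, and the cut-offs are the approximate ones given above, not fixed standards.

```python
def describe_correlation(r):
    """Return (direction, strength) for a correlation coefficient r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("A correlation coefficient must lie between -1.0 and +1.0")
    if r == 0:
        return ("zero", "none")
    direction = "positive" if r > 0 else "negative"
    size = abs(r)  # strength depends on distance from zero, not on the sign
    if size > 0.7:
        strength = "strong"
    elif size >= 0.3:
        strength = "moderate"
    else:
        strength = "weak"
    return (direction, strength)

print(describe_correlation(-0.85))  # ('negative', 'strong')
print(describe_correlation(0.45))   # ('positive', 'moderate')
print(describe_correlation(0.10))   # ('positive', 'weak')
```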
| Strengths | Limitations |
|---|---|
| Identify patterns and the strength/direction of a relationship | Cannot establish causation — it does not show one variable changes the other |
| Useful for generating hypotheses to test experimentally | Third-variable problem — an unseen variable may drive both co-variables |
| Can analyse variables that cannot be ethically manipulated | Often misinterpreted as proving causation |
| Quick and inexpensive when secondary data are used | Detects only linear trends; may miss curvilinear relationships |
It is essential to be clear about why correlations cannot establish cause and effect, because exams test it directly. There are two distinct reasons. First, the third-variable problem: a third, unmeasured variable may be driving both co-variables (the classic illustration is that ice-cream sales correlate with drowning, but temperature raises both). Second, even where the two co-variables are genuinely related, a correlation cannot tell us the direction of causality — if stress and poor sleep correlate, does stress disrupt sleep, or does poor sleep cause stress, or both? Only an experiment, which manipulates one variable and controls the rest, can resolve these. This is exactly why correlational findings are so often used to generate hypotheses that are then tested experimentally: the correlation flags a relationship worth investigating, and the experiment establishes whether it is causal.
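The third-variable problem can be illustrated with a small simulation. The sketch below assumes invented data in which temperature drives both ice-cream sales and drownings, and it uses a partial correlation to hold temperature statistically constant. Partial correlation is brought in here purely for illustration; it goes beyond the techniques named in this lesson, and all numbers and function names are made up.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def partial_r(r_xy, r_xz, r_yz):
    """Correlation between x and y once the third variable z is held constant."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Simulated days: temperature influences both co-variables, which never affect each other.
rng = random.Random(42)  # fixed seed so the sketch is repeatable
temperature = [rng.gauss(20, 5) for _ in range(2000)]
ice_cream = [2.0 * t + rng.gauss(0, 5) for t in temperature]
drownings = [0.5 * t + rng.gauss(0, 2) for t in temperature]

r_raw = pearson_r(ice_cream, drownings)
r_controlled = partial_r(
    r_raw,
    pearson_r(ice_cream, temperature),
    pearson_r(drownings, temperature),
)
print(f"raw r = {r_raw:.2f}, r with temperature controlled = {r_controlled:.2f}")
```

In data built this way, the raw coefficient should be clearly positive while the controlled one should sit close to zero, which is what a relationship produced entirely by a third variable looks like.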
It is also worth noting that a correlation coefficient near zero indicates only the absence of a linear relationship; a strong curvilinear relationship (such as the inverted-U linking arousal and performance in the Yerkes–Dodson law) could still exist and would be missed by a simple correlation, which is a further limitation of the technique.
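A short Python sketch makes the point concrete. The arousal and performance scores are invented to follow a perfect inverted-U, and `pearson_r` is a hypothetical helper written out so the example is self-contained.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Invented scores: performance peaks at moderate arousal (5) and falls away on both sides.
arousal = list(range(1, 10))
performance = [16 - (a - 5) ** 2 for a in arousal]

r = pearson_r(arousal, performance)
print(performance)
print(abs(r) < 1e-9)  # True: the coefficient is effectively zero
```

Performance is completely determined by arousal in this made-up data set, yet the coefficient is effectively zero because the rising and falling halves of the curve cancel out. A scattergram would reveal the pattern that the coefficient hides.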
Exam Tip: "Correlation does not equal causation" must appear in every correlation answer, illustrated by the third-variable problem — e.g. ice-cream sales correlate with drowning, but the third variable is temperature (hot weather raises both). For the top band, add the direction-of-causality problem and the point that correlations detect only linear relationships.
Content analysis and thematic analysis turn qualitative material (interview transcripts, diaries, media) into analysable data.
A worked illustration of content analysis: to study gender stereotyping in advertising, a researcher might code a sample of magazine adverts for whether the central figure is shown in a "domestic" or "professional" role, then count the frequencies for male and female figures. The qualitative source (the adverts) is thereby converted into quantitative data that can be compared and even tested statistically — and, because the coding scheme is explicit, a second coder can check inter-rater reliability.
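The counting step in this illustration might look like the following in Python. The codings are invented, and the names `GENDERS`, `ROLES` and `frequency_table` are hypothetical; a real study would code an actual sample of adverts against a scheme agreed in advance.

```python
from collections import Counter

GENDERS = ("female", "male")
ROLES = ("domestic", "professional")

def frequency_table(codings):
    """Count how often each gender of central figure appears in each coded role."""
    counts = Counter(codings)
    return {
        gender: {role: counts[(gender, role)] for role in ROLES}
        for gender in GENDERS
    }

# Invented codings, one (gender, role) pair per advert in the sample.
coded_adverts = (
    [("female", "domestic")] * 4
    + [("female", "professional")] * 2
    + [("male", "domestic")] * 1
    + [("male", "professional")] * 5
)

for gender, row in frequency_table(coded_adverts).items():
    print(gender, row)
```

The resulting table of frequencies is the quantitative data that can then be compared across groups or tested statistically.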
| Strengths | Limitations |
|---|---|
| High ecological validity — uses real communication | Researcher bias — coding/themes reflect the analyst's interpretation |
| Easily replicable when coding units are explicit (content analysis) | Material is decontextualised, so meaning may be lost |
| Can study sensitive topics without direct contact with participants | Thematic analysis is hard to replicate and is subjective |
A case study is a detailed, in-depth investigation of a single individual, group, institution or event, usually drawing on several methods (interviews, observation, tests, records) and producing rich, mainly qualitative data over time.
Key Definition: A case study is an intensive, detailed investigation of a single case, typically using multiple methods over an extended period and often longitudinal.
Examples in psychology: Phineas Gage (frontal-lobe damage and personality change); HM / Henry Molaison, who developed severe anterograde amnesia after bilateral hippocampal removal, providing key evidence for the hippocampus in memory (Scoville & Milner, 1957); Little Hans (Freud, 1909), used to support psychoanalytic theory.
| Strengths | Limitations |
|---|---|
| Rich, in-depth data capturing real complexity | Cannot generalise from a single case |
| Can study rare/unique cases not recreatable experimentally | Researcher bias — subjective, selective interpretation |
| Often longitudinal — tracks change over time | Retrospective data may be inaccurate |
| Can challenge established theory with a single counter-example | Hard to replicate; confidentiality is difficult when the case is identifiable |
The case of HM illustrates both the strengths and the limitations vividly. Because HM's amnesia was unique and could never, on ethical grounds, have been produced experimentally, his case yielded irreplaceable insight into the role of the hippocampus in forming new long-term memories. It also directly challenged the prevailing view that memory was a single, unitary store: a single counter-example forcing theoretical change. Yet the findings rest on one individual with a very specific lesion, so generalisation is uncertain, and much of the data depended on the interpretation of the researchers who worked with him over decades. This is the central tension of the case-study method: unrivalled depth purchased at the cost of generalisability and objectivity, which is why case studies are most powerful when they complement larger, controlled studies rather than standing alone.
A sample is the group of participants actually studied, drawn from a target population. The aim is a representative sample so findings can be generalised.