Research methods is the most heavily examined single strand of AQA A-Level Psychology, yet it is the one students most often under-prepare. The Research Methods section on Paper 2 alone is worth 48 marks — half the paper — and methods questions are also embedded in every section of Papers 1 and 3. On top of this, AQA mandates that at least 10% of all marks assess mathematical skills, and those marks are overwhelmingly clustered here. The good news is that research-methods and maths marks are the most predictable on the whole qualification: the questions recur in fixed forms ("identify the design", "justify a statistical test", "calculate the percentage", "carry out a sign test"), and the technique for each is learnable. This lesson teaches that technique with worked numerical examples and KaTeX formulae, plus a decision tree for choosing statistical tests and banded model answers showing what full-mark working looks like.
| Skill cluster | AO | Where assessed |
|---|---|---|
| Experimental methods, designs, sampling | AO1 / AO2 / AO3 | Paper 2 Section C; embedded in Papers 1 & 3 |
| Descriptive statistics & data handling | AO2 (maths) | Paper 2 Section C; data items anywhere |
| Choosing & justifying inferential tests | AO2 | Paper 2 Section C |
| Probability, significance, Type I/II errors | AO1 / AO2 | Paper 2 Section C |
| The sign test (the one you must calculate) | AO2 (maths) | Paper 2 Section C |
| Percentages, fractions, ratios, SD, sig figs | AO2 (maths) | Minimum 10% of marks across all papers |
| Ethics, peer review, scientific process | AO1 / AO3 | All papers |
Key Point: Treat research methods as a cross-cutting skill, not a Paper 2 silo. A "Design a study to investigate conformity" question can appear in the Social Influence section of Paper 1; a "justify your statistical test" question can appear inside a Paper 3 option.
| Type | Description | Strengths | Limitations |
|---|---|---|---|
| Laboratory experiment | Controlled environment; researcher manipulates the IV, measures the DV | High control; easy to replicate; can establish cause and effect | Low ecological validity; demand characteristics; artificial |
| Field experiment | Natural setting; researcher still manipulates the IV | Higher ecological validity; more natural behaviour | Less control; harder to replicate; ethical issues (participants often unaware they are in a study, so no informed consent) |
| Natural experiment | The IV occurs naturally (not manipulated); the DV is measured | Allows study of variables unethical/impractical to manipulate (e.g. institutionalisation) | Cause and effect harder to establish (IV not manipulated, no random allocation); confounding variables |
| Quasi-experiment | IV is an existing participant characteristic (age, gender, diagnosis) | Allows comparison of pre-existing groups | No random allocation; differences may be confounds |
The design determines how participants are allocated to conditions.
| Design | Each participant... | Strengths | Limitations | Fix |
|---|---|---|---|---|
| Independent groups | takes part in ONE condition | No order effects; less likely to guess aim | Individual differences confound; needs more participants | Random allocation |
| Repeated measures | takes part in ALL conditions | Controls individual differences; fewer participants | Order effects; demand characteristics | Counterbalancing (ABBA) |
| Matched pairs | is paired then split across conditions | Reduces individual differences; no order effects | Time-consuming; cannot match on everything | Match on the most relevant variable |
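The two procedural fixes in the table, random allocation and counterbalancing, can be sketched in Python. This is a minimal illustration rather than a prescribed procedure: the participant labels `P1`..`P8`, the fixed seed and the function names are all assumptions made for the example. Counterbalancing is shown in its simplest form, where half the participants complete condition A then B and the other half B then A.

```python
import random

def random_allocation(participants, seed=None):
    """Independent groups: shuffle, then split into two conditions."""
    rng = random.Random(seed)
    shuffled = list(participants)  # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"A": shuffled[:half], "B": shuffled[half:]}

def counterbalance(participants):
    """Repeated measures: alternate the order of conditions (AB, BA, AB, ...)."""
    orders = {}
    for i, person in enumerate(participants):
        orders[person] = ["A", "B"] if i % 2 == 0 else ["B", "A"]
    return orders

people = [f"P{i}" for i in range(1, 9)]  # hypothetical participants P1..P8
groups = random_allocation(people, seed=1)
orders = counterbalance(people)
print(groups)
print(orders)
```

Random allocation spreads individual differences across the two groups by chance. Counterbalancing does not remove order effects; it balances them across the two conditions.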
Worked technique — "Identify the design": Read the procedure and ask one question: did the same people do both conditions? If yes, it is repeated measures. If different people did each condition, it is independent groups, unless participants were first paired on a relevant variable and one member of each pair was placed in each condition, in which case it is matched pairs. State the design and one justification drawn from the stem.
The sample is drawn from the target population; the method determines representativeness.
| Method | How it works | Strengths | Limitations |
|---|---|---|---|
| Random | Every member has an equal chance (names from a hat, random-number generator) | Free from researcher bias; likely representative | Time-consuming; selected people may decline |
| Systematic | Every nth person on a list | Objective; easy | A periodic pattern in the list can bias it |
| Stratified | Population split into strata; randomly sampled in proportion | Highly representative of key characteristics | Very time-consuming; needs detailed population data |
| Opportunity | Whoever is available and willing | Quick, easy, cheap | Highly biased; over-represents one group |
| Volunteer | Participants self-select (advert) | Willing participants; good for sensitive topics | Volunteer bias (more motivated/extravert) |
Exam Tip: When evaluating a sampling method, always address (a) representativeness of the target population and (b) any systematic bias the method introduces.
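The first three methods in the table are procedures, so they can be sketched in code. The following is a minimal sketch, assuming the target population is available as a Python list; the function names and the strata used in the usage example are hypothetical.

```python
import random

def random_sample(population, n, seed=None):
    """Random sampling: every member has an equal chance of selection."""
    return random.Random(seed).sample(population, n)

def systematic_sample(population, step, start=0):
    """Systematic sampling: every nth member of the list."""
    return population[start::step]

def stratified_sample(strata, n, seed=None):
    """Stratified sampling: sample each stratum in proportion to its size.

    strata maps a stratum name to the list of its members. Rounding means
    the total can differ slightly from n when proportions are not whole.
    """
    rng = random.Random(seed)
    total = sum(len(members) for members in strata.values())
    sample = []
    for members in strata.values():
        k = round(n * len(members) / total)  # proportional share
        sample.extend(rng.sample(members, k))
    return sample

population = list(range(100))  # hypothetical participant IDs 0..99
strata = {"Year 12": population[:60], "Year 13": population[60:]}
print(random_sample(population, 10, seed=0))
print(systematic_sample(population, 10))
print(stratified_sample(strata, 10, seed=0))
```

With 60 members in one stratum and 40 in the other, a stratified sample of 10 takes 6 and 4 respectively, mirroring the population proportions.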
| Distinction | Type A | Type B |
|---|---|---|
| Quantitative vs Qualitative | Quantitative: numerical; analysed statistically; objective but may lose depth | Qualitative: words/themes; rich and detailed but subjective and hard to replicate |
| Primary vs Secondary | Primary: collected first-hand for this study; tailored but costly | Secondary: pre-existing (stats, records); cheap but may not fit the question |
Not every study is an experiment. AQA examines several non-experimental techniques, and a common question asks you to evaluate or design one.
| Type | Meaning | Note |
|---|---|---|
| Naturalistic vs controlled | In a natural setting vs a structured environment | Naturalistic = high ecological validity, low control |
| Covert vs overt | Participants unaware vs aware they are observed | Covert reduces demand characteristics but raises ethics (consent) |
| Participant vs non-participant | Observer joins the group vs stays apart | Participant gives insight but risks losing objectivity |
Observations are made systematic using behavioural categories (clearly operationalised, observable actions) and sampling methods such as event sampling (count each occurrence of a behaviour) or time sampling (record what is happening at fixed intervals). Inter-observer reliability — agreement between two observers — is checked by correlating their records.
Questionnaires and interviews (structured, semi-structured, unstructured) gather data directly from participants. Strengths include access to large samples (questionnaires) and rich detail (unstructured interviews); limitations include social desirability bias and response sets. Good design avoids leading questions, double-barrelled questions and jargon.
A correlation examines the relationship between two co-variables; it does not manipulate an IV, so it cannot establish cause and effect. The strength and direction are summarised by a correlation coefficient between −1 and +1: a value near +1 is a strong positive correlation, near −1 a strong negative correlation, and near 0 no correlation. The classic evaluation point is the third-variable problem — an apparent relationship between two co-variables may be driven by an unmeasured third factor.
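To see where a coefficient between −1 and +1 comes from, here is a minimal sketch that computes Pearson's r by hand. The data (hours revised and test scores) are invented for illustration and the function name is an assumption. The same calculation could be applied to two observers' tallies when checking inter-observer reliability.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and sums of squared deviations
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    # Undefined (division by zero) if either variable never varies
    return sxy / sqrt(sxx * syy)

hours_revised = [1, 2, 3, 4, 5]   # hypothetical co-variable 1
test_score = [2, 4, 5, 4, 5]      # hypothetical co-variable 2
r = pearson_r(hours_revised, test_score)
print(round(r, 2))  # 0.77, a fairly strong positive correlation
```

A coefficient of 0.77 describes the relationship only; it does not show that revising caused the higher scores.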
These two concepts are examinable in their own right and provide ready-made AO3 across the whole course.
| Concept | Question it answers | Types / checks |
|---|---|---|
| Reliability | Is the measure consistent? | Test-retest (same test, two occasions); inter-observer (two observers agree) |
| Validity | Does it measure what it claims? | Internal (free of confounds); external (ecological, population, temporal); face and concurrent validity |
Ways to improve reliability include standardising procedures, operationalising behavioural categories and training observers. Ways to improve validity include controlling extraneous variables, using a control group, and checking a new measure against an established one (concurrent validity).
| Measure | Method | Strength | Limitation |
|---|---|---|---|
| Mean | sum ÷ number of values | Uses all data; most sensitive | Distorted by outliers |
| Median | middle value when ordered | Unaffected by outliers | Ignores most data |
| Mode | most frequent value | Works for categorical data | May not exist / may be multiple |
The mean of a data set is defined as:
$$\bar{x} = \frac{\sum x}{n}$$
where $\sum x$ is the sum of all scores and $n$ is the number of scores. For the data set 4, 7, 7, 9, 13:
$$\bar{x} = \frac{4 + 7 + 7 + 9 + 13}{5} = \frac{40}{5} = 8$$
The median is the middle value (7) and the mode is the most frequent value (7).
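The same three values can be checked with Python's standard `statistics` module. The second data set, with 13 swapped for an outlier of 63, is a hypothetical addition to show why the mean is described as distorted by outliers while the median is not.

```python
from statistics import mean, median, multimode

scores = [4, 7, 7, 9, 13]
print(mean(scores))       # 8
print(median(scores))     # 7
print(multimode(scores))  # [7]  (a list, because a data set can have several modes)

# Hypothetical variant: replace 13 with an extreme score of 63
with_outlier = [4, 7, 7, 9, 63]
print(mean(with_outlier))    # 18
print(median(with_outlier))  # 7
```

One extreme score more than doubles the mean, but the median stays at 7.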
| Measure | Method | Strength | Limitation |
|---|---|---|---|
| Range | highest − lowest (sometimes +1) | Quick | Uses only two values; sensitive to outliers |
| Standard deviation | average distance of scores from the mean | Uses all data; precise | Harder to compute; affected by outliers |
The standard deviation measures the average spread of scores around the mean. One common form is:
$$\sigma = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}$$
Interpreting it is the examinable skill: a large standard deviation means scores are widely spread out from the mean (more variability); a small standard deviation means scores cluster tightly around the mean (more consistency). For the data above ($\bar{x} = 8$), the squared deviations are 16, 1, 1, 1, 25, giving:
$$\sigma = \sqrt{\frac{44}{5}} = \sqrt{8.8} \approx 2.97$$
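The worked example can be reproduced step by step in Python. This sketch follows the formula used above (dividing by n) and checks it against `statistics.pstdev`. The comparison with `statistics.stdev`, which divides by n − 1, is included only to show that the two common forms give different answers.

```python
from math import sqrt
from statistics import pstdev, stdev

scores = [4, 7, 7, 9, 13]
n = len(scores)
mean_score = sum(scores) / n                               # 8.0
squared_deviations = [(x - mean_score) ** 2 for x in scores]
variance = sum(squared_deviations) / n                     # 44 / 5 = 8.8
sigma = sqrt(variance)

print(squared_deviations)        # [16.0, 1.0, 1.0, 1.0, 25.0]
print(round(sigma, 2))           # 2.97
print(round(pstdev(scores), 2))  # 2.97 (divides by n, as in the formula above)
print(round(stdev(scores), 2))   # 3.32 (divides by n - 1)
```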
Exam Tip: You must be able to calculate the mean, median, mode and range. For standard deviation, AQA expects you to understand and interpret it (high SD = spread out; low SD = clustered) and to recognise the formula, rather than to compute it unaided under exam conditions.
Choosing a test depends on three questions, asked in order:
graph TD
A[What is the hypothesis?] --> B[Test of Difference]
A --> C[Test of Correlation]
B --> D{Related or Unrelated?}
D --> E[Related Design]
D --> F[Unrelated Design]
E --> G{Level of Data?}
F --> H{Level of Data?}
G --> I[Nominal: Sign Test]
G --> J[Ordinal: Wilcoxon]
G --> K[Interval: Related t-test]
H --> L[Nominal: Chi-squared]
H --> M[Ordinal: Mann-Whitney U]
H --> N[Interval: Unrelated t-test]
C --> O{Level of Data?}
O --> P[Ordinal: Spearman's rho]
O --> Q[Interval: Pearson's r]
| Test | Difference or Correlation | Design | Level of Data |
|---|---|---|---|
| Sign test | Difference | Related | Nominal |
| Wilcoxon signed-rank | Difference | Related | At least ordinal |
| Related t-test | Difference | Related | Interval |
| Chi-squared | Difference / association | Unrelated | Nominal |
| Mann-Whitney U | Difference | Unrelated | At least ordinal |
| Unrelated t-test | Difference | Unrelated | Interval |
| Spearman's rho | Correlation | N/A | At least ordinal |
| Pearson's r | Correlation | N/A | Interval |
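A useful mnemonic is "Carrots Should Come Mashed With Swede Under Roast Potatoes". It covers all the tests, not just the difference tests: picture a grid whose rows are nominal, ordinal and interval data and whose columns are unrelated design, related design and correlation/association. Reading across each row in turn, the initials give Chi-squared, Sign, Chi-squared; Mann-Whitney, Wilcoxon, Spearman's; Unrelated t, Related t, Pearson's. The decision tree is still more reliable than any rhyme.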
| Level | Description | Example |
|---|---|---|
| Nominal | Named categories; frequency counts | How many chose A vs B |
| Ordinal | Can be ranked; unequal intervals | Race positions; Likert ratings |
| Interval | Equal intervals on a scale | Time in seconds; standardised test scores |
Exam Tip: The classic question — "Identify a suitable statistical test and justify your choice" — wants the test plus three justifications: the hypothesis (difference/correlation), the design (related/unrelated), and the level of data (nominal/ordinal/interval). Drop any one and you cap the marks.
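The decision table is effectively a lookup keyed on those three justifications. A minimal sketch, assuming the eight combinations listed in the table above and a hypothetical function name `choose_test`:

```python
# Keys are (hypothesis, design, level of data); design is None for correlations.
TESTS = {
    ("difference", "related", "nominal"): "Sign test",
    ("difference", "related", "ordinal"): "Wilcoxon signed-rank",
    ("difference", "related", "interval"): "Related t-test",
    ("difference", "unrelated", "nominal"): "Chi-squared",
    ("difference", "unrelated", "ordinal"): "Mann-Whitney U",
    ("difference", "unrelated", "interval"): "Unrelated t-test",
    ("correlation", None, "ordinal"): "Spearman's rho",
    ("correlation", None, "interval"): "Pearson's r",
}

def choose_test(hypothesis, level, design=None):
    """Return the test name plus the justifications an answer should state."""
    if hypothesis == "correlation":
        design = None  # design does not apply to tests of correlation
    key = (hypothesis, design, level)
    if key not in TESTS:
        raise ValueError(f"no test listed for {key}")
    reasons = [f"the hypothesis predicts a {hypothesis}", f"the data are {level}"]
    if design is not None:
        reasons.insert(1, f"the design is {design}")
    return TESTS[key], reasons

test, reasons = choose_test("difference", "ordinal", design="unrelated")
print(test)     # Mann-Whitney U
print(reasons)
```

Returning the reasons alongside the test name mirrors the exam requirement: the name alone is not a full answer.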
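The conventional significance level in psychology is p≤0.05 (5%). This means the probability of obtaining results like these if the null hypothesis were true is 5% or less, so the researcher accepts up to a 5% risk of wrongly rejecting a true null hypothesis (a Type I error).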
A more stringent level such as p≤0.01 is used where a Type I error would be especially costly (for example, in drug research).
| Error | What happens | More likely when | Nickname |
|---|---|---|---|
| Type I | Reject a true null hypothesis (false positive) | Significance level too lenient (e.g. p≤0.10) | Optimistic error |
| Type II | Retain a false null hypothesis (false negative) | Significance level too stringent (e.g. p≤0.01) | Pessimistic error |
Key Point: p≤0.05 is a deliberate compromise. Relaxing to p≤0.10 raises the Type I risk; tightening to p≤0.01 raises the Type II risk. There is no level that minimises both at once.
For most tests you compare a calculated value with a critical value from a table (using N, the significance level, and whether the hypothesis is one- or two-tailed). The rule differs by test family, so learn it explicitly.
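As a sketch of that comparison, the code below assumes the convention commonly taught at A-Level, which is not spelled out in the text above. For the sign test, Wilcoxon and Mann-Whitney U, the calculated value must be equal to or less than the critical value. For chi-squared, the t-tests, Spearman's rho and Pearson's r, it must be equal to or greater. The function name and the numbers in the usage lines are illustrative placeholders, not values read from a critical-values table.

```python
# Assumed convention: which direction of comparison counts as significant.
LOWER_IS_SIGNIFICANT = {"Sign test", "Wilcoxon signed-rank", "Mann-Whitney U"}
HIGHER_IS_SIGNIFICANT = {
    "Chi-squared", "Related t-test", "Unrelated t-test",
    "Spearman's rho", "Pearson's r",
}

def is_significant(test, calculated, critical):
    """Compare a calculated value with a critical value for the named test.

    For t, rho and r, pass the magnitude of the calculated value.
    """
    if test in LOWER_IS_SIGNIFICANT:
        return calculated <= critical
    if test in HIGHER_IS_SIGNIFICANT:
        return calculated >= critical
    raise ValueError(f"unknown test: {test}")

# Placeholder numbers for illustration only
print(is_significant("Sign test", calculated=1, critical=1))         # True
print(is_significant("Chi-squared", calculated=3.2, critical=3.84))  # False
```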