Data Handling and Analysis

Once data have been collected, they must be organised, summarised and analysed before any conclusions can be drawn. Raw scores in a table tell us very little; we need ways to capture the typical value, the spread of values and the shape of the distribution, and to choose the right way to display the pattern. At A-Level you must understand the types of data, the measures of central tendency and dispersion (including the standard deviation and its formula), how to present data, normal and skewed distributions, and the levels of measurement (nominal, ordinal and interval, with ratio as a fourth level treated together with interval in this lesson). Together these determine how data can legitimately be analysed and, ultimately, which statistical test is appropriate.

Key Definition: Data analysis is the process of organising, summarising and interpreting collected data to identify patterns, draw conclusions and evaluate hypotheses.

Spec Mapping

This lesson addresses the following points in AQA A-Level Psychology (7182), Section 4.2 (Research methods):

Quantitative and qualitative data; primary and secondary data; meta-analysis.
Descriptive statistics: measures of central tendency — mean, median, mode; measures of dispersion — range and standard deviation; calculation of percentages.
Presentation and display of quantitative data: graphs, tables, scattergrams, bar charts, histograms.
Distributions: normal and skewed distributions; characteristics of normal and skewed distributions.
Levels of measurement: nominal, ordinal and interval.

Assessment objectives engaged: AO1 (definitions and properties), AO2 (selecting, calculating and interpreting the right statistic/graph for a given dataset or scenario) and AO3 (evaluating the appropriateness of a measure or display). These questions are application-heavy and may require short calculations.

Types of Data

Quantitative vs Qualitative Data

Type	Description	Example	Strengths	Limitations
Quantitative	Numerical data, measurable and analysable statistically	Reaction times (ms), test scores, error counts	Objective; easy to analyse and compare; allows statistical testing	May lack depth; can oversimplify complex behaviour
Qualitative	Non-numerical data in words, descriptions or themes	Interview transcripts, diaries, open responses	Rich, detailed; captures meaning and complexity	Subjective; harder to analyse/compare; interpretation bias

The distinction is not just descriptive but reflects a deeper methodological divide. Quantitative approaches sit comfortably with psychology's scientific aspirations — objectivity, measurement, statistical analysis and comparison across many people — but risk reducing rich human experience to a bare number. Qualitative approaches preserve meaning, context and the participant's own perspective, but are harder to analyse systematically and more open to the researcher's interpretation. Neither is inherently superior; the right choice depends on the research question. A study of how many people relapse after therapy calls for quantitative data, whereas a study of what relapse feels like calls for qualitative data.

Exam Tip: Many studies collect both. A stress study might measure cortisol (quantitative) and interview participants (qualitative). Using more than one method or data type — triangulation — strengthens validity.

Primary vs Secondary Data and Meta-Analysis

Type	Description	Strengths	Limitations
Primary data	Collected first-hand by the researcher for this study	Directly relevant; control over collection	Time-consuming; costly; possibly small samples
Secondary data	Already collected by others for another purpose (e.g. official statistics, prior studies, records)	Large datasets, cheap, allows longitudinal use	May not fit the question; no control over quality; may be outdated

A meta-analysis combines the quantitative results of many separate studies on the same question — often by calculating an overall effect size — to reach a more reliable conclusion than any single study. Its strength is a very large effective sample and greater power; its weakness is publication bias (the file-drawer problem), since unpublished null results are missing, and the inclusion of methodologically weak studies can distort the overall figure.

Choosing between primary and secondary data is a genuine design decision. A developmental psychologist wanting precise, purpose-built measures of children's attachment would collect primary data through their own observations, accepting the cost in time and money. A researcher investigating long-term trends in suicide rates, by contrast, would draw on secondary data from national records — instantly providing decades of data across whole populations, at the price of having no control over how "suicide" was recorded or which variables were captured. The two are not mutually exclusive: many studies collect primary data and contextualise it against secondary sources, and a meta-analysis is itself a sophisticated use of secondary data.

Measures of Central Tendency

Measures of central tendency summarise a dataset with a single "typical" or "central" value.

Key Definition: The mean is the arithmetic average, found by dividing the sum of all values by the number of values:

$\bar{x} = \frac{\sum x}{n}$

where $\bar{x}$ is the mean, $\sum x$ is the sum of all the scores, and $n$ is the number of scores.

Measure	How calculated	Strengths	Limitations
Mean	$\bar{x} = \frac{\sum x}{n}$	Uses all data; most sensitive; required for parametric analysis of interval/ratio data	Distorted by outliers; may give an "impossible" value (e.g. 2.4 children)
Median	Middle value when ranked (mean of the two middle values if $n$ is even)	Not affected by outliers; good for skewed/ordinal data	Ignores most of the data; less sensitive than the mean
Mode	Most frequent value	The only one usable with nominal data; is a real value	May be unrepresentative; can be bimodal or absent

Worked example. Data: 3, 5, 7, 8, 8, 10, 12, 14, 47

Mean $= \frac{3+5+7+8+8+10+12+14+47}{9} = \frac{114}{9} = 12.67$
Median $=$ the 5th (middle) value when ranked $= 8$
Mode $= 8$ (the only repeated value)

Here the mean (12.67) is dragged upward by the outlier 47 and misrepresents the bulk of the data, so the median (8) is the better summary.
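
The three measures for this dataset can be checked with a few lines of Python. This is a minimal sketch using only the standard-library `statistics` module; the variable names are illustrative and not part of the specification.

```python
from statistics import mean, median, multimode

# Scores from the worked example above
scores = [3, 5, 7, 8, 8, 10, 12, 14, 47]

mean_score = mean(scores)        # sum of the scores divided by n
median_score = median(scores)    # middle value once the scores are ranked
modes = multimode(scores)        # list of the most frequent value(s)

print(f"mean = {mean_score:.2f}, median = {median_score}, mode(s) = {modes}")
```

`multimode` returns a list, which makes a bimodal dataset visible rather than silently reporting only one mode.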

Exam Tip: With outliers or a skewed distribution, recommend the median. With normally distributed interval/ratio data, prefer the mean, because it uses every data point and feeds parametric tests.

Matching the Measure to the Level of Measurement

A frequent exam task is to justify which measure of central tendency to report, and the level of measurement (covered later in this lesson) is the deciding factor:

Nominal data (named categories, e.g. preferred therapy type) can only be summarised by the mode, because categories cannot be ranked or added — it is meaningless to "average" the labels.
Ordinal data (ranked but unequal intervals, e.g. positions or rating-scale scores) are best summarised by the median, because the unequal gaps make the mean unreliable, while ranking still defines a meaningful middle.
Interval/ratio data (equal, measured intervals, e.g. reaction times) are best summarised by the mean, which uses every value and supports the most powerful (parametric) analyses — unless outliers or skew make the median safer.

A second worked example shows how the choice can change the story the data tell. Suppose seven participants take a memory test (scores out of 20): 4, 6, 6, 7, 8, 9, 20.

Mode $= 6$ (the only repeated score)
Median $=$ the 4th (middle) value when ranked $= 7$
Mean $= \frac{4+6+6+7+8+9+20}{7} = \frac{60}{7} = 8.57$

The single high score of 20 inflates the mean to 8.57, above five of the seven scores, whereas the median (7) and mode (6) sit comfortably among the bulk of the data. Reporting only the mean would overstate typical performance: a concrete illustration of why the choice of measure matters and why outliers must always be inspected.
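
One way to see how sensitive each measure is to the outlier is to recompute it with the score of 20 left out. The sketch below is illustrative only; dropping a score is done here to show sensitivity, not as a recommendation for handling real data.

```python
from statistics import mean, median

# Memory-test scores from the second worked example
scores = [4, 6, 6, 7, 8, 9, 20]
without_outlier = [s for s in scores if s != 20]

mean_all = mean(scores)                  # uses every score, including 20
median_all = median(scores)              # 4th of 7 ranked scores
mean_trim = mean(without_outlier)        # mean of the remaining six scores
median_trim = median(without_outlier)    # mean of the two middle scores (6 and 7)

print(f"with outlier:    mean = {mean_all:.2f}, median = {median_all}")
print(f"without outlier: mean = {mean_trim:.2f}, median = {median_trim}")
```

The mean moves by almost two points when one score is removed, while the median moves by half a point.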

Measures of Dispersion

Measures of dispersion describe how spread out the data are.

Measure	What it is	Strengths	Limitations
Range	Highest − lowest value (sometimes +1)	Quick and easy	Uses only two values; distorted by outliers; ignores the middle of the data
Standard deviation (SD)	Average distance of each value from the mean	Uses all data; far more informative; underpins parametric tests	Harder to calculate; affected by outliers (less than the range); needs interval data

Key Definition: The standard deviation is a measure of dispersion indicating the average distance of each data point from the mean. A small SD means scores cluster tightly around the mean; a large SD means they are widely spread.

The standard deviation is calculated as

$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$

where $x$ is each score, $\bar{x}$ is the mean, and $n$ is the number of scores. In words: find each score's deviation from the mean $(x - \bar{x})$, square it (so negatives do not cancel), sum the squares, divide by $n-1$ (giving the variance), then take the square root to return to the original units.

Two design features of this formula are worth understanding. The deviations are squared for a precise reason: simply adding the raw deviations would always give zero, because positive and negative deviations cancel exactly around the mean; squaring removes the signs so that the spread does not vanish, with the side-effect that large deviations are weighted especially heavily. Taking the square root at the end then undoes the squaring, returning the answer to the original measurement units (so an SD of reaction times is in milliseconds, not "milliseconds squared"). The intermediate quantity, the variance $\left(s^2 = \frac{\sum (x-\bar{x})^2}{n-1}\right)$, is itself a valid measure of spread and is important in more advanced statistics, but at A-Level the standard deviation is preferred precisely because it is in interpretable units.

Worked example. Data: 4, 6, 7, 8, 10

Mean: $\bar{x} = \frac{4+6+7+8+10}{5} = 7$
Deviations $(x-\bar{x})$: $-3, -1, 0, +1, +3$
Squared deviations $(x-\bar{x})^2$: $9, 1, 0, 1, 9$
Sum of squares: $\sum (x-\bar{x})^2 = 20$
Variance: $\frac{20}{5-1} = 5$
Standard deviation: $s = \sqrt{5} = 2.24$
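
The same steps can be followed in code. This is a sketch of the $n-1$ formula given above, written out step by step and then compared with the standard library's `statistics.stdev`, which uses the same denominator; the variable names are illustrative.

```python
from math import sqrt
from statistics import stdev

# Scores from the worked example above
scores = [4, 6, 7, 8, 10]
n = len(scores)

mean_score = sum(scores) / n                     # 35 / 5
deviations = [x - mean_score for x in scores]    # each score minus the mean
squared = [d ** 2 for d in deviations]           # squaring removes the signs
sum_of_squares = sum(squared)
variance = sum_of_squares / (n - 1)              # divide by n - 1
sd = sqrt(variance)                              # back to the original units

print("sum of raw deviations:", sum(deviations))   # cancels to zero
print("sum of squares:", sum_of_squares)
print("variance:", variance)
print(f"standard deviation: {sd:.2f}")
print(f"statistics.stdev:   {stdev(scores):.2f}")
```

Printing the sum of the raw deviations shows directly why squaring is needed: without it the positive and negative deviations cancel.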

Interpreting SD: a small SD means participants responded similarly and the mean represents them well; a large SD signals high variability, so the mean is a less trustworthy summary and there may be important individual differences. Because the SD is in the same units as the original data, it can be interpreted concretely: a reaction-time mean of 320 ms with an SD of 15 ms tells us most participants responded within roughly 305–335 ms, whereas the same mean with an SD of 90 ms tells us responses were scattered far more widely. This is also why, in a normal distribution, the SD is so powerful: about 68% of scores lie within $\pm 1$ SD of the mean and about 95% within $\pm 2$ SD, so the SD effectively maps out where most of the data sit.
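
The 68% and 95% figures can be checked numerically. The sketch below assumes, purely for illustration, that reaction times follow an exact normal distribution with the mean of 320 ms and SD of 15 ms used above; real reaction-time data are often skewed, so these proportions would only be approximate.

```python
from statistics import NormalDist

# Assumed model: perfectly normal reaction times (illustrative only)
reaction_times = NormalDist(mu=320, sigma=15)

# Proportion of scores between 305 and 335 ms (mean plus or minus 1 SD)
within_1sd = reaction_times.cdf(335) - reaction_times.cdf(305)

# Proportion of scores between 290 and 350 ms (mean plus or minus 2 SD)
within_2sd = reaction_times.cdf(350) - reaction_times.cdf(290)

print(f"within 1 SD: {within_1sd:.1%}")
print(f"within 2 SD: {within_2sd:.1%}")
```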

The range in more detail. The range is the crudest measure of dispersion because it depends entirely on the two most extreme scores and ignores everything in between. Two datasets can share an identical range yet have utterly different spreads: for the sets {2, 2, 2, 2, 10} and {2, 4, 6, 8, 10} the range is 8 in both cases, yet the first is tightly bunched at the low end with one outlier, while the second is evenly spread. Adding 1 to the range (giving the highest minus lowest plus one) is sometimes done to account for the fact that measured values represent rounded intervals — for example, scores recorded as whole numbers actually span half a unit either side. The range is nonetheless useful as a quick, easily understood indication of spread, particularly for small datasets or as a first check before computing the SD.
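
A short sketch makes the limitation concrete for the two sets above. The helper `value_range` is a hypothetical name introduced here; the SD is computed with the standard library's `statistics.stdev`.

```python
from statistics import stdev

bunched = [2, 2, 2, 2, 10]   # four identical scores plus one outlier
even = [2, 4, 6, 8, 10]      # evenly spread scores

def value_range(scores, inclusive=False):
    """Highest minus lowest value; add 1 if the inclusive convention is used."""
    spread = max(scores) - min(scores)
    return spread + 1 if inclusive else spread

for name, data in (("bunched", bunched), ("even", even)):
    print(f"{name}: range = {value_range(data)}, "
          f"range (+1) = {value_range(data, inclusive=True)}, "
          f"SD = {stdev(data):.2f}")
```

The range is 8 for both sets, whereas the SD differs because it responds to every score rather than only the two extremes.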

Exam Tip: If asked to compare two conditions, look at both the means and the SDs. Two groups can share a mean yet differ greatly in spread — the SD reveals consistency that the mean alone conceals. Quoting the SD in the original units (e.g. "most scores fell within one SD, i.e. 305–335 ms") demonstrates genuine understanding rather than rote recall.

Calculating Percentages

Converting raw frequencies to percentages standardises data so different-sized groups can be compared:

$\text{percentage} = \frac{\text{part}}{\text{whole}} \times 100$

For example, if 18 of 24 participants recalled a word, that is $\frac{18}{24} \times 100 = 75\%$. To find a percentage of a quantity, multiply by the decimal equivalent (e.g. 30% of 80 is $0.30 \times 80 = 24$). Percentages also make change easy to express: if a mean error rate fell from 40% to 30%, that is a fall of 10 percentage points but a $\frac{40-30}{40}\times 100 = 25\%$ reduction relative to the starting figure. Examiners sometimes probe this distinction, and it matters when reporting the size of an effect.
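
The three calculations in this paragraph can be sketched as follows. The function name `percentage` is illustrative, and the error rates are treated here as percentages (an assumption needed for the "percentage points" reading).

```python
def percentage(part, whole):
    """Express part as a percentage of whole."""
    return part / whole * 100

# 18 of 24 participants recalled a word
recall_pct = percentage(18, 24)

# 30% of 80: multiply by the decimal equivalent
thirty_pct_of_80 = 0.30 * 80

# Error rate falling from 40% to 30% (rates assumed to be percentages)
rate_before, rate_after = 40, 30
point_change = rate_before - rate_after                             # percentage points
relative_change = percentage(rate_before - rate_after, rate_before)  # relative to start

print(f"recall: {recall_pct:.0f}%")
print(f"30% of 80: {thirty_pct_of_80:.0f}")
print(f"fall of {point_change} percentage points, "
      f"a {relative_change:.0f}% relative reduction")
```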

Choosing the Right Descriptive Statistic

Bringing the measures together, the choice of descriptive statistic follows directly from the level of measurement and the shape of the data:

graph TD
    A[What level of measurement?] --> B[Nominal]
    A --> C[Ordinal]
    A --> D[Interval / ratio]
    B --> B1[Central tendency: MODE<br/>Display: bar chart]
    C --> C1[Central tendency: MEDIAN<br/>Dispersion: range<br/>Display: bar chart]
    D --> D2{Normally distributed,<br/>no outliers?}
    D2 -->|Yes| D3[Central tendency: MEAN<br/>Dispersion: standard deviation<br/>Display: histogram]
    D2 -->|No / skewed / outliers| D4[Central tendency: MEDIAN<br/>Dispersion: range<br/>Display: histogram]
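
The decision tree above can also be written as a small lookup function. This is only a sketch of the diagram's logic: the function name, argument names and returned labels are hypothetical, and judging whether data are "normally distributed, no outliers" is left to the researcher.

```python
def choose_descriptives(level, normal_no_outliers=True):
    """Return the suggested statistics and display for a level of measurement."""
    level = level.lower()
    if level == "nominal":
        return {"central_tendency": "mode", "dispersion": None,
                "display": "bar chart"}
    if level == "ordinal":
        return {"central_tendency": "median", "dispersion": "range",
                "display": "bar chart"}
    if level in ("interval", "ratio"):
        if normal_no_outliers:
            return {"central_tendency": "mean",
                    "dispersion": "standard deviation",
                    "display": "histogram"}
        # Skewed data or outliers: fall back to the more robust measures
        return {"central_tendency": "median", "dispersion": "range",
                "display": "histogram"}
    raise ValueError(f"unknown level of measurement: {level!r}")

print(choose_descriptives("nominal"))
print(choose_descriptives("interval", normal_no_outliers=False))
```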

Presentation of Data

The display must match the data type and purpose.

Display	Used for	Key feature
Bar chart	Categorical/nominal data, or summary values (e.g. condition means)	Discrete bars with gaps; height = frequency/value
Histogram	Continuous data	Bars with no gaps; x-axis shows continuous intervals (bins); area represents frequency
Scattergram	Correlational data (two co-variables)	One dot per pair; pattern shows direction/strength; a line of best fit can be added
Frequency table	Summarising raw data before graphing/calculation	Each value/interval listed with its frequency

Reading and Interpreting Graphs

Producing a graph is only half the skill; the exam frequently asks you to interpret one. For a bar chart comparing condition means, the key is to compare bar heights and comment on the size of the difference, ideally relating it back to the hypothesis. For a histogram, the shape matters: a symmetrical bell indicates a normal distribution, a bunching to the left with a tail to the right indicates positive skew, and a long left tail indicates negative skew. For a scattergram, you read off the direction (do the points slope up, down, or show no pattern?) and the strength (how tightly do they cluster around an imaginary line of best fit?). A common error is to over-interpret: a difference in bar heights does not, by itself, establish statistical significance — that requires an inferential test (the next lesson), and a scattergram shows correlation, never causation.

From Raw Data to Graph: A Worked Sequence

Suppose 30 participants' reaction times (ms) are recorded. The raw list is unintelligible, so the researcher first builds a frequency table by grouping scores into equal class intervals (e.g. 200–249, 250–299, 300–349, …), counting how many fall in each. This table can then be turned into a histogram, with the continuous time intervals on the x-axis and frequency on the y-axis and no gaps between bars. The shape that emerges (likely a positive skew for reaction times) tells the researcher at a glance which measure of central tendency to report. Had the data instead been categorical — say, the number of participants choosing each of three response strategies — a bar chart (with gaps) would be correct, and a scattergram would be required only if two co-variables were being related. This sequence — raw data → frequency table → appropriate graph → choice of summary statistic — is the standard workflow the exam expects you to understand.
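
The first two steps of this workflow can be sketched in code. The 30 reaction times below are invented purely for illustration (they are not data from any study), and `frequency_table` is a hypothetical helper that groups scores into equal class intervals of 50 ms starting at 200 ms.

```python
from collections import Counter

# Hypothetical reaction times in ms (illustrative values only)
reaction_times = [
    212, 228, 235, 241, 247,
    252, 255, 259, 263, 266, 271, 274, 278, 283, 288, 291, 296,
    303, 308, 314, 322, 329, 337, 345,
    356, 368, 381, 394,
    417, 452,
]

def frequency_table(scores, start, width):
    """Count how many scores fall in each equal-width class interval."""
    counts = Counter((s - start) // width for s in scores)
    table = []
    for i in range(max(counts) + 1):
        low = start + i * width
        table.append((f"{low}-{low + width - 1}", counts.get(i, 0)))
    return table

table = frequency_table(reaction_times, start=200, width=50)

# Rough text version of the histogram: one '#' per participant
for interval, freq in table:
    print(f"{interval}  {'#' * freq}  ({freq})")
```

With these invented values the bars bunch towards the lower intervals and tail off to the right, which is the pattern described above as positive skew.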

Exam Tip: Bar charts have gaps (categories are separate); histograms have no gaps (data are continuous). Always title the graph and label both axes with units. Choosing the wrong graph type — or claiming a graphed difference is "significant" without a test — loses marks.
