Once data have been collected, they must be organised, summarised, and analysed in order to draw meaningful conclusions. At A-Level, you need to understand the distinction between different types of data, measures of central tendency and dispersion, methods of data presentation, distributions, and levels of measurement. These concepts are fundamental to evaluating research findings and conducting your own investigations.
Key Definition: Data analysis is the process of organising, summarising, and interpreting collected data to identify patterns, draw conclusions, and evaluate hypotheses.
| Type | Description | Example | Strengths | Limitations |
|---|---|---|---|---|
| Quantitative | Numerical data that can be measured and analysed statistically | Reaction times (ms), test scores, number of errors | Objective; easy to analyse; allows statistical testing; comparisons between groups | May lack depth; may oversimplify complex behaviours |
| Qualitative | Non-numerical data expressed in words, descriptions, or themes | Interview transcripts, diary entries, open-ended survey responses | Rich and detailed; captures the meaning and complexity of behaviour | Subjective; difficult to analyse and compare; researcher interpretation bias |
Exam Tip: Many studies collect both types of data. For example, a study on stress might measure cortisol levels (quantitative) and also interview participants about their experience (qualitative). Combining the two is called methodological triangulation; when the different sources point to the same conclusion, it can increase the validity of the research.
| Type | Description | Strengths | Limitations |
|---|---|---|---|
| Primary data | Data collected directly by the researcher for the specific purpose of the study | Directly relevant to the research question; researcher has control over collection methods | Time-consuming; expensive; may have small sample sizes |
| Secondary data | Data that have already been collected by someone else for a different purpose (e.g., government statistics, published studies, medical records) | Large datasets readily available; cost-effective; can be used for longitudinal analysis | May not be directly relevant; researcher has no control over how data were collected; may be outdated |
Measures of central tendency summarise a dataset by identifying the most "typical" or "central" value:
| Measure | Calculation | Strengths | Limitations |
|---|---|---|---|
| Mean | Sum of all values ÷ number of values | Uses all data points; most sensitive and powerful measure; suitable for interval/ratio data | Distorted by extreme values (outliers); may produce a value that does not exist in the dataset (e.g., 2.4 children) |
| Median | The middle value when data are arranged in order (for even-numbered datasets, the mean of the two middle values) | Largely unaffected by outliers; good for skewed distributions; easy to calculate | Does not use all data points; less sensitive than the mean; can be less representative of the full dataset |
| Mode | The most frequently occurring value | Can be used with nominal (categorical) data; represents an actual data value; easy to identify | May not be representative; a dataset may have multiple modes (bimodal/multimodal) or no mode at all; not useful for further statistical analysis |
Key Definition: The mean is the arithmetic average, calculated by dividing the sum of all values by the number of values.
Worked Example:
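Data: 3, 5, 7, 8, 8, 10, 12, 14, 47
Mean: 114 ÷ 9 = 12.67 (to 2 d.p.)
Median: the 5th of the 9 ordered values = 8
Mode: 8 (the only value that occurs twice)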
Notice that the mean (12.67) is pulled upward by the extreme value of 47, making it unrepresentative of the majority of the data. In this case, the median (8) provides a better summary.
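The figures in this worked example can be checked with a few lines of Python. This is only an illustrative sketch using the standard-library `statistics` module; the variable names are made up for the example, and recomputing the mean without the value 47 is an extra step added here to show how far a single outlier moves it.

```python
from statistics import mean, median, multimode

# Dataset from the worked example above.
scores = [3, 5, 7, 8, 8, 10, 12, 14, 47]

mean_score = mean(scores)      # sum of all values / number of values
median_score = median(scores)  # middle value once the data are in order
modes = multimode(scores)      # list of the most frequent value(s)

# Illustrative extra step: drop the extreme value and recompute the mean.
without_outlier = [x for x in scores if x != 47]
mean_without_outlier = mean(without_outlier)

print(f"mean   = {mean_score:.2f}")            # mean   = 12.67
print(f"median = {median_score}")              # median = 8
print(f"modes  = {modes}")                     # modes  = [8]
print(f"mean without 47 = {mean_without_outlier:.3f}")  # mean without 47 = 8.375
```

With the outlier removed, the mean falls to roughly the same value as the median, which is why the median is the safer summary for the full dataset.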
Exam Tip: If a dataset contains outliers or is skewed, recommend the median as the most appropriate measure of central tendency. If the data are normally distributed and measured on an interval or ratio scale, the mean is preferred because it uses all data points.
Measures of dispersion describe how spread out the data are:
| Measure | Calculation | Strengths | Limitations |
|---|---|---|---|
| Range | Highest value − lowest value (sometimes +1 is added) | Quick and easy to calculate | Only uses two data points; heavily influenced by outliers; ignores the distribution of data between extremes |
| Standard deviation (SD) | A measure of the average amount by which each data point differs from the mean | Uses all data points; more precise and informative than the range; essential for further statistical analysis | More difficult to calculate; affected by outliers (though less so than the range); requires interval/ratio data |
Key Definition: The standard deviation is a measure of dispersion that indicates the average distance of each data point from the mean. A small SD indicates data points are clustered close to the mean; a large SD indicates they are widely spread out.
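To see how outliers affect each measure of dispersion, the sketch below reuses the nine values from the central tendency worked example (3, 5, 7, 8, 8, 10, 12, 14, 47) and compares the range and standard deviation with and without the extreme value. The helper name `value_range` is hypothetical, and the choice of `statistics.stdev` (which divides by n − 1) is an assumption; the simple range is used, without the optional +1.

```python
from statistics import stdev

def value_range(values):
    """Range as defined in the table: highest value minus lowest value."""
    return max(values) - min(values)

with_outlier = [3, 5, 7, 8, 8, 10, 12, 14, 47]
without_outlier = [x for x in with_outlier if x != 47]

range_with = value_range(with_outlier)        # 47 - 3 = 44
range_without = value_range(without_outlier)  # 14 - 3 = 11

# Sample standard deviation (n - 1 divisor); an assumption for this sketch.
sd_with = stdev(with_outlier)
sd_without = stdev(without_outlier)

print(f"range: {range_with} with outlier, {range_without} without")
print(f"SD:    {sd_with:.2f} with outlier, {sd_without:.2f} without")
```

Both measures shrink once the outlier is removed. The range depends only on the two most extreme scores, whereas the standard deviation draws on every value in the dataset.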
Interpreting standard deviation:
Worked Example:
Data: 4, 6, 7, 8, 10
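The working for this example is not shown here, so the following is only a sketch of how the standard deviation of these five values could be calculated step by step. Whether to divide by n (population formula) or by n − 1 (sample formula) is an assumption, so both results are computed; use whichever your specification or textbook requires.

```python
from math import sqrt

data = [4, 6, 7, 8, 10]
n = len(data)

# Step 1: calculate the mean.
mean_value = sum(data) / n                              # 35 / 5 = 7.0

# Step 2: square each value's difference from the mean.
squared_diffs = [(x - mean_value) ** 2 for x in data]   # [9.0, 1.0, 0.0, 1.0, 9.0]

# Step 3: add up the squared differences.
sum_of_squares = sum(squared_diffs)                     # 20.0

# Step 4: divide, then take the square root.
population_sd = sqrt(sum_of_squares / n)                # sqrt(4) = 2.0
sample_sd = sqrt(sum_of_squares / (n - 1))              # sqrt(5), about 2.24

print(f"mean = {mean_value}")
print(f"population SD (divide by n)   = {population_sd:.2f}")
print(f"sample SD (divide by n - 1)   = {sample_sd:.2f}")
```

On either version, the scores lie on average about 2 units from the mean of 7.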