The methodology section of your language investigation explains how you collected your data and, crucially, why you collected it that way. It is short — typically only 300–400 words of your ~2,000-word report — but it does heavy lifting. A strong methodology reassures the moderator that your data collection was systematic, ethical and fit for your research question, rather than haphazard. It also sets up the evaluation you will return to in your conclusion, because every methodological choice carries limitations you will later weigh honestly. This lesson covers the main data-collection methods, the distinction between quantitative and qualitative approaches, sampling, the central importance of research ethics, the observer's paradox, and how to write the section up.
Before choosing a method, decide on your overall approach, because it shapes everything else.
Key Definition: Quantitative data — numerical data produced by counting and measuring linguistic features (e.g. frequency of hedges per 100 words). Qualitative data — interpretive data about how language functions in context (e.g. how a particular hedge softens a face-threatening act). Most strong investigations are mixed-methods: they count patterns and interpret what those patterns mean.
Quantitative analysis answers "how much / how often" and lets you present clear comparisons in tables; qualitative analysis answers "how and why," and is where the deeper AO3 marks live. Counting that "female speakers used 7.8 hedges per 100 words to males' 4.2" is quantitative; explaining that those hedges largely served a relational rather than an uncertainty function is qualitative. A mixed approach — count, then interpret — almost always outperforms either alone, because the numbers locate the pattern and the interpretation explains its significance.
Several main methods are available. Your choice should follow from your research question, not the other way round.
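Recording spoken language and transcribing it suits topics involving spoken interaction, child language acquisition and, where the context is spoken, language and power.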
Key Definition: Transcription — converting spoken language into written form using conventions that represent features such as pauses, overlaps, stress and intonation. The level of detail should match your focus: a lexical study needs only a light transcription; a study of turn-taking or prosody needs a far more detailed one.
| Symbol | Meaning |
|---|---|
| (.) | Micropause (under one second) |
| (2.0) | Timed pause, in seconds |
| // or [ | Onset of overlapping speech |
| = | Latching (no audible gap between turns) |
| CAPITALS | Increased volume or emphasis |
| :: | Elongated sound (e.g. "so::") |
| (( )) | Non-verbal or contextual notes |
| (xxx) | Unclear or uncertain transcription |
| ↑ ↓ | Marked rising or falling intonation |
Select only the conventions your investigation needs, and — importantly — apply them consistently, because inconsistent transcription quietly undermines the reliability of every count you later make.
Questionnaires collect attitudinal data — opinions, beliefs and perceptions about language. They suit accent and dialect attitudes, attitudes to language change, and matched-guise studies (listeners rate the same speaker recorded in two varieties).
Design principles:
| Question type | Example | Best for |
|---|---|---|
| Likert scale | "Rate this speaker 1–5 for competence" | Quantitative attitudinal data |
| Multiple choice | "Which accent do you associate with authority?" | Categorical data |
| Open response | "What do you think of regional accents in newsreading?" | Qualitative insight |
| Semantic differential | "Friendly ◁——▷ Unfriendly" | Attitudes on a continuum |
A standard caution to record in your evaluation: respondents may give socially desirable answers — what they think they ought to say about accents rather than what they truly feel — which is exactly why a matched-guise design, eliciting reactions indirectly, can reveal covert attitudes a direct question would miss.
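If you collect Likert ratings in a matched-guise task, the quantitative side can be summarised very simply. The sketch below is one possible way of doing it, not a required procedure: the guise labels, the ratings and the `summarise` function are all invented for illustration.

```python
from statistics import mean

# Hypothetical 1-5 competence ratings from the same listeners for one speaker
# recorded in two guises. These numbers are invented, not real findings.
ratings = {
    "guise_A": [4, 5, 3, 4, 4],
    "guise_B": [3, 3, 2, 4, 3],
}

def summarise(ratings_by_guise):
    """Return the mean rating for each guise, rounded to two decimal places."""
    return {
        guise: round(mean(scores), 2)
        for guise, scores in ratings_by_guise.items()
    }

summary = summarise(ratings)
for guise, average in summary.items():
    print(f"{guise}: mean rating {average}")
```

A difference in means only locates a pattern; the open-response answers are where you would look to interpret why listeners rated the guises differently.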
A corpus is a large, structured collection of texts; corpus methods reveal patterns in frequency and distribution. The approach suits language change over time and comparison across genres.
Key Definition: Corpus — a large, systematically compiled, searchable collection of written or spoken texts. Established examples include the British National Corpus (BNC) and the Corpus of Contemporary American English (COCA).
You may lack access to major research corpora, but you can build a small one yourself — say, 30 newspaper headlines from each of three decades, or 40 product descriptions from one retailer. Free tools such as AntConc generate frequency lists, concordances and collocations; Google Books Ngram Viewer tracks word frequency in published books over time. Always state how you built and sampled the corpus.
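For a very small self-built corpus, a frequency list and a basic concordance can also be produced in a few lines of Python. This is a minimal sketch under stated assumptions rather than a substitute for AntConc: the headlines are invented placeholders, the tokeniser is deliberately crude, and the function names (`tokenise`, `frequency_list`, `concordance`) are hypothetical.

```python
import re
from collections import Counter

# Hypothetical mini-corpus: invented headlines standing in for sampled texts.
headlines = [
    "Storm chaos hits rail network",
    "Rail strike chaos for commuters",
    "Commuters face fresh rail delays",
]

def tokenise(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def frequency_list(texts):
    """Count how often each word occurs across all the texts."""
    counts = Counter()
    for text in texts:
        counts.update(tokenise(text))
    return counts

def concordance(texts, keyword, window=2):
    """Return each occurrence of keyword with `window` words on either side."""
    lines = []
    for text in texts:
        tokens = tokenise(text)
        for i, token in enumerate(tokens):
            if token == keyword:
                left = " ".join(tokens[max(0, i - window):i])
                right = " ".join(tokens[i + 1:i + 1 + window])
                lines.append((left, token, right))
    return lines

freq = frequency_list(headlines)
print(freq.most_common(3))
for left, kw, right in concordance(headlines, "chaos"):
    print(f"{left:>20} | {kw} | {right}")
```

Whatever tool you use, the write-up should still state how the texts were chosen, because the counts are only as trustworthy as the sampling behind them.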
Some investigations analyse pre-existing, publicly available texts — newspaper articles, advertisements, political speeches, published social-media posts, historical documents, song lyrics or scripts. This sidesteps the ethics of recording people, but limits you to written or scripted language. Even with public texts, anonymise private individuals and respect the platform's context.
Whatever the method, you must think about sampling — how you select the specific data you will analyse — and justify it explicitly.
| Concept | What it means for you |
|---|---|
| Representativeness | Your data should be typical of the context you claim to investigate |
| Sample size | Large enough to support a claim, small enough to analyse in depth within ~2,000 words |
| Systematic selection | State clearly how and why you chose these data (e.g. "every fifth reply," "first 40 posts in the thread") |
| Controlling variables | When comparing two sets, hold other variables constant where you can, so differences are attributable to the variable of interest |
Coursework Tip: Far more students drown in data than starve. A small, carefully selected dataset analysed thoroughly beats a large one you can only skim. Depth of analysis, not volume of data, is what the band descriptors reward.
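The two selection rules quoted in the table ("every fifth reply," "first 40 posts in the thread") are easy to apply mechanically, which is part of what makes them defensible. The sketch below assumes your data are already held as an ordered list; the `replies` list and the helper names `every_nth` and `first_k` are hypothetical.

```python
def every_nth(items, n, start=0):
    """Systematic sample: take every n-th item, beginning at index `start`."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return items[start::n]

def first_k(items, k):
    """Take the first k items in their original order."""
    return items[:k]

# Hypothetical thread of 100 replies, represented by placeholder labels.
replies = [f"reply_{i}" for i in range(1, 101)]

fifth_replies = every_nth(replies, 5, start=4)  # reply_5, reply_10, ...
first_forty = first_k(replies, 40)

print(len(fifth_replies), fifth_replies[:3])
print(len(first_forty), first_forty[-1])
```

Because the rule is fixed before you look at the content, you cannot be accused of cherry-picking the examples that suit your hypothesis.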
Ethics are not an optional courtesy; they are integral to your methodology, and your teacher must approve your data-collection plan before you begin. Build the following principles into your design and report them explicitly.
Key Definition: Informed consent — the principle that participants must be fully informed about the nature, purpose and use of the research, and must voluntarily agree to take part, before any data is collected. Covert recording of private interaction breaches this principle and is not acceptable in an A-level investigation.
When you record naturally occurring speech you confront the observer's paradox, identified by William Labov: the aim is to observe how people speak when they are not being observed, yet the very act of observing alters their behaviour. People often shift toward more careful, self-conscious speech when they know a recorder is running.
You cannot eliminate the paradox, but you can mitigate it — and, importantly, you should acknowledge it in your evaluation rather than pretend your data are perfectly natural:
Two further concepts belong in a strong methodology.
| Concept | Definition | How to strengthen it |
|---|---|---|
| Reliability | Consistency — would another researcher using your method get similar results? | Use systematic, documented procedures; apply transcription conventions and coding categories consistently; describe the method clearly enough to be replicated |
| Validity | Whether your method actually measures what it claims to | Ensure the method genuinely fits the research question; use triangulation (combining methods or data sources) where feasible; acknowledge limitations honestly |
If you code your data — for example, sorting questions into "open," "closed" and "leading" — define each category precisely and apply it uniformly, because vague or shifting categories destroy reliability and make your counts meaningless.
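One informal way to test whether your categories are precise enough is to code the same extract twice (or ask a classmate to code it) and see how often the two passes agree. The sketch below computes simple percentage agreement. It is an illustrative assumption rather than a required step, and the codes and the function names `percent_agreement` and `disagreements` are invented.

```python
def percent_agreement(codes_a, codes_b):
    """Percentage of items given the same category in both coding passes."""
    if len(codes_a) != len(codes_b):
        raise ValueError("both passes must code the same number of items")
    if not codes_a:
        raise ValueError("no items to compare")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return 100 * matches / len(codes_a)

def disagreements(codes_a, codes_b):
    """Positions (counting from 1) where the two passes chose different codes."""
    return [i for i, (a, b) in enumerate(zip(codes_a, codes_b), start=1) if a != b]

# Hypothetical codes for five questions, assigned on two separate occasions.
first_pass = ["open", "closed", "leading", "closed", "open"]
second_pass = ["open", "closed", "closed", "closed", "open"]

agreement = percent_agreement(first_pass, second_pass)
print(f"Agreement: {agreement:.0f}%")
print("Check items:", disagreements(first_pass, second_pass))
```

The items that come out differently are the useful part: they show where a category definition needs tightening before you count anything.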
| Topic type | Recommended method(s) | Rationale |
|---|---|---|
| Spoken interaction | Recording + transcription | Captures the detail of talk in real time |
| Child language acquisition | Recording + transcription (+ field notes) | Needs an accurate record of the child's utterances |
| Language attitudes | Questionnaire / matched-guise | Elicits perceptions and (covert) attitudes |
| Language change | Corpus / existing texts | Needs comparable data from different periods |
| Written genre analysis | Existing texts | The data already exist in written form |
| Language and technology | Existing texts (captured posts/threads) | Digital language is already written |
| Language and power | Recording, or existing texts | Depends on whether the context is spoken or written |
Because so many investigations rest on transcribed speech, it repays a closer look at how to transcribe well — transcription is not a neutral, clerical step but an analytical one, and the choices you make there shape every claim that follows. Three principles matter most.
First, match the grain of your transcription to your research question. Transcription exists on a spectrum from light to fine. A light (orthographic-plus) transcription captures the words, speaker turns, and a few salient features such as major pauses, overlaps and emphasis, and is ample for a study of lexis. A fine transcription adds detailed timings, micropauses, precise overlap and latching, intonation contours and, in extreme cases, phonetic detail in the IPA; you need it only if turn-taking, prosody or pronunciation is your actual object of study. The error to avoid is mismatch in either direction: a lexical study buried under needless phonetic notation wastes effort and obscures the analysis, while a prosody study transcribed only orthographically simply lacks the data to make its case.
Second, transcribe consistently or your counts collapse. If you mark a half-second gap as a micropause in one place and ignore it in another, any later claim about "frequency of pauses before disagreement" is built on sand. Fix your conventions before you start, keep a key, and apply them mechanically. This is the practical face of reliability: another researcher should be able to take your recording and your key and produce a closely similar transcript.
Third, acknowledge that transcription involves interpretation. Deciding whether a pause is "(.)" or "(0.5)", whether a stretch of speech is emphasised or merely louder, or where one turn ends and another begins, all involve judgement. The strongest investigations concede this openly in evaluation, noting, for instance, that borderline cases were resolved by a stated rule and that a second transcriber might have decided some of them differently. That candour is not weakness; it is exactly the evaluative maturity the descriptors reward.
Coursework Tip: Transcribe a short trial extract early, then show it to your teacher. It is far better to discover that your convention set is too heavy, too light, or internally inconsistent on a two-minute sample than to find it out after transcribing fifteen minutes of speech.
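Once your conventions are fixed, consistent notation also makes counting straightforward. The sketch below tallies micropauses written as "(.)" and timed pauses written as "(2.0)" in a typed transcript. It assumes you have used exactly those two notations throughout; the sample extract and the name `pause_summary` are invented for illustration.

```python
import re

# Patterns for the two pause conventions: "(.)" and "(seconds)", e.g. "(2.0)".
MICROPAUSE = re.compile(r"\(\.\)")
TIMED_PAUSE = re.compile(r"\((\d+(?:\.\d+)?)\)")

def pause_summary(transcript):
    """Count micropauses and timed pauses, and total the timed seconds."""
    micro = len(MICROPAUSE.findall(transcript))
    timed = [float(seconds) for seconds in TIMED_PAUSE.findall(transcript)]
    return {
        "micropauses": micro,
        "timed_pauses": len(timed),
        "timed_total_seconds": sum(timed),
    }

# Hypothetical two-turn extract, invented for this example.
sample = "A: well (.) I thought (2.0) maybe not\nB: no (.) I mean (0.5) fine"

result = pause_summary(sample)
print(result)
```

A count like this is only meaningful if every pause was transcribed by the same rule, which is why the conventions need settling before you begin.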
Most mixed-methods investigations involve some counting, and counting done carelessly produces conclusions that a moderator will distrust. A few disciplines keep your quantitative work honest.
Normalise before you compare. Raw counts mislead whenever the things being compared are of different sizes. If one speaker produces 600 words and another 300, a raw tally of "12 hedges versus 8" tells you little; expressed as a rate (hedges per 100 words), the picture reverses: 2.0 versus 2.67, so the speaker with fewer hedges in total actually hedges more often. Whenever your two data sets differ in length, convert counts to rates per fixed unit (per 100 words, per minute, per turn) and say so.
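The conversion itself is a single division, shown here as a short Python sketch using the figures from the example above. The function name `rate_per_100` and the speaker labels are hypothetical.

```python
def rate_per_100(count, total_words):
    """Convert a raw count into a rate per 100 words, to two decimal places."""
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    return round(100 * count / total_words, 2)

# Figures from the worked example: raw hedge counts and word totals.
speakers = {
    "Speaker 1": {"hedges": 12, "words": 600},
    "Speaker 2": {"hedges": 8, "words": 300},
}

rates = {
    name: rate_per_100(data["hedges"], data["words"])
    for name, data in speakers.items()
}

for name, rate in rates.items():
    print(f"{name}: {rate} hedges per 100 words")
```

The same function works for any fixed unit: swap the word total for minutes or turns and state the unit in your table heading.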