Named Entity Recognition (NER) is the task of identifying and classifying named entities — specific real-world objects such as people, organisations, locations, dates, and more — within text. NER is a fundamental building block for information extraction, question answering, and knowledge graph construction.
| Entity Type | Tag | Examples |
|---|---|---|
| Person | PER / PERSON | "Albert Einstein", "Marie Curie" |
| Organisation | ORG | "Google", "United Nations", "Oxford University" |
| Location | LOC / GPE | "London", "Mount Everest", "France" |
| Date / Time | DATE / TIME | "14 March 2023", "next Monday" |
| Money | MONEY | "£50", "$1.2 million" |
| Percentage | PERCENT | "25%", "three quarters" |
| Product | PRODUCT | "iPhone", "Windows 11" |
| Event | EVENT | "World Cup", "COP26" |
Rule-based NER uses hand-crafted patterns and gazetteers (lists of known entities) to match entity mentions directly in text.
| Advantage | Disadvantage |
|---|---|
| No training data needed | Cannot generalise to unseen entities |
| High precision for known patterns | Extremely labour-intensive |
| Fully explainable | Breaks with new domains |
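A minimal sketch of the rule-based approach, combining a gazetteer lookup with a regular expression for dates (the entity lists and pattern here are illustrative, not from any particular system):

```python
import re

# Hypothetical gazetteer: a hand-maintained list of known entities
GAZETTEER = {
    "Albert Einstein": "PERSON",
    "Google": "ORG",
    "London": "LOC",
}

# A hand-crafted pattern for dates like "14 March 2023"
DATE_PATTERN = re.compile(
    r"\b\d{1,2} (January|February|March|April|May|June|"
    r"July|August|September|October|November|December) \d{4}\b"
)

def rule_based_ner(text):
    """Return (span, label) pairs via gazetteer lookup and regex matching."""
    entities = []
    for name, label in GAZETTEER.items():
        if name in text:  # exact string match: cannot generalise to unseen names
            entities.append((name, label))
    for match in DATE_PATTERN.finditer(text):
        entities.append((match.group(), "DATE"))
    return entities

print(rule_based_ner("Albert Einstein visited London on 14 March 2023."))
```

Note how the exact-match lookup illustrates the main disadvantage in the table above: a name missing from the gazetteer is simply invisible to the system.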
Statistical NER treats the task as sequence labelling: a model learns from annotated corpora to assign an entity tag to each token.

| Algorithm | Description |
|---|---|
| Hidden Markov Model (HMM) | Generative sequence model |
| Conditional Random Fields (CRF) | Discriminative sequence model — considers neighbouring labels |
| Maximum Entropy Markov Model | Discriminative model with feature engineering |
Features typically include the word itself, its capitalisation and shape, prefixes and suffixes, part-of-speech tags, surrounding words, and gazetteer membership.
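Such features are usually extracted per token as a dictionary, which is what a CRF implementation would consume. A small sketch (feature names are illustrative):

```python
def token_features(tokens, i):
    """Hand-engineered features for token i, of the kind fed to a CRF."""
    word = tokens[i]
    return {
        "word.lower": word.lower(),
        "word.istitle": word.istitle(),  # capitalisation is a strong NER cue
        "word.isdigit": word.isdigit(),
        "prefix3": word[:3],
        "suffix3": word[-3:],
        # context features let the model use neighbouring words
        "prev_word": tokens[i - 1].lower() if i > 0 else "<BOS>",
        "next_word": tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>",
    }

tokens = "Marie Curie worked in Paris".split()
print(token_features(tokens, 1))  # features for "Curie"
```

A CRF then scores entire tag sequences over these per-token features, which is how it accounts for neighbouring labels.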
Modern NER systems use neural networks — particularly BiLSTM-CRF and Transformer-based models.
| Model | Description |
|---|---|
| BiLSTM-CRF | Bi-directional LSTM + CRF layer for sequence labelling |
| BERT + Token Classification | Pre-trained transformer fine-tuned for NER |
| spaCy NER | Efficient CNN/Transformer-based NER built into spaCy |
Using spaCy's built-in NER:

```python
import spacy

# Load the small English pipeline
# (install first with: python -m spacy download en_core_web_sm)
nlp = spacy.load("en_core_web_sm")

text = "Apple was founded by Steve Jobs in Cupertino, California on 1 April 1976."
doc = nlp(text)

# doc.ents holds the detected entity spans
for ent in doc.ents:
    print(f"{ent.text:20s} {ent.label_:10s} {spacy.explain(ent.label_)}")
```
Output:

```
Apple                ORG        Companies, agencies, institutions, etc.
Steve Jobs           PERSON     People, including fictional
Cupertino            GPE        Countries, cities, states
California           GPE        Countries, cities, states
1 April 1976         DATE       Absolute or relative dates or periods
```
| Label | Description | Example |
|---|---|---|
| PERSON | People | "Steve Jobs" |
| ORG | Organisations | "Apple" |
| GPE | Geopolitical entities | "California" |
| LOC | Non-GPE locations | "Mount Everest" |
| DATE | Dates | "1 April 1976" |
| TIME | Times | "3:30 pm" |
| MONEY | Monetary values | "$500" |
| CARDINAL | Numerals | "three" |
| ORDINAL | Ordinal numbers | "first" |
| NORP | Nationalities, religions, political groups | "British" |
The same task with a Transformer model via the Hugging Face `pipeline`:

```python
from transformers import pipeline

# BERT fine-tuned on CoNLL-2003;
# aggregation_strategy="simple" merges sub-word pieces into whole entities
ner_pipeline = pipeline(
    "ner",
    model="dbmdz/bert-large-cased-finetuned-conll03-english",
    aggregation_strategy="simple",
)

text = "Albert Einstein was born in Ulm, Germany and later worked at Princeton University."
entities = ner_pipeline(text)

for ent in entities:
    print(f"{ent['word']:20s} {ent['entity_group']:10s} {ent['score']:.4f}")
```
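Each prediction carries a confidence score, so a common post-processing step is to drop low-confidence entities. A sketch of that filtering, run on a hard-coded sample in the pipeline's output format (the words and scores below are illustrative, not actual model output):

```python
# Illustrative sample in the transformers "ner" pipeline's aggregated format
sample_entities = [
    {"word": "Albert Einstein", "entity_group": "PER", "score": 0.9991},
    {"word": "Ulm", "entity_group": "LOC", "score": 0.9978},
    {"word": "Germany", "entity_group": "LOC", "score": 0.9995},
    {"word": "Princeton University", "entity_group": "ORG", "score": 0.6200},
]

def filter_entities(entities, threshold=0.9):
    """Keep only predictions the model is confident about."""
    return [e for e in entities if e["score"] >= threshold]

for e in filter_entities(sample_entities):
    print(f"{e['word']:20s} {e['entity_group']}")
```

The right threshold depends on whether your application prefers precision (raise it) or recall (lower it).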