What is Data Visualisation?

Data visualisation is the graphical representation of information and data. By using visual elements like charts, graphs, and maps, data visualisation tools provide an accessible way to see and understand trends, outliers, and patterns in data. It sits at the intersection of data science, graphic design, and communication — turning raw numbers into stories that drive decisions.

A Brief History

1786 — William Playfair invents the bar chart and the line graph in The Commercial and Political Atlas
1801 — Playfair introduces the pie chart in Statistical Breviary
1854 — John Snow maps cholera cases in London, demonstrating the power of spatial data visualisation
1858 — Florence Nightingale creates the polar area diagram (coxcomb chart) to advocate for military hospital reform
1869 — Charles Joseph Minard publishes his flow map of Napoleon's Russian campaign, which Edward Tufte later described as possibly the best statistical graphic ever drawn
1967 — Jacques Bertin publishes Semiology of Graphics, establishing a theoretical foundation for visual encoding
1977 — John Tukey publishes Exploratory Data Analysis, formalising EDA as a discipline
1983 — Edward Tufte publishes The Visual Display of Quantitative Information, a landmark work on chart design
2006 — Hans Rosling's Gapminder TED talk brings animated scatter plots to a global audience
2010s — Interactive visualisation libraries (D3.js, Plotly, Bokeh) democratise web-based charting
Today — Data visualisation is a core skill in data science, journalism, business intelligence, and public health

Why Visualise Data?

1. The Human Visual System is Powerful

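The visual system picks up certain features, such as position, length, and colour, preattentively: within a fraction of a second and before conscious attention is engaged. A well-designed chart exploits this, communicating in seconds what a table of numbers takes minutes to parse.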

2. Discover Patterns and Relationships

Visualisation makes it easy to spot trends, clusters, correlations, and anomalies that might be invisible in raw data.

3. Communicate Findings Effectively

A clear visualisation bridges the gap between technical analysis and decision-makers who may not be fluent in statistics or code.

4. Facilitate Exploration

During exploratory data analysis (EDA), quick plots help you form hypotheses, check distributions, and guide your next steps.

The Data Visualisation Pipeline

Stage	Description
Acquire	Obtain data from databases, APIs, CSV files, or web scraping
Parse	Clean and structure data so it is ready for plotting
Filter	Focus on the subset of data relevant to your question
Mine	Apply statistics or aggregations to extract key metrics
Represent	Choose an appropriate visual encoding (chart type, colour, shape)
Refine	Improve aesthetics, remove clutter, add labels and annotations
Interact	Add tooltips, zoom, filtering, or animation for exploration
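
The first six stages can be sketched end to end with pandas and Matplotlib. This is a minimal illustration rather than a prescribed workflow: the inline CSV, the column names (`region`, `month`, `sales`), and the numbers in it are made up for the example, and the Interact stage is left out because it needs an interactive library.

```python
import io

import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Acquire: an inline CSV stands in for a database, API, or file (made-up data)
raw = io.StringIO(
    "region,month,sales\n"
    "North,2024-01,120\n"
    "North,2024-02,135\n"
    "South,2024-01,90\n"
    "South,2024-02,\n"
    "East,2024-01,60\n"
    "East,2024-02,75\n"
)
df = pd.read_csv(raw)

# Parse: convert types and drop the row with a missing sales value
df["month"] = pd.to_datetime(df["month"], format="%Y-%m")
df = df.dropna(subset=["sales"])

# Filter: keep only the regions relevant to the question
subset = df[df["region"].isin(["North", "South"])]

# Mine: aggregate to one metric per region
totals = subset.groupby("region")["sales"].sum()

# Represent: bar length encodes total sales
fig, ax = plt.subplots(figsize=(4, 3))
ax.bar(totals.index, totals.values, color="steelblue")

# Refine: add labels and remove clutter
ax.set_title("Total sales by region")
ax.set_ylabel("Sales (units)")
for side in ("top", "right"):
    ax.spines[side].set_visible(False)
fig.tight_layout()
```

In a notebook, finish with `plt.show()`; in a script, `fig.savefig("sales.png")` writes the chart to disk.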

Types of Data

Understanding what type of data you have determines which charts are appropriate.

Quantitative (Numerical) Data

Data that can be measured and expressed as numbers.

Sub-type	Examples	Suitable Charts
Continuous	Temperature, height, revenue	Histograms, line plots, scatter plots
Discrete	Number of orders, count of students	Bar charts, dot plots
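
The distinction shows up in how values are counted before plotting. A rough sketch with simulated and made-up data, assuming NumPy, pandas, and Matplotlib are installed: continuous values are grouped into bins for a histogram, while discrete values are tallied one distinct value at a time for a bar chart.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; not needed in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
heights_cm = rng.normal(loc=170, scale=8, size=200)   # continuous (simulated)
orders = pd.Series([1, 2, 2, 3, 1, 2, 4, 2, 1, 3])    # discrete (made up)

fig, (ax_cont, ax_disc) = plt.subplots(1, 2, figsize=(8, 3))

# Continuous: any value in a range can occur, so group values into bins
counts, edges, _ = ax_cont.hist(heights_cm, bins=10, edgecolor="white")
ax_cont.set_xlabel("Height (cm)")
ax_cont.set_ylabel("Frequency")

# Discrete: only whole numbers occur, so count each distinct value
order_counts = orders.value_counts().sort_index()
ax_disc.bar(order_counts.index, order_counts.values)
ax_disc.set_xticks(list(order_counts.index))
ax_disc.set_xlabel("Orders per customer")
ax_disc.set_ylabel("Customers")

fig.tight_layout()
```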

Categorical (Qualitative) Data

Data that represents groups or labels.

Sub-type	Examples	Suitable Charts
Nominal	Country, colour, product category	Bar charts, pie charts, treemaps
Ordinal	Education level, satisfaction rating	Bar charts (ordered), heatmaps
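
Whether a category is nominal or ordinal matters in code as well, because it decides the order in which bars appear. The sketch below uses made-up survey responses and pandas' `CategoricalDtype`; the level names are illustrative.

```python
import pandas as pd

responses = pd.Series(
    ["High", "Low", "Medium", "Low", "High", "Medium", "Medium"]
)

# Treated as plain labels, the categories sort alphabetically,
# which scrambles their natural order
nominal_counts = responses.value_counts().sort_index()

# Declaring an ordered categorical records Low < Medium < High,
# so sorting (and any bar chart built from it) follows that order
levels = pd.CategoricalDtype(["Low", "Medium", "High"], ordered=True)
ordinal_counts = responses.astype(levels).value_counts().sort_index()

print(list(nominal_counts.index))   # ['High', 'Low', 'Medium']
print(list(ordinal_counts.index))   # ['Low', 'Medium', 'High']
```

Passing `ordinal_counts` to a bar chart then yields bars in rating order rather than alphabetical order.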

Temporal Data

Data that involves time — dates, timestamps, durations. Best shown with line plots, area charts, or timeline visualisations.
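
As a small sketch of handling temporal data, assuming pandas and Matplotlib: a daily series is given a `DatetimeIndex`, averaged per month with `resample`, and both versions are drawn as line plots. The values are synthetic (a simple ramp), chosen so the monthly means are easy to verify by hand.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; not needed in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# 90 days of synthetic readings starting on 1 January 2024
dates = pd.date_range("2024-01-01", periods=90, freq="D")
daily = pd.Series(np.arange(90, dtype=float), index=dates)

# Aggregate to monthly means; "MS" labels each bin by its month start
monthly = daily.resample("MS").mean()

fig, ax = plt.subplots(figsize=(6, 3))
ax.plot(daily.index, daily.values, linewidth=1, label="Daily")
ax.plot(monthly.index, monthly.values, marker="o", label="Monthly mean")
ax.set_xlabel("Date")
ax.set_ylabel("Reading")
ax.legend()
fig.autofmt_xdate()  # tilt the date labels so they do not overlap
```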

Geospatial Data

Data tied to locations — latitude/longitude, postcodes, country codes. Best shown with maps, choropleths, and bubble maps.

Key Terminology

Term	Definition
Mark	The geometric element representing data — a point, line, bar, or area
Channel	The visual property used to encode data — position, length, colour, size, angle
Encoding	The mapping of a data variable to a visual channel
Scale	A function mapping data values to visual values (e.g., linear, logarithmic)
Legend	A key explaining what colours, sizes, or shapes represent
Annotation	Text or markers added to highlight specific data points
Aspect ratio	The width-to-height ratio of a chart, which affects how trends appear
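
These terms map directly onto plotting code. The following Matplotlib sketch, using made-up figures for four hypothetical countries, labels each call with the concept it illustrates.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; not needed in a notebook
import matplotlib.pyplot as plt

# Made-up data: (income per person, life expectancy, population in millions)
countries = {
    "A": (500, 55, 20),
    "B": (2_000, 65, 150),
    "C": (10_000, 74, 60),
    "D": (40_000, 81, 10),
}

# Aspect ratio: a 6 x 4 inch figure gives a 3:2 width-to-height ratio
fig, ax = plt.subplots(figsize=(6, 4))

for name, (income, life_exp, pop_m) in countries.items():
    # Mark: a point. Channels: x position, y position, size, colour.
    # Encoding: income -> x, life expectancy -> y, population -> size,
    # country -> colour (each scatter call takes the next cycle colour).
    ax.scatter(income, life_exp, s=pop_m * 5, label=f"Country {name}")

# Scale: a logarithmic mapping from income values to x positions
ax.set_xscale("log")
ax.set_xlabel("Income per person (log scale)")
ax.set_ylabel("Life expectancy (years)")

# Legend: explains what each colour represents
ax.legend(title="Country")

# Annotation: highlights one specific data point
ax.annotate(
    "Largest population",
    xy=(2_000, 65),
    xytext=(4_000, 58),
    arrowprops={"arrowstyle": "->"},
)
```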

Static vs Interactive Visualisation

Feature	Static	Interactive
Format	PNG, PDF, print	HTML, dashboard, web app
Exploration	Fixed view	Zoom, pan, filter, hover
Tools	Matplotlib, Seaborn	Plotly, Bokeh, D3.js, Altair
Best for	Reports, papers, presentations	Dashboards, exploration, storytelling
Audience	Readers of a final report	Analysts who want to drill down
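
For the static column, the formats in the table correspond to calls to `savefig`. A minimal sketch, assuming Matplotlib and made-up data: the same figure is rendered to PNG (raster) and PDF (vector) in memory. Passing a filename instead of a buffer writes the file to disk. An interactive equivalent would need one of the libraries in the Interactive column and is not shown here.

```python
import io

import matplotlib
matplotlib.use("Agg")  # headless backend; not needed in a notebook
import matplotlib.pyplot as plt

years = [2019, 2020, 2021, 2022]
values = [3.1, 2.4, 3.8, 4.5]  # made-up numbers

fig, ax = plt.subplots(figsize=(5, 3))
ax.plot(years, values, marker="o")
ax.set_xticks(years)
ax.set_xlabel("Year")
ax.set_ylabel("Value (made-up units)")
fig.tight_layout()

# Raster output: resolution is fixed by dpi, suited to slides and web pages
png_buffer = io.BytesIO()
fig.savefig(png_buffer, format="png", dpi=150)

# Vector output: scales without pixelation, suited to print and papers
pdf_buffer = io.BytesIO()
fig.savefig(pdf_buffer, format="pdf")

png_bytes = png_buffer.getvalue()
pdf_bytes = pdf_buffer.getvalue()
```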

The Python Visualisation Ecosystem

Library	Type	Strengths
`Matplotlib`	Static	Fine-grained control, publication-quality figures
`Seaborn`	Static	Statistical plots with minimal code, built on Matplotlib
`Plotly`	Interactive	Rich interactivity, easy to embed in web apps
`Bokeh`	Interactive	Streaming and real-time data, server-backed dashboards
`Altair`	Declarative	Concise grammar-of-graphics API, Vega-Lite backend
`Pandas`	Static	Quick plots directly from DataFrames
`Folium`	Maps	Leaflet-based interactive maps
`Streamlit`	Dashboards	Turn Python scripts into shareable web apps
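
The libraries in this table are layered rather than isolated. As a brief sketch with made-up data, and assuming pandas is using its default Matplotlib plotting backend: `DataFrame.plot` returns a Matplotlib `Axes`, so a quick pandas plot can be refined with ordinary Matplotlib calls.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; not needed in a notebook
import pandas as pd
from matplotlib.axes import Axes

df = pd.DataFrame(
    {
        "quarter": ["Q1", "Q2", "Q3", "Q4"],
        "revenue": [12.0, 15.5, 14.2, 18.9],  # made-up numbers
    }
)

# One line of pandas produces the chart...
ax = df.plot(x="quarter", y="revenue", kind="bar", legend=False)

# ...and the returned object is a regular Matplotlib Axes
print(isinstance(ax, Axes))  # True

# so the usual Matplotlib methods are available for refinement
ax.set_title("Revenue by quarter")
ax.set_ylabel("Revenue (made-up units)")
ax.figure.tight_layout()
```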

Anscombe's Quartet — Why Visualisation Matters

In 1973, statistician Francis Anscombe constructed four datasets that have nearly identical summary statistics (mean, variance, correlation, regression line) yet look completely different when plotted. This is a powerful demonstration of why you should always visualise your data before relying solely on summary statistics.

import seaborn as sns
import matplotlib.pyplot as plt

# Load Anscombe's quartet (fetched from the seaborn-data repository on first use, then cached locally)
df = sns.load_dataset("anscombe")

# Plot all four datasets
g = sns.lmplot(x="x", y="y", col="dataset",
               data=df, col_wrap=2,
               height=4, aspect=1)
g.set_titles("Dataset {col_name}")
plt.suptitle("Anscombe's Quartet", y=1.02)
plt.show()

Tip: If two datasets have the same mean, standard deviation, and correlation, they are NOT necessarily the same. Always plot before you model.
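
The tip can be checked numerically without plotting anything. The sketch below hard-codes the quartet's published values, so it runs offline, and uses NumPy to compute the mean, sample variance, correlation, and least-squares line for each dataset. The variable names are illustrative.

```python
import numpy as np

x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]  # shared by datasets I to III
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96,
                 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10,
                  6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84,
                   6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04,
            5.25, 12.50, 5.56, 7.91, 6.89]),
}

summary = {}
for name, (x, y) in quartet.items():
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, deg=1)  # least-squares line
    summary[name] = {
        "mean_y": y.mean(),
        "var_y": y.var(ddof=1),                 # sample variance
        "corr": np.corrcoef(x, y)[0, 1],        # Pearson correlation
        "slope": slope,
        "intercept": intercept,
    }
    print(name, {k: round(float(v), 2) for k, v in summary[name].items()})
```

All four rows come out nearly identical: a mean of about 7.5, a correlation of about 0.82, and a fitted line close to y = 3 + 0.5x, even though the plots look nothing alike.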

Summary

Data visualisation transforms raw data into visual stories. It leverages the power of human perception to reveal patterns, communicate findings, and guide decisions. The field has evolved from hand-drawn charts in the 18th century to interactive web-based dashboards today. In this course, you will learn the principles of effective visual design, master the major Python visualisation libraries — Matplotlib, Seaborn, Plotly, and Pandas — and develop the skills to build dashboards, geospatial maps, and data-driven narratives.
