What is Data Visualisation

Data visualisation is the graphical representation of information and data. By using visual elements like charts, graphs, and maps, data visualisation tools provide an accessible way to see and understand trends, outliers, and patterns in data. It sits at the intersection of data science, graphic design, and communication — turning raw numbers into stories that drive decisions.


A Brief History

  • 1786 — William Playfair invents the bar chart and the line graph in The Commercial and Political Atlas
  • 1801 — Playfair introduces the pie chart in Statistical Breviary
  • 1854 — John Snow maps cholera cases in London, demonstrating the power of spatial data visualisation
  • 1858 — Florence Nightingale creates the polar area diagram (coxcomb chart) to advocate for military hospital reform
  • 1869 — Charles Joseph Minard publishes his famous flow map of Napoleon's Russian campaign — often called the greatest statistical graphic ever drawn
  • 1967 — Jacques Bertin publishes Semiology of Graphics, establishing a theoretical foundation for visual encoding
  • 1977 — John Tukey publishes Exploratory Data Analysis, formalising EDA as a discipline
  • 1983 — Edward Tufte publishes The Visual Display of Quantitative Information, a landmark work on chart design
  • 2006 — Hans Rosling's Gapminder TED talk brings animated scatter plots to a global audience
  • 2010s — Interactive visualisation libraries (D3.js, Plotly, Bokeh) democratise web-based charting
  • Today — Data visualisation is a core skill in data science, journalism, business intelligence, and public health

Why Visualise Data?

1. The Human Visual System is Powerful

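The human visual system picks up differences in position, length, and colour almost instantly, before conscious attention is needed (this is often called preattentive processing). A well-designed chart can communicate in seconds what a table of numbers takes minutes to parse.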

2. Discover Patterns and Relationships

Visualisation makes it easy to spot trends, clusters, correlations, and anomalies that might be invisible in raw data.

3. Communicate Findings Effectively

A clear visualisation bridges the gap between technical analysis and decision-makers who may not be fluent in statistics or code.

4. Facilitate Exploration

During exploratory data analysis (EDA), quick plots help you form hypotheses, check distributions, and guide your next steps.


The Data Visualisation Pipeline

| Stage | Description |
|---|---|
| Acquire | Obtain data from databases, APIs, CSV files, or web scraping |
| Parse | Clean and structure data so it is ready for plotting |
| Filter | Focus on the subset of data relevant to your question |
| Mine | Apply statistics or aggregations to extract key metrics |
| Represent | Choose an appropriate visual encoding (chart type, colour, shape) |
| Refine | Improve aesthetics, remove clutter, add labels and annotations |
| Interact | Add tooltips, zoom, filtering, or animation for exploration |
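
The stages above can be sketched end to end in a few lines. This is a minimal illustration, assuming pandas and Matplotlib are installed. The inline CSV, the column names (`region`, `month`, `revenue`), and the function name `build_revenue_chart` are all hypothetical and invented for this example. The Interact stage is left out because a static figure offers no interactivity.

```python
import io

import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical raw data standing in for a file, database, or API response.
RAW_CSV = """region,month,revenue
North,2024-01,1200
North,2024-02,1350
South,2024-01,900
South,2024-02,
East,2024-01,700
East,2024-02,820
"""


def build_revenue_chart(raw_csv=RAW_CSV, regions=("North", "South")):
    # Acquire: read the raw text into a DataFrame.
    df = pd.read_csv(io.StringIO(raw_csv))

    # Parse: drop rows whose revenue value is missing.
    df = df.dropna(subset=["revenue"])

    # Filter: keep only the regions relevant to the question.
    df = df[df["region"].isin(list(regions))]

    # Mine: aggregate revenue per region, largest first.
    totals = df.groupby("region")["revenue"].sum().sort_values(ascending=False)

    # Represent: bar length encodes total revenue.
    fig, ax = plt.subplots(figsize=(5, 3))
    ax.bar(totals.index, totals.values)

    # Refine: label the axes and remove non-essential borders.
    ax.set_xlabel("Region")
    ax.set_ylabel("Total revenue")
    ax.set_title("Revenue by region")
    for side in ("top", "right"):
        ax.spines[side].set_visible(False)
    fig.tight_layout()
    return totals, fig
```

Calling `build_revenue_chart()` returns the aggregated totals together with the figure, which can then be shown with `plt.show()` or written out with `fig.savefig(...)`.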

Types of Data

Understanding what type of data you have determines which charts are appropriate.

Quantitative (Numerical) Data

Data that can be measured and expressed as numbers.

| Sub-type | Examples | Suitable Charts |
|---|---|---|
| Continuous | Temperature, height, revenue | Histograms, line plots, scatter plots |
| Discrete | Number of orders, count of students | Bar charts, dot plots |

Categorical (Qualitative) Data

Data that represents groups or labels.

| Sub-type | Examples | Suitable Charts |
|---|---|---|
| Nominal | Country, colour, product category | Bar charts, pie charts, treemaps |
| Ordinal | Education level, satisfaction rating | Bar charts (ordered), heatmaps |

Temporal Data

Data that involves time — dates, timestamps, durations. Best shown with line plots, area charts, or timeline visualisations.

Geospatial Data

Data tied to locations — latitude/longitude, postcodes, country codes. Best shown with maps, choropleths, and bubble maps.
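
As a rough illustration of how data type guides chart choice, the sketch below draws one chart for three of the types described above: a histogram for a continuous variable, an ordered bar chart for an ordinal one, and a line plot for a temporal one. It assumes Matplotlib is installed, and every value in it is made up for the example. Geospatial data is left out because it needs a dedicated mapping library.

```python
import datetime as dt

import matplotlib.pyplot as plt

# Made-up sample values, for illustration only.
heights_cm = [158, 162, 165, 167, 170, 171, 173, 175, 178, 184]
ratings = ["Good", "Poor", "Good", "Excellent", "Fair", "Good", "Fair"]
rating_order = ["Poor", "Fair", "Good", "Excellent"]  # ordinal: order matters
days = [dt.date(2024, 1, d) for d in range(1, 8)]
daily_orders = [12, 15, 11, 18, 21, 19, 25]


def plot_by_data_type():
    fig, (ax_hist, ax_bar, ax_line) = plt.subplots(1, 3, figsize=(12, 3.5))

    # Continuous: a histogram bins the values to show their distribution.
    ax_hist.hist(heights_cm, bins=5)
    ax_hist.set(title="Continuous: height", xlabel="Height (cm)", ylabel="Count")

    # Ordinal: count each category, then draw the bars in their natural order.
    counts = [ratings.count(level) for level in rating_order]
    ax_bar.bar(rating_order, counts)
    ax_bar.set(title="Ordinal: rating", xlabel="Rating", ylabel="Count")

    # Temporal: a line connects observations in time order.
    ax_line.plot(days, daily_orders, marker="o")
    ax_line.set(title="Temporal: daily orders", xlabel="Date", ylabel="Orders")
    ax_line.tick_params(axis="x", labelrotation=45)  # keep date labels readable

    fig.tight_layout()
    return fig, counts
```

Sorting the rating bars by `rating_order` rather than alphabetically or by frequency is the design choice that matters here: it preserves the ranking that makes the variable ordinal.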


Key Terminology

| Term | Definition |
|---|---|
| Mark | The geometric element representing data: a point, line, bar, or area |
| Channel | The visual property used to encode data: position, length, colour, size, angle |
| Encoding | The mapping of a data variable to a visual channel |
| Scale | A function mapping data values to visual values (e.g., linear, logarithmic) |
| Legend | A key explaining what colours, sizes, or shapes represent |
| Annotation | Text or markers added to highlight specific data points |
| Aspect ratio | The width-to-height ratio of a chart, which affects how trends appear |
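
To make these terms concrete, here is a small sketch that assumes Matplotlib is installed. The records, the group names, and the function name `encode_records` are invented for illustration. Each point is a mark, and four data variables are encoded through four channels (horizontal position, vertical position, size, and colour), with a logarithmic scale on the x-axis, a legend, and one annotation.

```python
import matplotlib.pyplot as plt

# Invented records: (label, income, life expectancy, population in millions, group)
records = [
    ("A", 1_200, 58, 30, "Low income"),
    ("B", 4_500, 67, 120, "Middle income"),
    ("C", 15_000, 74, 45, "Middle income"),
    ("D", 48_000, 82, 10, "High income"),
]
group_colours = {
    "Low income": "tab:red",
    "Middle income": "tab:orange",
    "High income": "tab:green",
}


def encode_records():
    # Aspect ratio: figsize sets a 6:4 width-to-height ratio.
    fig, ax = plt.subplots(figsize=(6, 4))

    # Mark: each record is drawn as a point.
    for group, colour in group_colours.items():
        rows = [r for r in records if r[4] == group]
        ax.scatter(
            [r[1] for r in rows],        # channel: x position encodes income
            [r[2] for r in rows],        # channel: y position encodes life expectancy
            s=[r[3] * 5 for r in rows],  # channel: size encodes population
            color=colour,                # channel: colour encodes group
            label=group,
            alpha=0.7,
        )

    # Scale: map income to horizontal position logarithmically.
    ax.set_xscale("log")
    ax.set_xlabel("Income (log scale)")
    ax.set_ylabel("Life expectancy (years)")

    # Legend: explain what each colour represents.
    ax.legend(title="Group")

    # Annotation: draw attention to one specific data point.
    ax.annotate(
        "Highest income",
        xy=(48_000, 82),
        xytext=(6_000, 80),
        arrowprops={"arrowstyle": "->"},
    )
    return fig, ax
```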

Static vs Interactive Visualisation

| Feature | Static | Interactive |
|---|---|---|
| Format | PNG, PDF, print | HTML, dashboard, web app |
| Exploration | Fixed view | Zoom, pan, filter, hover |
| Tools | Matplotlib, Seaborn | Plotly, Bokeh, D3.js, Altair |
| Best for | Reports, papers, presentations | Dashboards, exploration, storytelling |
| Audience | Readers of a final report | Analysts who want to drill down |

The Python Visualisation Ecosystem

| Library | Type | Strengths |
|---|---|---|
| Matplotlib | Static | Fine-grained control, publication-quality figures |
| Seaborn | Static | Statistical plots with minimal code, built on Matplotlib |
| Plotly | Interactive | Rich interactivity, easy to embed in web apps |
| Bokeh | Interactive | Streaming and real-time data, server-backed dashboards |
| Altair | Declarative | Concise grammar-of-graphics API, Vega-Lite backend |
| Pandas | Static | Quick plots directly from DataFrames |
| Folium | Maps | Leaflet-based interactive maps |
| Streamlit | Dashboards | Turn Python scripts into shareable web apps |
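
To give a feel for the difference between "fine-grained control" and "quick plots directly from DataFrames", the sketch below draws the same bar chart twice, once with Matplotlib calls and once through pandas. It assumes both libraries are installed; the `orders` data and the two function names are hypothetical. With its default backend, `DataFrame.plot` produces a Matplotlib Axes, so the result can still be customised afterwards.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical order counts per product.
orders = pd.DataFrame(
    {"product": ["Tea", "Coffee", "Juice"], "orders": [120, 340, 90]}
)


def plot_with_matplotlib(df):
    # Explicit figure and axes: more lines, but every element is under control.
    fig, ax = plt.subplots()
    ax.bar(df["product"], df["orders"])
    ax.set_ylabel("Orders")
    return ax


def plot_with_pandas(df):
    # One call from the DataFrame; the returned Axes is a normal Matplotlib object.
    ax = df.plot(kind="bar", x="product", y="orders", legend=False)
    ax.set_ylabel("Orders")
    return ax
```

Both functions return an Axes holding three bars of the same heights, so the choice between them is mostly about how much control the chart needs.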

Anscombe's Quartet — Why Visualisation Matters

In 1973, statistician Francis Anscombe constructed four datasets that have nearly identical summary statistics (mean, variance, correlation, regression line) yet look completely different when plotted. This is a powerful demonstration of why you should always visualise your data before relying solely on summary statistics.

import seaborn as sns
import matplotlib.pyplot as plt

# Load Anscombe's quartet
df = sns.load_dataset("anscombe")

# Plot all four datasets
g = sns.lmplot(x="x", y="y", col="dataset",
               data=df, col_wrap=2,
               height=4, aspect=1)
g.set_titles("Dataset {col_name}")
plt.suptitle("Anscombe's Quartet", y=1.02)
plt.show()

Tip: Two datasets can share the same mean, standard deviation, and correlation and still have completely different shapes. Always plot before you model.
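
The "nearly identical summary statistics" claim can be checked numerically. The sketch below uses only the Python standard library and hard-codes the values from Anscombe's 1973 paper so that nothing needs to be downloaded; the names `ANSCOMBE`, `pearson_r`, and `summarise` are chosen for this example.

```python
from statistics import mean, variance

X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]  # shared by datasets I to III

# Anscombe's four datasets as (x values, y values).
ANSCOMBE = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": (
        [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
        [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89],
    ),
}


def pearson_r(xs, ys):
    # Pearson correlation: covariance divided by the product of the spreads.
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    spread_x = sum((x - mx) ** 2 for x in xs) ** 0.5
    spread_y = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (spread_x * spread_y)


def summarise():
    # Collect the same summary statistics for each of the four datasets.
    return {
        name: {
            "mean_x": mean(xs),
            "mean_y": mean(ys),
            "var_x": variance(xs),  # sample variance
            "var_y": variance(ys),
            "r": pearson_r(xs, ys),
        }
        for name, (xs, ys) in ANSCOMBE.items()
    }


if __name__ == "__main__":
    for name, stats in summarise().items():
        print(name, {key: round(value, 2) for key, value in stats.items()})
```

The four rows printed by this script agree to about two decimal places, which is exactly why the plots, and not the table of statistics, reveal how different the datasets are.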


Summary

Data visualisation transforms raw data into visual stories. It leverages the power of human perception to reveal patterns, communicate findings, and guide decisions. The field has evolved from hand-drawn charts in the 18th century to interactive web-based dashboards today. In this course, you will learn the principles of effective visual design, master the major Python visualisation libraries — Matplotlib, Seaborn, Plotly, and Pandas — and develop the skills to build dashboards, geospatial maps, and data-driven narratives.