Choosing the right chart type is one of the most important decisions in data visualisation. The wrong chart can mislead, confuse, or simply fail to communicate the insight. This lesson covers the most common chart types, when to use each one, and which mistakes to avoid.
Before choosing a chart, ask yourself what it needs to show. The table below maps common purposes to suitable chart types:
| Purpose | Chart Types |
|---|---|
| Comparison | Bar chart, grouped bar, dot plot, lollipop chart |
| Distribution | Histogram, box plot, violin plot, density plot, strip plot |
| Composition | Stacked bar, pie chart (use sparingly), treemap, waffle chart |
| Trend over time | Line plot, area chart, sparkline |
| Relationship | Scatter plot, bubble chart, heatmap |
| Part of a whole | Pie chart, donut chart, stacked area chart |
| Ranking | Horizontal bar chart, bump chart |
| Geospatial | Choropleth map, bubble map, cartogram |
The bar chart is the most versatile chart for comparing values across categories.
import matplotlib.pyplot as plt
categories = ["Python", "R", "SQL", "Julia", "Scala"]
popularity = [68, 18, 42, 5, 3]
fig, ax = plt.subplots(figsize=(8, 5))
ax.barh(categories, popularity, color="#4C72B0")
ax.set_xlabel("Popularity (%)")
ax.set_title("Programming Language Popularity in Data Science")
ax.spines[["top", "right"]].set_visible(False)
# Label each bar with its value, just past the end of the bar
for i, v in enumerate(popularity):
    ax.text(v + 1, i, str(v), va="center")
plt.tight_layout()
plt.show()
Tip: Use horizontal bar charts when category labels are long. Always start the value axis at zero for bar charts, because the length of each bar encodes its value.
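To illustrate why the zero baseline matters, the short sketch below compares how long two bars appear relative to each other under different axis baselines. The helper `apparent_ratio` and the sample values are hypothetical and chosen only for illustration.

```python
def apparent_ratio(a, b, baseline=0.0):
    """Ratio of the drawn bar lengths for values a and b
    when the value axis starts at `baseline`."""
    return (a - baseline) / (b - baseline)

# Hypothetical values: two categories at 68% and 60%
true_ratio = apparent_ratio(68, 60)                   # axis starts at zero
truncated_ratio = apparent_ratio(68, 60, baseline=50)  # axis starts at 50

print(f"Axis from 0:  bar A is {true_ratio:.2f}x the length of bar B")
print(f"Axis from 50: bar A is {truncated_ratio:.2f}x the length of bar B")
```

With the axis cut off at 50, the first bar is drawn almost twice as long as the second, even though the underlying values differ by only about 13%.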
A grouped bar chart compares multiple measures across the same categories.
import matplotlib.pyplot as plt
import numpy as np
categories = ["Q1", "Q2", "Q3", "Q4"]
product_a = [120, 135, 148, 162]
product_b = [98, 115, 132, 141]
x = np.arange(len(categories))
width = 0.35
fig, ax = plt.subplots(figsize=(8, 5))
ax.bar(x - width/2, product_a, width, label="Product A", color="#4C72B0")
ax.bar(x + width/2, product_b, width, label="Product B", color="#DD8452")
ax.set_xticks(x)
ax.set_xticklabels(categories)
ax.set_ylabel("Revenue (GBP thousands)")
ax.set_title("Quarterly Revenue by Product")
ax.legend()
ax.spines[["top", "right"]].set_visible(False)
plt.tight_layout()
plt.show()
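The `x - width/2` and `x + width/2` offsets above are specific to two series. As a rough sketch of how the same idea could extend to any number of series, the hypothetical helper `bar_offsets` below splits a fixed group width evenly and centres the bars on each tick. The name and the default `group_width=0.7` are assumptions picked to reproduce the `width = 0.35` used above.

```python
def bar_offsets(n_series, group_width=0.7):
    """Return (bar_width, offsets) so that n_series bars sit side by side,
    centred on each category tick."""
    bar_width = group_width / n_series
    offsets = [(i - (n_series - 1) / 2) * bar_width for i in range(n_series)]
    return bar_width, offsets

# Two series gives one offset either side of the tick
width_2, offsets_2 = bar_offsets(2)
print(width_2, offsets_2)

# Three series: one bar left of the tick, one on it, one right of it
width_3, offsets_3 = bar_offsets(3)
print(width_3, offsets_3)
```

Each series would then be drawn with `ax.bar(x + offset, values, bar_width)`, using one offset per series.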
A histogram shows the frequency distribution of a single numerical variable.
import matplotlib.pyplot as plt
import numpy as np
np.random.seed(42)
data = np.random.normal(loc=170, scale=10, size=1000)
fig, ax = plt.subplots(figsize=(8, 5))
ax.hist(data, bins=30, color="#4C72B0", edgecolor="white")
ax.set_xlabel("Height (cm)")
ax.set_ylabel("Frequency")
ax.set_title("Distribution of Heights")
ax.spines[["top", "right"]].set_visible(False)
plt.tight_layout()
plt.show()
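The `bins=30` above is a fixed choice, and the number of bins changes how the distribution looks. The sketch below counts the same data without plotting, once with 30 bins and once with bin edges chosen by NumPy's built-in Freedman-Diaconis rule (`bins="fd"`). That rule is an alternative shown for comparison, not something the lesson prescribes.

```python
import numpy as np

np.random.seed(42)
data = np.random.normal(loc=170, scale=10, size=1000)

# Fixed bin count, as in the example above
counts_30, edges_30 = np.histogram(data, bins=30)

# Freedman-Diaconis rule: bin width derived from the IQR and the sample size
edges_fd = np.histogram_bin_edges(data, bins="fd")
counts_fd, _ = np.histogram(data, bins=edges_fd)

print("Fixed:             ", len(counts_30), "bins")
print("Freedman-Diaconis: ", len(counts_fd), "bins")
```

The resulting edges can be passed straight to `ax.hist(data, bins=edges_fd)`.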
A box plot summarises a distribution through its quartiles and highlights outliers.
| Component | Meaning |
|---|---|
| Box | Interquartile range (IQR) — Q1 to Q3 |
| Line inside box | Median (Q2) |
| Whiskers | Extend to the furthest data points that lie within 1.5 × IQR of the box edges (below Q1 and above Q3) |
| Dots beyond whiskers | Outliers |
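The components in the table can be computed by hand, which helps when checking what a box plot is actually drawing. The following is a minimal sketch using only the standard library. The helper `box_plot_stats` and the sample values are hypothetical, and the `inclusive` quantile method is used because it corresponds to linear interpolation, NumPy's default percentile method.

```python
import statistics

def box_plot_stats(values, whisker=1.5):
    """Compute the components a box plot draws, using the 1.5 * IQR rule."""
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - whisker * iqr
    upper_fence = q3 + whisker * iqr
    # Whiskers stop at the furthest data points still inside the fences
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": [v for v in data if v < lower_fence or v > upper_fence],
    }

# Hypothetical sample with one extreme value
summary = box_plot_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])
print(summary)
```

In this sample the value 30 lies above Q3 + 1.5 × IQR, so it would be drawn as a separate dot, while the upper whisker stops at 9.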
import matplotlib.pyplot as plt
import numpy as np
np.random.seed(42)
data = [np.random.normal(50, 10, 200),
        np.random.normal(60, 15, 200),
        np.random.normal(45, 8, 200)]
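The snippet above stops after building the three samples. One plausible way to finish it, following the styling of the earlier examples, is sketched below. The group names, axis label, and title are placeholders rather than the lesson's own, and the rest of the original code is not shown in the source.

```python
import matplotlib.pyplot as plt
import numpy as np

np.random.seed(42)
# Three samples with different centres and spreads, as in the snippet above
data = [np.random.normal(50, 10, 200),
        np.random.normal(60, 15, 200),
        np.random.normal(45, 8, 200)]

fig, ax = plt.subplots(figsize=(8, 5))
# By default the whiskers reach the furthest points within 1.5 * IQR of the box
result = ax.boxplot(data)
# Placeholder group names (assumed, not from the lesson)
ax.set_xticklabels(["Group A", "Group B", "Group C"])
ax.set_ylabel("Value")
ax.set_title("Comparing Distributions with Box Plots")
ax.spines[["top", "right"]].set_visible(False)
plt.tight_layout()
```

Call `plt.show()` to display the figure, or `fig.savefig(...)` to write it to a file.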