Decision trees are one of the most intuitive and widely used machine learning algorithms. They model decisions as a tree-like structure of rules, making them easy to interpret and visualise. Random forests extend decision trees into a powerful ensemble method that reduces overfitting and improves accuracy.
A decision tree is a flowchart-like structure where:

- each internal node tests a feature against a threshold (e.g. "petal length ≤ 2.45"),
- each branch represents an outcome of that test, and
- each leaf node holds a prediction: a class label for classification or a numeric value for regression.
The algorithm recursively splits the data by choosing the feature and threshold that best separates the classes (or reduces prediction error). At each node, it asks: "Which feature split produces the purest subgroups?"
The most common splitting criteria are:

| Criterion | Used For | Description |
|---|---|---|
| Gini Impurity | Classification | The probability of misclassifying a randomly chosen element if it were labelled according to the node's class distribution |
| Entropy (Information Gain) | Classification | The disorder in a node; splits are chosen to maximise the reduction in entropy (the information gain) |
| MSE (Mean Squared Error) | Regression | The variance of target values in a node; splits are chosen to minimise it |
Gini = 1 - Σ p_i²

where p_i is the proportion of samples in the node that belong to class i. A Gini of 0 means the node is pure (all samples belong to one class); higher values indicate a more even mix of classes.
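To make this concrete, here is a minimal sketch of computing Gini impurity from a node's class labels (the helper name `gini_impurity` and the example labels are ours, purely for illustration):

```python
import numpy as np

def gini_impurity(labels):
    """Gini = 1 - sum(p_i^2), where p_i is the proportion of class i."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

print(gini_impurity([0, 0, 0, 0]))  # 0.0   -- pure node
print(gini_impurity([0, 0, 1, 1]))  # 0.5   -- maximally mixed two-class node
print(gini_impurity([0, 0, 0, 1]))  # 0.375
```

During training, scikit-learn evaluates this impurity (weighted by subgroup size) for every candidate split and keeps the best one. The full example below trains and visualises a tree on the Iris dataset.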
```python
from sklearn.tree import DecisionTreeClassifier, plot_tree
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt

# Load data
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42
)

# Train
tree = DecisionTreeClassifier(max_depth=3, random_state=42)
tree.fit(X_train, y_train)

# Visualise
plt.figure(figsize=(16, 8))
plot_tree(tree, feature_names=iris.feature_names,
          class_names=iris.target_names, filled=True, rounded=True)
plt.title('Decision Tree — Iris Dataset')
plt.show()

print(f"Accuracy: {tree.score(X_test, y_test):.2f}")
```
| Advantages | Disadvantages |
|---|---|
| Easy to understand and interpret | Prone to overfitting without pruning |
| No feature scaling required | Sensitive to small changes in data |
| Handles both numerical and categorical data | Can create biased trees with imbalanced data |
| Can capture non-linear relationships | Greedy algorithm — not globally optimal |
Several `DecisionTreeClassifier` hyperparameters help control overfitting:

| Technique | Parameter | Description |
|---|---|---|
| Max depth | `max_depth` | Limits how deep the tree can grow |
| Min samples split | `min_samples_split` | Minimum samples required to split a node |
| Min samples leaf | `min_samples_leaf` | Minimum samples required in a leaf node |
| Max features | `max_features` | Number of features considered for each split |
| Pruning | `ccp_alpha` | Cost-complexity pruning — removes branches that provide little benefit |
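As a sketch of how cost-complexity pruning can be applied (continuing from the Iris example above; the choice of alpha here is illustrative, and in practice it would be tuned by cross-validation):

```python
from sklearn.tree import DecisionTreeClassifier

# Compute the sequence of effective alphas for this training set.
path = DecisionTreeClassifier(random_state=42).cost_complexity_pruning_path(
    X_train, y_train
)

# Larger alphas prune more aggressively; pick a middle value for illustration.
alpha = path.ccp_alphas[len(path.ccp_alphas) // 2]
pruned = DecisionTreeClassifier(ccp_alpha=alpha, random_state=42)
pruned.fit(X_train, y_train)
print(f"Pruned-tree accuracy: {pruned.score(X_test, y_test):.2f}")
```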
A random forest is an ensemble of many decision trees. Each tree is trained on a different random subset of the data and features, and the final prediction is determined by voting (classification) or averaging (regression).
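As a minimal sketch (the hyperparameters are illustrative defaults, not tuned), a random forest can be trained on the same Iris split with scikit-learn:

```python
from sklearn.ensemble import RandomForestClassifier

# 100 trees, each fit on a bootstrap sample of the rows; every split
# considers a random subset of the features.
forest = RandomForestClassifier(n_estimators=100, random_state=42)
forest.fit(X_train, y_train)
print(f"Random forest accuracy: {forest.score(X_test, y_test):.2f}")
```

Because each tree sees a different sample of rows and features, the trees make different mistakes, and those individual errors tend to cancel out when their votes are combined.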