Transfer learning is one of the most powerful and practical techniques in deep learning. Instead of training a model from scratch, you start with a model that has already been trained on a large dataset and fine-tune it for your specific task. This dramatically reduces the amount of data and compute needed.
| Challenge | Without Transfer Learning | With Transfer Learning |
|---|---|---|
| Data requirements | Need millions of labelled examples | A few hundred or thousand examples may suffice |
| Training time | Days or weeks on GPUs | Hours or less |
| Compute cost | Extremely expensive | Affordable |
| Performance | Often suboptimal with limited data | Often state-of-the-art even with small datasets |
Deep learning models learn hierarchical features. In a CNN trained on ImageNet:

- **Early layers** learn generic, low-level features: edges, colours, and simple textures.
- **Middle layers** combine these into more complex patterns and object parts.
- **Late layers** capture high-level features specific to the original classification task.

The universal features learned in the early and middle layers are useful for virtually any vision task. Transfer learning reuses these features instead of relearning them from scratch.
| Strategy | When to Use | What to Do |
|---|---|---|
| Feature extraction | Small dataset, similar to pre-training data | Freeze all pre-trained layers; only train a new classifier head |
| Fine-tuning (partial) | Medium dataset, somewhat different from pre-training data | Freeze early layers; fine-tune later layers and the classifier |
| Fine-tuning (full) | Large dataset, very different from pre-training data | Fine-tune the entire network with a small learning rate |
```python
import torch
import torch.nn as nn
from torchvision import models

# Load a pre-trained ResNet-18
model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)

# Freeze all parameters
for param in model.parameters():
    param.requires_grad = False

# Replace the final classifier layer
num_classes = 5
model.fc = nn.Linear(model.fc.in_features, num_classes)

# Only the new fc layer will be trained
print(f"Trainable parameters: {sum(p.numel() for p in model.parameters() if p.requires_grad)}")
```
```python
import torch
import torch.nn as nn
from torchvision import models

# Load pre-trained model
model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)

# Freeze the stem and early stages (conv1, bn1, layer1, layer2)
for name, param in model.named_parameters():
    if name.startswith(('conv1', 'bn1', 'layer1', 'layer2')):
        param.requires_grad = False

# Replace the classifier
model.fc = nn.Linear(model.fc.in_features, 5)

# Use a lower learning rate for the pre-trained stages than for the new head
optimizer = torch.optim.Adam([
    {'params': model.layer3.parameters(), 'lr': 1e-4},
    {'params': model.layer4.parameters(), 'lr': 1e-4},
    {'params': model.fc.parameters(), 'lr': 1e-3},
])
```
Tip: Use a lower learning rate for pre-trained layers to preserve the learned features, and a higher learning rate for the new classifier head.