PyTorch is the most popular deep learning framework in research and is increasingly adopted in production. Developed by Meta AI, it provides a flexible, Pythonic interface for building, training, and deploying neural networks.
| Feature | Description |
|---|---|
| Dynamic computation graphs | Build-by-run approach — the graph is constructed on the fly during execution |
| Pythonic API | Feels like writing standard Python; easy to debug with print statements and breakpoints |
| Strong GPU support | Seamless tensor operations on GPU with .to('cuda') |
| Rich ecosystem | torchvision, torchaudio, torchtext, Hugging Face integration |
| Research dominance | Used in the majority of NeurIPS, ICML, and ICLR papers |
| Production ready | TorchScript, TorchServe, ONNX export for deployment |
Tensors are the fundamental data structure in PyTorch — multi-dimensional arrays similar to NumPy arrays but with GPU support and automatic differentiation.
```python
import torch

# Create tensors
x = torch.tensor([1.0, 2.0, 3.0])         # from a list
y = torch.zeros(3, 4)                     # 3x4 tensor of zeros
z = torch.randn(3, 4)                     # 3x4 tensor of random normal values
w = torch.ones(2, 3, requires_grad=True)  # track gradients for this tensor

# Tensor properties
print(z.shape)   # torch.Size([3, 4])
print(z.dtype)   # torch.float32
print(z.device)  # cpu

# Move to GPU (if available)
if torch.cuda.is_available():
    z_gpu = z.to('cuda')
    print(z_gpu.device)  # cuda:0
```
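Setting requires_grad=True (as with w above) turns on automatic differentiation. A minimal sketch of what that buys you — PyTorch records the operations applied to the tensor and can replay them backward to compute gradients:

```python
import torch

# y = sum(x^2), so dy/dx = 2x
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = (x ** 2).sum()
y.backward()   # populates x.grad with dy/dx
print(x.grad)  # tensor([2., 4., 6.])
```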
| Operation | Example | Description |
|---|---|---|
| Reshape | x.view(2, 3) or x.reshape(2, 3) | Change the shape without changing data |
| Matrix multiply | torch.matmul(a, b) or a @ b | Matrix multiplication |
| Element-wise | a * b, a + b | Element-wise operations |
| Sum | x.sum(dim=0) | Sum along a dimension |
| Mean | x.mean(dim=1) | Mean along a dimension |
| Concatenate | torch.cat([a, b], dim=0) | Join tensors along a dimension |
| Stack | torch.stack([a, b], dim=0) | Stack tensors along a new dimension |
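A short sketch exercising the operations from the table, with the resulting shapes noted in the comments:

```python
import torch

a = torch.arange(6.0)                 # tensor([0., 1., 2., 3., 4., 5.])
m = a.view(2, 3)                      # reshape to 2x3 without copying data
b = torch.ones(3, 2)

prod = m @ b                          # (2, 3) @ (3, 2) -> (2, 2)
col_sum = m.sum(dim=0)                # sum down each column -> shape (3,)
cat = torch.cat([m, m], dim=0)        # join along rows -> (4, 3)
stacked = torch.stack([m, m], dim=0)  # new leading dimension -> (2, 2, 3)

print(prod.shape, col_sum, cat.shape, stacked.shape)
```

Note the difference between cat (joins along an existing dimension) and stack (creates a new one).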
Every PyTorch model inherits from nn.Module. You define layers in __init__ and the data flow in forward.
```python
import torch
import torch.nn as nn

class BinaryClassifier(nn.Module):
    def __init__(self, input_dim):
        super().__init__()
        self.layer1 = nn.Linear(input_dim, 64)
        self.layer2 = nn.Linear(64, 32)
        self.output = nn.Linear(32, 1)
        self.relu = nn.ReLU()
        self.dropout = nn.Dropout(0.3)

    def forward(self, x):
        x = self.relu(self.layer1(x))
        x = self.dropout(x)
        x = self.relu(self.layer2(x))
        x = self.dropout(x)
        x = self.output(x)
        return x

model = BinaryClassifier(input_dim=20)
print(model)
```
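A quick sanity check of the forward pass. The sketch below uses an nn.Sequential stand-in with the same layer sizes so it is self-contained; the key point is that the model returns raw logits (there is no sigmoid in forward), so the natural pairing is BCEWithLogitsLoss, which applies the sigmoid internally:

```python
import torch
import torch.nn as nn

# Stand-in with the same layer sizes as BinaryClassifier(input_dim=20)
model = nn.Sequential(
    nn.Linear(20, 64), nn.ReLU(), nn.Dropout(0.3),
    nn.Linear(64, 32), nn.ReLU(), nn.Dropout(0.3),
    nn.Linear(32, 1),
)

batch = torch.randn(8, 20)              # 8 samples, 20 features each
logits = model(batch)                   # raw scores, shape (8, 1)

# BCEWithLogitsLoss = sigmoid + binary cross-entropy in one numerically stable step
targets = torch.randint(0, 2, (8, 1)).float()
loss = nn.BCEWithLogitsLoss()(logits, targets)
print(logits.shape, loss.item())
```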
For simple architectures, you can use nn.Sequential instead of defining a full class:
```python
model = nn.Sequential(
    nn.Linear(784, 256),
    nn.ReLU(),
    nn.Dropout(0.3),
    nn.Linear(256, 128),
    nn.ReLU(),
    nn.Dropout(0.3),
    nn.Linear(128, 10),
)
```
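A dummy forward pass confirms the shapes: 784 inputs (a flattened 28x28 image) map down to 10 class scores per sample.

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(784, 256), nn.ReLU(), nn.Dropout(0.3),
    nn.Linear(256, 128), nn.ReLU(), nn.Dropout(0.3),
    nn.Linear(128, 10),
)

x = torch.randn(16, 784)  # batch of 16 flattened 28x28 images
out = model(x)
print(out.shape)          # torch.Size([16, 10])
```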
PyTorch provides Dataset and DataLoader classes for efficient data loading and batching.
```python
from torch.utils.data import Dataset, DataLoader

class CustomDataset(Dataset):
    def __init__(self, features, labels):
        self.features = torch.tensor(features, dtype=torch.float32)
        self.labels = torch.tensor(labels, dtype=torch.float32)

    def __len__(self):
        return len(self.features)

    def __getitem__(self, idx):
        return self.features[idx], self.labels[idx]

# Create dataset and dataloader
dataset = CustomDataset(X_train, y_train)
dataloader = DataLoader(dataset, batch_size=32, shuffle=True, num_workers=2)

# Iterate over batches
for batch_features, batch_labels in dataloader:
    # Process each batch
    pass
```
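The snippet above assumes X_train and y_train already exist. A self-contained sketch with synthetic stand-ins shows the batch shapes the loader yields:

```python
import torch
from torch.utils.data import Dataset, DataLoader

class CustomDataset(Dataset):
    def __init__(self, features, labels):
        self.features = torch.as_tensor(features, dtype=torch.float32)
        self.labels = torch.as_tensor(labels, dtype=torch.float32)

    def __len__(self):
        return len(self.features)

    def __getitem__(self, idx):
        return self.features[idx], self.labels[idx]

# Synthetic stand-ins for X_train / y_train: 100 samples, 20 features
X_train = torch.randn(100, 20)
y_train = torch.randint(0, 2, (100,)).float()

loader = DataLoader(CustomDataset(X_train, y_train), batch_size=32, shuffle=True)
features, labels = next(iter(loader))
print(features.shape, labels.shape)  # torch.Size([32, 20]) torch.Size([32])
```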
For common benchmarks, torchvision ships ready-made datasets and preprocessing transforms. For example, loading MNIST:

```python
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

transform = transforms.Compose([
    transforms.ToTensor(),                 # PIL image -> float tensor in [0, 1]
    transforms.Normalize((0.5,), (0.5,)),  # shift values to roughly [-1, 1]
])

train_dataset = datasets.MNIST(
    root='./data', train=True, download=True, transform=transform
)
train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)
```