Scalability is the ability of a system to handle increasing load by adding resources. Load balancing is the technique of distributing incoming requests across multiple servers. Together, they form the foundation of any system that needs to grow beyond a single machine.
There are two fundamental approaches to scaling:
```
  Vertical Scaling             Horizontal Scaling
    (Scale Up)                    (Scale Out)

┌───────────────┐        ┌──────┐ ┌──────┐ ┌──────┐
│               │        │Server│ │Server│ │Server│
│  BIG SERVER   │        │  1   │ │  2   │ │  3   │
│  (more CPU,   │        └──────┘ └──────┘ └──────┘
│   more RAM,   │            ▲        ▲        ▲
│   more disk)  │            └────────┼────────┘
│               │                     │
└───────────────┘             ┌──────────────┐
                              │Load Balancer │
                              └──────────────┘
```
| Aspect | Vertical Scaling | Horizontal Scaling |
|---|---|---|
| Approach | Bigger machine | More machines |
| Complexity | Low | Higher (distributed state) |
| Cost curve | Superlinear (big hardware costs disproportionately more) | Roughly linear |
| Downtime risk | Single point of failure | Redundancy built in |
| Upper limit | Hardware limits | Practically unlimited |
| State management | Simple (single node) | Requires external state |
Tip: Most production systems use horizontal scaling as their primary strategy, with vertical scaling for components that are difficult to distribute (e.g. certain databases).
A load balancer sits between clients and servers, distributing requests to prevent any single server from becoming overwhelmed.
```
┌──────────────────────────────────────────────────────────┐
│                  OSI Model (Simplified)                  │
├──────────┬───────────────────────────────────────────────┤
│ Layer 7  │ Application (HTTP, HTTPS, WebSocket)          │
│ Layer 4  │ Transport (TCP, UDP)                          │
│ Layer 3  │ Network (IP)                                  │
└──────────┴───────────────────────────────────────────────┘
```
| Feature | L4 Load Balancer | L7 Load Balancer |
|---|---|---|
| Operates at | TCP/UDP level | HTTP/HTTPS level |
| Inspects | IP, port, TCP headers | URL, headers, cookies, body |
| Speed | Very fast | Slightly slower |
| Content routing | No | Yes (route by URL, header) |
| SSL termination | Pass-through or terminate | Typically terminates |
| Use case | Simple distribution | Smart routing, A/B testing |
| Examples | AWS NLB, HAProxy (TCP) | AWS ALB, NGINX, Envoy |
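The "content routing" row in the table is the defining capability of an L7 balancer: it can read the request itself before deciding where to send it. A minimal sketch of the idea in Python (the path prefixes and pool names here are hypothetical; real L7 balancers such as NGINX, Envoy, or AWS ALB express the same logic in configuration):

```python
# Toy sketch of L7 content routing: inspect the HTTP request path and
# choose a backend pool. An L4 balancer cannot do this, because it
# never looks above the TCP/UDP layer.
ROUTES = {
    "/api/": "api-pool",        # hypothetical pool names
    "/static/": "static-pool",
}

def choose_pool(path: str, default: str = "web-pool") -> str:
    """Return the backend pool whose prefix matches the request path."""
    for prefix, pool in ROUTES.items():
        if path.startswith(prefix):
            return pool
    return default

# /api/users/42 goes to the API pool; anything unmatched falls through
# to the default web pool.
```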
**Round Robin.** Requests are distributed sequentially to each server in turn.
```
Request 1 ──▶ Server A
Request 2 ──▶ Server B
Request 3 ──▶ Server C
Request 4 ──▶ Server A  (cycles back)
Request 5 ──▶ Server B
```
Pros: Simple, even distribution when servers are identical. Cons: Ignores server load and capacity differences.
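The cycling behaviour above is a one-liner with Python's `itertools.cycle` (server names here are just placeholders):

```python
from itertools import cycle

# Round-robin sketch: hand out servers in order, wrapping around
# when the list is exhausted.
servers = ["server-a", "server-b", "server-c"]
rr = cycle(servers)

# First five requests land on A, B, C, then cycle back to A, B.
assignments = [next(rr) for _ in range(5)]
```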
**Weighted Round Robin.** Servers with more capacity are assigned higher weights and receive proportionally more requests.
```
Weights: A=3, B=2, C=1

Request 1 ──▶ Server A
Request 2 ──▶ Server A
Request 3 ──▶ Server A
Request 4 ──▶ Server B
Request 5 ──▶ Server B
Request 6 ──▶ Server C
```
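A simple way to sketch this in Python is to repeat each server in the schedule according to its weight, using the A=3, B=2, C=1 weights from the example. (Production balancers such as NGINX use a smoother interleaving to avoid sending bursts to one server, but the proportions are the same.)

```python
# Weighted round-robin sketch: build a schedule that repeats each
# server by its weight, then cycle through it.
weights = {"server-a": 3, "server-b": 2, "server-c": 1}
schedule = [server for server, w in weights.items() for _ in range(w)]

def pick(request_number: int) -> str:
    """Server for the Nth request (1-based), cycling through the schedule."""
    return schedule[(request_number - 1) % len(schedule)]

# Requests 1-6 follow the diagram: A, A, A, B, B, C.
```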
**Least Connections.** Each request goes to the server with the fewest active connections.
```
Server A: 12 connections ┐
Server B:  5 connections │──▶ Request goes to Server C
Server C:  3 connections ┘    (fewest connections)
```
Best for: Requests with varying processing times (long-lived connections).
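The selection step reduces to a `min` over the connection counts; a minimal sketch using the counts from the example:

```python
# Least-connections sketch: route each request to the server with the
# fewest active connections, then bump that server's count.
active = {"server-a": 12, "server-b": 5, "server-c": 3}

def least_connections(conns: dict[str, int]) -> str:
    """Pick the server with the fewest active connections."""
    return min(conns, key=conns.get)

target = least_connections(active)  # server-c, per the diagram
active[target] += 1                 # the routed request now counts against it
```

A real balancer would also decrement the count when a connection closes, which is exactly why this policy suits long-lived or variable-duration connections.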