Kubernetes Architecture Deep Dive
Understanding Kubernetes architecture is foundational to running reliable production clusters. This lesson explores the control plane components, worker node internals, networking model, and cluster topologies that underpin every Kubernetes deployment.
High-Level Architecture
A Kubernetes cluster consists of two layers: the control plane and the worker nodes.
┌──────────────────────────────────────────────────────────────┐
│                        CONTROL PLANE                         │
│                                                              │
│  ┌───────────────┐  ┌──────────────┐  ┌──────────────────┐   │
│  │ kube-apiserver│  │     etcd     │  │  kube-scheduler  │   │
│  │               │  │  (key-value  │  │                  │   │
│  │               │  │    store)    │  │                  │   │
│  └───────────────┘  └──────────────┘  └──────────────────┘   │
│  ┌─────────────────────────┐  ┌───────────────────────────┐  │
│  │ kube-controller-manager │  │ cloud-controller-manager  │  │
│  └─────────────────────────┘  └───────────────────────────┘  │
└──────────────────────────────────────────────────────────────┘
                                 │
              ┌──────────────────┼──────────────────┐
              ▼                  ▼                  ▼
      ┌────────────────┐ ┌────────────────┐ ┌────────────────┐
      │ Worker Node 1  │ │ Worker Node 2  │ │ Worker Node 3  │
      │  ┌──────────┐  │ │  ┌──────────┐  │ │  ┌──────────┐  │
      │  │ kubelet  │  │ │  │ kubelet  │  │ │  │ kubelet  │  │
      │  ├──────────┤  │ │  ├──────────┤  │ │  ├──────────┤  │
      │  │kube-proxy│  │ │  │kube-proxy│  │ │  │kube-proxy│  │
      │  ├──────────┤  │ │  ├──────────┤  │ │  ├──────────┤  │
      │  │ Container│  │ │  │ Container│  │ │  │ Container│  │
      │  │ Runtime  │  │ │  │ Runtime  │  │ │  │ Runtime  │  │
      │  └──────────┘  │ │  └──────────┘  │ │  └──────────┘  │
      └────────────────┘ └────────────────┘ └────────────────┘
Control Plane Components
kube-apiserver
The API server is the front door to the cluster. Every interaction — kubectl commands, controller actions, kubelet heartbeats — goes through the API server.
# All kubectl commands communicate with the API server
kubectl get pods
# GET https://<api-server>:6443/api/v1/namespaces/default/pods
kubectl apply -f deployment.yaml
# POST/PUT https://<api-server>:6443/apis/apps/v1/namespaces/default/deployments
Key characteristics:
- RESTful API — resources are manipulated via standard HTTP verbs
- Authentication and authorisation — supports certificates, tokens, OIDC
- Admission control — mutating and validating webhooks intercept requests
- Horizontally scalable — multiple instances behind a load balancer
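The admission-control stage can be extended with your own webhooks. A minimal registration might look like the sketch below — the webhook name, service, namespace, and path are placeholders, not part of any real deployment:

```yaml
# Hypothetical validating webhook: the API server sends every pod CREATE
# to the named in-cluster service before persisting it to etcd.
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: pod-policy.example.com
webhooks:
- name: pod-policy.example.com
  rules:
  - apiGroups: [""]
    apiVersions: ["v1"]
    operations: ["CREATE"]
    resources: ["pods"]
  clientConfig:
    service:
      namespace: platform        # placeholder namespace
      name: pod-policy-webhook   # placeholder service
      path: /validate
  admissionReviewVersions: ["v1"]
  sideEffects: None
```

Mutating webhooks are registered the same way with a MutatingWebhookConfiguration, and run before validating ones.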
etcd
etcd is a distributed key-value store that holds all cluster state — every resource, every config, every secret.
| Property | Detail |
|---|---|
| Consensus | Raft protocol (requires quorum) |
| Recommended size | 3 or 5 members (odd number for quorum) |
| Data stored | All Kubernetes objects (serialised as protobuf) |
| Backup frequency | Every 30 minutes minimum in production |
# Backup etcd
ETCDCTL_API=3 etcdctl snapshot save /backup/etcd-snapshot.db \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key
# Verify the snapshot
ETCDCTL_API=3 etcdctl snapshot status /backup/etcd-snapshot.db --write-out=table
Production tip: Always run etcd on dedicated, SSD-backed nodes. etcd performance directly impacts cluster responsiveness.
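The quorum figures in the table follow floor(n/2) + 1, which also explains the odd-number recommendation — adding a fourth member raises the quorum requirement without tolerating any additional failures. A quick shell loop makes this concrete:

```shell
# Quorum = floor(n/2) + 1; failures tolerated = n - quorum
for n in 1 2 3 4 5; do
  quorum=$(( n / 2 + 1 ))
  echo "members=$n quorum=$quorum tolerates=$(( n - quorum ))"
done
# members=3 quorum=2 tolerates=1
# members=4 quorum=3 tolerates=1   (no gain over 3 members)
# members=5 quorum=3 tolerates=2
```

This is why production guidance says 3 or 5 members: even sizes cost capacity without buying resilience.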
kube-scheduler
The scheduler watches for unscheduled pods and assigns them to nodes based on:
- Filtering — eliminate nodes that cannot run the pod (resource limits, taints, affinity)
- Scoring — rank remaining nodes by preference (spread, resource balance)
- Binding — assign the pod to the highest-scoring node
# Example: Influence scheduling with node affinity
apiVersion: v1
kind: Pod
metadata:
  name: gpu-workload
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: gpu-type
            operator: In
            values:
            - nvidia-a100
  containers:
  - name: trainer
    image: ml-training:latest
    resources:
      limits:
        nvidia.com/gpu: 1
kube-controller-manager
A single binary that runs many controllers, each an independent control loop:
| Controller | Responsibility |
|---|---|
| Deployment controller | Manages ReplicaSets for Deployments |
| ReplicaSet controller | Ensures desired pod count matches actual |
| Node controller | Detects and responds to node failures |
| Job controller | Creates pods for Job completions |
| Endpoints controller | Populates Endpoints objects for Services (newer clusters also run an EndpointSlice controller) |
| ServiceAccount controller | Creates default ServiceAccounts in new namespaces |
Each controller follows the reconciliation loop pattern: observe the current state, compare to the desired state, and take action to converge.
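A Deployment is the clearest illustration of this pattern: the manifest declares only desired state, and the Deployment and ReplicaSet controllers converge the cluster toward it. The name and image below are illustrative:

```yaml
# Desired state: 3 replicas. If a pod dies, the ReplicaSet controller
# observes 2 running, compares against 3 desired, and creates a replacement.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web               # illustrative name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: nginx:1.27 # illustrative image
```

Nothing in the manifest says *how* to reach three replicas; that logic lives entirely in the controllers' reconciliation loops.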
cloud-controller-manager
Handles cloud-specific operations such as provisioning load balancers, managing node lifecycle, and configuring routes. It is specific to your cloud provider (AWS, GCP, Azure).
Worker Node Components
kubelet
The kubelet is the primary agent on each node. It:
- Registers the node with the API server
- Watches for pod assignments
- Manages container lifecycle via the Container Runtime Interface (CRI)
- Reports node and pod status back to the control plane
- Runs liveness, readiness, and startup probes
# Check kubelet status on a node
systemctl status kubelet
# View kubelet logs
journalctl -u kubelet -f
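The probes listed above are declared per container in the pod spec. A sketch, with hypothetical paths, port, and image:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: probed-app            # illustrative
spec:
  containers:
  - name: app
    image: my-app:1.0         # illustrative image
    ports:
    - containerPort: 8080
    startupProbe:             # holds off other probes while the app boots
      httpGet: {path: /healthz, port: 8080}
      failureThreshold: 30
      periodSeconds: 2
    livenessProbe:            # kubelet restarts the container on repeated failure
      httpGet: {path: /healthz, port: 8080}
      periodSeconds: 10
    readinessProbe:           # failing pods are removed from Service endpoints
      httpGet: {path: /ready, port: 8080}
      periodSeconds: 5
```

The kubelet runs all three locally on the node; only the resulting status is reported back to the API server.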
kube-proxy
kube-proxy maintains network rules on each node, enabling Service abstraction:
| Mode | How It Works | Performance |
|---|---|---|
| iptables | Creates iptables rules for each Service endpoint | Good (default) |
| IPVS | Uses Linux IPVS for load balancing | Better at scale |
| nftables | Uses nftables rules (newer kernels) | Better than iptables at scale (GA in v1.33) |
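The mode is selected in kube-proxy's configuration, which kubeadm-based clusters typically store in the kube-proxy ConfigMap in kube-system. A minimal sketch:

```yaml
# Fragment of a kube-proxy configuration selecting IPVS mode.
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "ipvs"        # "" or "iptables" (default), "ipvs", "nftables"
ipvs:
  scheduler: "rr"   # round-robin; IPVS offers other schedulers too
```

After changing the mode, the kube-proxy pods must be restarted to pick it up.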
Container Runtime
Kubernetes uses the Container Runtime Interface (CRI) to support multiple runtimes:
- containerd — the most common production runtime
- CRI-O — lightweight, designed specifically for Kubernetes
- gVisor / Kata Containers — for enhanced isolation
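Whatever runtime you choose, its cgroup driver must match the kubelet's, or pods become unstable under load. With containerd 1.x the usual fix is one setting in /etc/containerd/config.toml (shown here as a sketch for runc-based runtimes):

```toml
# Use the systemd cgroup driver so containerd and the kubelet agree;
# modern kubeadm clusters default the kubelet to systemd as well.
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
  SystemdCgroup = true
```

Restart containerd after editing, then verify pods on the node stay Running.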
Kubernetes Networking Model
Kubernetes enforces a flat networking model with three fundamental rules:
- Every pod gets its own IP address
- Pods can communicate with any other pod without NAT
- Agents on a node can communicate with all pods on that node
Pod A (10.244.1.5) ─────────────────────▶ Pod B (10.244.2.8)
      Node 1                                    Node 2
        │                                         │
        └───────── CNI Plugin (Overlay) ──────────┘
CNI Plugins
| Plugin | Type | Features | Best For |
|---|---|---|---|
| Calico | L3 | NetworkPolicy, BGP, eBPF | Production, security |
| Cilium | eBPF | NetworkPolicy, observability, encryption | Modern kernels |
| Flannel | Overlay | Simple VXLAN overlay | Simple clusters |
| Weave | Overlay | Encryption, multicast | Ease of use |
Cluster Topologies
Single Control Plane (Development)
        ┌─────────────────┐
        │  Control Plane  │
        │  (single node)  │
        └────────┬────────┘
                 │
         ┌───────┼───────┐
         ▼       ▼       ▼
        Node    Node    Node
Highly Available Control Plane (Production)
      ┌────────────────────────────────────────┐
      │              Load Balancer             │
      └──────┬────────────┬────────────┬───────┘
             ▼            ▼            ▼
        ┌──────────┐ ┌──────────┐ ┌──────────┐
        │  CP #1   │ │  CP #2   │ │  CP #3   │
        │  API +   │ │  API +   │ │  API +   │
        │  etcd    │ │  etcd    │ │  etcd    │
        └──────────┘ └──────────┘ └──────────┘
             │            │            │
        ┌────┴──────┬─────┴─────┬──────┴────┐
        ▼           ▼           ▼           ▼
      Node        Node        Node        Node
Production recommendation: Use 3 control plane nodes with a load balancer. This tolerates the loss of 1 control plane node.
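With kubeadm, the shared API endpoint is declared at init time. A minimal configuration sketch — the DNS name is a placeholder you would point at your load balancer, and the version is illustrative:

```yaml
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
kubernetesVersion: v1.30.0                        # illustrative version
controlPlaneEndpoint: "k8s-api.example.com:6443"  # placeholder LB address
etcd:
  local: {}      # stacked etcd: one member on each control plane node
```

Additional control plane nodes then join with `kubeadm join --control-plane`, each registering behind the same endpoint.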
Inspecting the Cluster
# View cluster component status (deprecated since v1.19; prefer
# checking the kube-system pods below)
kubectl get componentstatuses
# Inspect nodes
kubectl get nodes -o wide
# View all system pods
kubectl get pods -n kube-system
# Describe a node in detail
kubectl describe node <node-name>
# Check cluster info
kubectl cluster-info
Summary
- The control plane consists of the API server, etcd, scheduler, and controller manager.
- etcd stores all cluster state and should be backed up regularly.
- The kubelet manages containers on each node via the CRI.
- kube-proxy implements Service networking using iptables, IPVS, or nftables.
- Kubernetes networking is flat — every pod gets a unique IP and can reach any other pod without NAT.
- CNI plugins (Calico, Cilium, Flannel) implement the networking model.
- Production clusters use highly available control planes with 3 or 5 nodes.