In the previous chapter, you created your first GKE cluster and watched Kubernetes bring Pods to life. But what actually happened inside? How did Kubernetes know where to run your Pods? How does it keep them running when things go wrong? And what is GKE doing behind the scenes that you can't see?
The answer lies in Kubernetes architecture — the engine room. This chapter is where you stop being a user and start becoming an administrator. Copy-pasters deploy YAML; good administrators understand the machinery.
Analogy: The Modern Restaurant Kitchen
Imagine a high-end restaurant during dinner rush. Orders pour in, dishes must meet exact specifications, and every station coordinates seamlessly. Kubernetes works like that kitchen: a kitchen management team (the Control Plane) makes decisions and tracks orders, while cooking stations (the Worker Nodes) do the actual preparation. Each Kubernetes component maps to a kitchen role throughout this chapter, giving you a tangible mental model for every technical detail.
2.2.1 The Control Plane — Kubernetes' Brain
The Control Plane makes every significant decision: what runs where, what should be running, and the current state of the entire system. In a restaurant, this is the kitchen management team who coordinate from a central position.
The Control Plane consists of four core components on dedicated master nodes:
- API Server — the front door for all communication
- etcd — the source of truth for all cluster data
- Scheduler — the placement engine
- Controller Manager — the eternal watchdog
There's also a Cloud Controller Manager on cloud-hosted clusters for cloud-provider integrations like load balancers.
The critical principle is separation of control and data planes. The Control Plane runs only cluster-management software — never user workloads. This ensures that even when your applications misbehave, the cluster's ability to manage itself remains intact. It's why a restaurant's management office is separate from the cooking floor: a grease fire at one station shouldn't shut down the ordering system.
For high availability, production clusters run the Control Plane across multiple masters (typically three or five). If one master fails, the others continue.
GKE Note: On GKE, the Control Plane is entirely managed by Google. You don't see master nodes, manage etcd, or SSH into the API Server. Google upgrades it automatically, and on regional clusters (including all Autopilot clusters) replicates it across multiple zones; a zonal cluster has a single control plane replica. You're buying a managed brain, not assembling one.
2.2.2 The API Server — The Front Door
Analogy: The Expediter Window
In a kitchen, every communication between the dining room and cooking stations passes through the expediter window. Nothing goes directly from a server to a line cook. The API Server is that window — every cluster interaction, whether from kubectl, a controller, the Scheduler, or a kubelet, flows through it.
The API Server is the only component that talks directly to etcd. When the Scheduler picks a Node, it tells the API Server, which validates and persists the decision. When a kubelet reports status, the API Server updates etcd. This centralized gatekeeping ensures consistency and enables the authentication, authorization, and admission control layers that protect your cluster.
Every Kubernetes object is exposed through a RESTful API where each type is a resource. A Deployment lives at /apis/apps/v1/namespaces/{namespace}/deployments. When you run kubectl apply, kubectl converts your YAML into JSON and sends it to the API Server as an HTTP request: a POST when the object is new, a PATCH when it already exists.
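To make the URL structure concrete, here is a minimal Python sketch that builds collection paths the way the API lays them out. The helper name resource_path is hypothetical, invented for illustration; what mirrors Kubernetes is the layout itself, with the core group under /api and named groups under /apis.

```python
from typing import Optional


def resource_path(group: str, version: str, plural: str,
                  namespace: Optional[str] = None) -> str:
    """Build the REST path for a collection of Kubernetes resources.

    resource_path is a hypothetical helper for illustration only.
    """
    # The core group (Pods, Services, Nodes) has an empty group name
    # and lives under /api; every named group lives under /apis.
    prefix = f"/api/{version}" if group == "" else f"/apis/{group}/{version}"
    if namespace is None:
        # Cluster-scoped resources such as Nodes have no namespace segment.
        return f"{prefix}/{plural}"
    return f"{prefix}/namespaces/{namespace}/{plural}"


print(resource_path("apps", "v1", "deployments", namespace="default"))
print(resource_path("", "v1", "pods", namespace="kube-system"))
print(resource_path("", "v1", "nodes"))
```

The first call reproduces the Deployment path quoted above with the namespace filled in as default.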
Every request passes through three layers before reaching etcd: Authentication (who are you?), Authorization (what can you do?), and Admission Control (should this be allowed or modified?). The API can also be extended. CustomResourceDefinitions let tools like cert-manager add their own resource types, and the aggregation layer lets add-ons such as metrics-server serve additional API endpoints.
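The three layers can be pictured as a chain of checks that a write must clear before anything is stored. The following is a toy sketch, not how the API Server is implemented: the Request class, the user names, the ALLOWED set, and the replicas rule are all assumptions made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    user: str
    verb: str
    resource: str
    obj: dict = field(default_factory=dict)


# Hypothetical stand-ins for real credentials and RBAC rules.
KNOWN_USERS = {"alice", "bob"}
ALLOWED = {("alice", "create", "deployments"), ("bob", "get", "deployments")}


def authenticate(req: Request) -> bool:
    """Who are you?"""
    return req.user in KNOWN_USERS


def authorize(req: Request) -> bool:
    """What can you do?"""
    return (req.user, req.verb, req.resource) in ALLOWED


def admit(req: Request) -> bool:
    """Should this be allowed or modified?"""
    req.obj.setdefault("replicas", 1)   # mutating step: fill in a default
    return req.obj["replicas"] >= 0     # validating step: reject bad values


def handle_write(req: Request, store: list) -> str:
    if not authenticate(req):
        return "401 Unauthorized"
    if not authorize(req):
        return "403 Forbidden"
    if not admit(req):
        return "rejected by admission"
    store.append(req.obj)               # stands in for the write to etcd
    return "201 Created"


store = []
print(handle_write(Request("alice", "create", "deployments", {"name": "web"}), store))
print(handle_write(Request("mallory", "create", "deployments"), store))
print(handle_write(Request("bob", "create", "deployments"), store))
print(store)
```

Only the first request reaches the store, and it arrives with the default that the admission step added.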
⚠️ Common Misconception: "The API Server is just a REST API gateway." It handles watch notifications, schema validation, resource versioning, and the entire request pipeline. It's the beating heart of the cluster.
2.2.3 etcd — The Source of Truth
Analogy: The Order Ticket Rail
In a restaurant, every active order is written on a ticket pinned to a central rail. Everyone — cooks, expediters, managers — can look at this rail and know exactly what's happening. If it were lost, the kitchen would have no idea what anyone ordered. etcd is that ticket rail — the single, authoritative record of everything in your cluster.
etcd is a distributed, consistent key-value store holding all cluster state: every Pod spec, Service definition, Secret, and Node status. What makes etcd special is the Raft consensus algorithm. In a multi-master cluster, etcd runs as three or five members, and every write must be acknowledged by a majority (a "quorum"). A three-node cluster tolerates one failure; a five-node cluster tolerates two.
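The quorum arithmetic is simple enough to check by hand or in a few lines of Python. This sketch assumes only the majority rule described above; the function names are illustrative.

```python
def quorum(members: int) -> int:
    """Smallest number of members that forms a majority."""
    return members // 2 + 1


def fault_tolerance(members: int) -> int:
    """How many members can fail while a majority is still reachable."""
    return members - quorum(members)


for n in (1, 2, 3, 4, 5):
    print(f"members={n} quorum={quorum(n)} tolerates={fault_tolerance(n)}")
```

Notice that four members tolerate no more failures than three, which is why odd cluster sizes are the usual choice.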
Why is etcd backup the most critical operational task? Because everything is in it. Lose etcd without a backup and you lose all cluster state. Running Pods may continue, but Kubernetes has no record of what should exist. It's like losing the order ticket rail during dinner service: food might still be cooking, but nobody knows what's supposed to be happening.
GKE Note: You never interact with etcd on GKE. Google manages backups, scaling, and recovery — a significant advantage, as etcd administration is among the most error-prone aspects of self-managing Kubernetes.
🛑 PAUSE & RECALL — 3 minutes
Without looking back:
- Why is the API Server the only component that talks directly to etcd?
- In the restaurant analogy, what does etcd represent? What happens if it disappears?
- A three-node etcd cluster tolerates how many failures? A five-node cluster?
Rate your confidence (0–4).
2.2.4 The Scheduler — The Placement Engine
Analogy: The Station Assigner
In a kitchen, someone decides which cook handles which order. The pasta station shouldn't get dessert. The Scheduler is that station assigner — it watches for unassigned Pods and finds the best Node to run them.
The Scheduler uses a two-phase algorithm: filter, then score.
Phase 1: Filter eliminates Nodes that cannot run the Pod. Does the Node lack enough unreserved CPU or memory? Does it fail to match the Pod's nodeSelector? Does it carry a taint the Pod does not tolerate? Do anti-affinity rules block placement? If no Node passes, the Pod stays Pending.
Phase 2: Score ranks remaining Nodes by desirability, considering resource balance, affinity preferences, topology spread, and custom priorities. The highest-scoring Node wins.
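A toy version of filter-then-score fits in a few dozen lines. This is a sketch under stated assumptions, not the real Scheduler: the Node and Pod classes are simplified stand-ins, and the scoring rule (prefer the Node with the most resources left over) is an invented placeholder for the real scoring plugins.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    name: str
    free_cpu_m: int                       # free CPU in millicores
    free_mem_mi: int                      # free memory in MiB
    labels: dict = field(default_factory=dict)
    taints: set = field(default_factory=set)


@dataclass
class Pod:
    name: str
    cpu_m: int
    mem_mi: int
    node_selector: dict = field(default_factory=dict)
    tolerations: set = field(default_factory=set)


def feasible(pod: Pod, node: Node) -> bool:
    """Phase 1: can this Node run the Pod at all?"""
    fits = node.free_cpu_m >= pod.cpu_m and node.free_mem_mi >= pod.mem_mi
    selected = all(node.labels.get(k) == v for k, v in pod.node_selector.items())
    tolerated = node.taints <= pod.tolerations   # every taint must be tolerated
    return fits and selected and tolerated


def score(pod: Pod, node: Node) -> int:
    """Phase 2 (toy rule): more leftover resources means a higher score."""
    return (node.free_cpu_m - pod.cpu_m) + (node.free_mem_mi - pod.mem_mi)


def schedule(pod: Pod, nodes: List[Node]) -> Optional[str]:
    candidates = [n for n in nodes if feasible(pod, n)]
    if not candidates:
        return None                              # the Pod stays Pending
    return max(candidates, key=lambda n: score(pod, n)).name


nodes = [
    Node("node-a", free_cpu_m=500, free_mem_mi=512),
    Node("node-b", free_cpu_m=2000, free_mem_mi=4096),
    Node("node-c", free_cpu_m=4000, free_mem_mi=8192, taints={"gpu"}),
]
print(schedule(Pod("web", cpu_m=1000, mem_mi=1024), nodes))
print(schedule(Pod("trainer", cpu_m=1000, mem_mi=1024, tolerations={"gpu"}), nodes))
print(schedule(Pod("huge", cpu_m=8000, mem_mi=1024), nodes))
```

The first Pod lands on node-b because node-a is too small and node-c is tainted. The second tolerates the taint and wins node-c on score. The third fits nowhere, so schedule returns None, the equivalent of Pending.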
You can customize scheduling with labels and nodeSelector, node affinity (including negative matches through operators such as NotIn), taints and tolerations (to repel Pods from specific Nodes), and pod affinity/anti-affinity (to co-locate or spread Pods).
GKE Note: GKE extends scheduling with node auto-provisioning. When no suitable Node exists, GKE can automatically create a new node pool. The Scheduler's concept of "placement" expands from "which existing Node" to "what infrastructure should exist."
⚠️ Common Misconception: "The Scheduler picks Nodes randomly." Scheduling is a multi-step filter-and-score process. A Pod that stays unscheduled has failed the filter phase on every Node; randomness has nothing to do with it.
2.2.5 The Controller Manager — The Eternal Watchdog
Analogy: The Quality Control Team
In a restaurant, a quality team continuously walks the floor, noticing when a dish doesn't match its order or a station falls behind. Their job is never done. The Controller Manager is that team, and its fundamental mechanism is the control loop.
The Control Loop — Kubernetes' Most Important Pattern
Every controller follows this cycle:
- Observe current state (read from the API Server)
- Compare with desired state (the resource spec)
- Act to reduce the gap
- Repeat indefinitely
Controllers don't just respond to changes — they continuously reconcile. Even when nothing changes, they periodically verify reality matches expectations.
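The cycle above can be sketched as a single reconcile function that is safe to call over and over. This is a simplified illustration of a ReplicaSet-style loop; the function signature and the web-N naming scheme are assumptions for the example.

```python
import itertools


def reconcile(desired: int, running: list, new_names) -> list:
    """One pass of the loop: observe, compare, act. Returns the actions taken."""
    actions = []
    while len(running) < desired:        # too few: create replacements
        name = next(new_names)
        running.append(name)
        actions.append(f"create {name}")
    while len(running) > desired:        # too many: remove the surplus
        actions.append(f"delete {running.pop()}")
    return actions


names = (f"web-{i}" for i in itertools.count())
pods = []

first = reconcile(3, pods, names)        # cluster is empty, so three Pods are created
pods.remove("web-1")                     # simulate a Pod disappearing
second = reconcile(3, pods, names)       # the gap is noticed and closed
third = reconcile(3, pods, names)        # nothing to do: state already matches

print(first)
print(second)
print(third)
```

The third call returns an empty list. Running the loop when nothing has changed is harmless, which is what makes continuous reconciliation practical.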
The Controller Manager runs many controllers, each watching a specific resource type:
| Controller | Function |
|---|---|
| Deployment | Creates and manages ReplicaSets |
| ReplicaSet | Ensures correct Pod count exists |
| StatefulSet | Manages stateful apps with stable identity |
| Node | Monitors Node health, evicts Pods when nodes fail |
| EndpointSlice | Maintains Pod IP lists for Services |
| Job/CronJob | Creates Pods that run to completion or on a schedule |
Each operates independently, watching the API Server and taking action. If a controller crashes, it restarts and picks up where it left off — its state is entirely in etcd.
🤔 TRY BEFORE YOU SEE
You create a Deployment with replicas: 3. Later, you SSH to a worker node and manually delete one container using crictl rm.
Predict what happens step by step. Who notices? What triggers replacement? How long does it take?
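Reveal: (1) The kubelet on that Node notices that a container belonging to one of its Pods is gone. (2) The Pod object itself still exists, and Deployment Pods use restartPolicy: Always, so the kubelet asks the container runtime to start a new container inside the same Pod. (3) The kubelet reports the updated Pod status to the API Server, which writes it to etcd. (4) The ReplicaSet Controller still counts 3 Pods, so it does nothing, and the Scheduler is never involved. Recovery usually takes seconds. Contrast this with kubectl delete pod: there the Pod object disappears, the ReplicaSet Controller sees 2 Pods when it expects 3 and creates a replacement through the API Server, the Scheduler assigns it to a Node, and that Node's kubelet starts its containers. Both cases are control loops closing the gap between actual and desired state, just at different layers.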
2.2.6 Worker Node Components — Where the Work Happens
Analogy: The Cooking Stations
If the Control Plane is the kitchen management team, Worker Nodes are the cooking stations. Every Worker Node runs three essential components.
kubelet — The Node Agent
The kubelet is the cook at each station. It registers the Node with the cluster, receives Pod specs from the API Server, and ensures containers are running and healthy via the container runtime. It reports Node and Pod status back to the API Server. If a kubelet stops sending heartbeats, the Node Controller marks the Node NotReady after a grace period (roughly 40 seconds by default, depending on the Kubernetes version). Pods on that Node are evicted only after a further toleration period, five minutes by default.
kube-proxy — The Network Rule Manager
The kube-proxy is the food runner who knows all delivery routes. It maintains network rules that implement Kubernetes Services. When you create a ClusterIP Service, kube-proxy configures iptables or IPVS rules so traffic to the Service's virtual IP is load-balanced to backing Pods. Importantly, kube-proxy is not a proxy that traffic flows through — it's a rule manager that configures the kernel's packet filtering. Traffic goes directly from source to destination Pod.
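⚠️ Common Misconception: "kube-proxy is an HTTP proxy like nginx." It is not. kube-proxy runs on each Node, commonly as a DaemonSet, and programs the kernel's packet-filtering and NAT rules (iptables or IPVS). In these modes, Service traffic is handled in the kernel and never passes through the kube-proxy process itself.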
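One way to see why rules alone are enough for load balancing: in iptables mode, a Service's backends are commonly expressed as an ordered chain of rules, each matching with a random probability chosen so that every backend ends up with an equal share. The sketch below only reproduces that arithmetic, under the assumption of a chain of n rules evaluated in order. It does not generate or inspect real iptables rules.

```python
from fractions import Fraction


def rule_probabilities(n_backends: int) -> list:
    """Match probability of each rule in an ordered chain of n rules.

    Rule i is only reached if rules 0..i-1 did not match, so it must
    match with probability 1 / (backends still remaining).
    """
    return [Fraction(1, n_backends - i) for i in range(n_backends)]


def effective_share(n_backends: int) -> list:
    """Overall fraction of connections that each backend receives."""
    shares = []
    remaining = Fraction(1)              # probability of reaching this rule
    for p in rule_probabilities(n_backends):
        shares.append(remaining * p)
        remaining *= 1 - p
    return shares


print(rule_probabilities(3))
print(effective_share(3))
```

For three backends the per-rule probabilities are 1/3, 1/2, and 1, yet each backend receives exactly one third of the connections.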
Container Runtime — The Equipment
The container runtime is the kitchen equipment — ovens, grills, mixers. Kubernetes supports any runtime implementing the Container Runtime Interface (CRI). On GKE, this is containerd, a lightweight runtime that pulls images, creates containers using Linux namespaces and cgroups, and manages their lifecycle. GKE uses containerd, not Docker. Docker is a development tool; containerd is purpose-built for orchestration.
CNI Plugins — The Plumbing
CNI (Container Network Interface) plugins assign Pod IP addresses and ensure every Pod can reach every other Pod. On GKE, CNI uses VPC-native networking, giving each Pod a real IP from your VPC subnet.
2.2.7 GKE Architecture Specifics
Analogy: A Restaurant with an Invisible Management Team
On GKE, Google manages the kitchen management team — invisible but always working. You only interact with the expediter window (the API Server endpoint).
What Google Manages vs. What You Manage
| Component | Standard GKE | Autopilot |
|---|---|---|
| Control Plane (API Server, etcd, Scheduler, Controller Manager) | Google-managed | Google-managed |
| Worker Node OS and container runtime | Google-managed | Google-managed |
| Worker Node provisioning and scaling | You configure node pools | Fully automatic |
| CNI networking | Google-managed (VPC-native) | Google-managed |
| Control plane upgrades | Google-managed | Google-managed |
| Add-ons (CoreDNS, metrics-server) | Google-managed | Google-managed |
Key GKE Architecture Features
- VPC-native networking: Pods get real IPs from your VPC subnet — no overlay, no performance penalty.
- Multi-zonal HA for regional clusters: The control plane is replicated across multiple zones in the region; a zonal cluster runs a single control plane replica.
- Control plane logs via Cloud Logging: Stream API Server audit logs for security monitoring.
- Master authorized networks: Restrict which IP ranges can reach the API Server.
- Private clusters: Nodes have only internal IP addresses, and access to the API Server can be limited to a private endpoint inside your VPC.
GKE Note: kubectl get nodes shows your Workers, but kubectl get pods -n kube-system won't show the API Server or etcd because they run in Google's infrastructure. This is both a convenience (no operational burden) and a constraint (no direct control over control plane configuration).
🛑 PAUSE & RECALL — 3 minutes
Without looking back:
- Name all four Control Plane components and their kitchen roles.
- What are the three components on every Worker Node?
- On GKE, what do you manage and what does Google manage?
Rate your confidence (0–4).
2.2.8 Tracing a Deployment Through the System
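When you run kubectl apply -f deployment.yaml, here's the flow:
- kubectl sends the Deployment to the API Server, which authenticates, authorizes, and admits the request, then persists the object in etcd.
- The Deployment controller sees the new Deployment and creates a ReplicaSet.
- The ReplicaSet controller sees the ReplicaSet and creates the requested number of Pods, each with no Node assigned yet.
- The Scheduler sees the unassigned Pods, filters and scores the Nodes, and records each placement through the API Server.
- The kubelet on each chosen Node sees the Pods bound to it, starts their containers through the container runtime, and reports status back.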
Every step goes through the API Server. No component talks directly to another. etcd writes are sequential and consistent. The control loops ensure that even if a component crashes mid-flow, the system converges to the desired state upon recovery.
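The hand-offs can be mimicked with a small in-memory simulation in which every component reads and writes only through a stand-in API Server. Everything here is a teaching sketch: the ApiServer class, the function names, the web-rs naming, and the round-robin placement are assumptions, and real components react to watch events instead of being called in sequence.

```python
class ApiServer:
    """In-memory stand-in: the only object that touches the store (etcd)."""

    def __init__(self):
        self.store = {}    # (kind, name) -> object
        self.log = []      # trace of who wrote what

    def create(self, actor, kind, name, **fields):
        self.store[(kind, name)] = {"kind": kind, "name": name, **fields}
        self.log.append(f"{actor}: create {kind}/{name}")

    def update(self, actor, kind, name, **fields):
        self.store[(kind, name)].update(fields)
        self.log.append(f"{actor}: update {kind}/{name}")

    def list_kind(self, kind):
        return [o for o in self.store.values() if o["kind"] == kind]


def deployment_controller(api):
    existing = {rs["name"] for rs in api.list_kind("ReplicaSet")}
    for d in api.list_kind("Deployment"):
        rs_name = f"{d['name']}-rs"
        if rs_name not in existing:
            api.create("deployment-controller", "ReplicaSet", rs_name,
                       replicas=d["replicas"])


def replicaset_controller(api):
    for rs in api.list_kind("ReplicaSet"):
        owned = [p for p in api.list_kind("Pod") if p["owner"] == rs["name"]]
        for i in range(len(owned), rs["replicas"]):
            api.create("replicaset-controller", "Pod", f"{rs['name']}-{i}",
                       owner=rs["name"], node=None, phase="Pending")


def scheduler(api, nodes):
    unassigned = [p for p in api.list_kind("Pod") if p["node"] is None]
    for i, pod in enumerate(unassigned):
        # Round-robin placement stands in for filter-then-score.
        api.update("scheduler", "Pod", pod["name"], node=nodes[i % len(nodes)])


def kubelet(api, node):
    for pod in api.list_kind("Pod"):
        if pod["node"] == node and pod["phase"] == "Pending":
            api.update(f"kubelet@{node}", "Pod", pod["name"], phase="Running")


api = ApiServer()
nodes = ["node-a", "node-b"]
api.create("kubectl", "Deployment", "web", replicas=2)
deployment_controller(api)
replicaset_controller(api)
scheduler(api, nodes)
for n in nodes:
    kubelet(api, n)

print("\n".join(api.log))
```

Reading the printed trace from top to bottom gives the same order of events as the flow described above, with each line naming the component that made the change.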
Lab: LAB-2.2 — Exploring Kubernetes Architecture (60 min)
Step 1: Examine Your Nodes (10 min)
kubectl get nodes -o wide
kubectl describe node $(kubectl get nodes -o jsonpath='{.items[0].metadata.name}')
Look for: Capacity vs. Allocatable (what Pods can use after system reservations), Conditions (Ready, MemoryPressure), and the container runtime version.
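The gap between Capacity and Allocatable comes from reservations. Kubernetes documents the relationship as capacity minus the kubelet's reservations minus the eviction threshold, which the sketch below encodes. The numbers are hypothetical and are not GKE's actual reservation values; compare them with what kubectl describe node prints for your own Nodes.

```python
def allocatable(capacity: int, kube_reserved: int = 0,
                system_reserved: int = 0, eviction_threshold: int = 0) -> int:
    """Resources left for Pods after system reservations (same unit throughout)."""
    return capacity - kube_reserved - system_reserved - eviction_threshold


# Hypothetical memory figures in MiB, chosen only to illustrate the subtraction.
memory_mi = allocatable(capacity=4096, kube_reserved=1024, eviction_threshold=100)
print(memory_mi)
```

With these made-up figures, a Node with 4096 MiB of memory offers 2972 MiB to Pods.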
Step 2: View System Pods (10 min)
kubectl get pods -n kube-system -o wide
Look for kube-dns/coredns, fluentbit-gke, and gke-metrics-agent. Notice: no API Server, etcd, Scheduler, or Controller Manager pods — these are Google-managed.
Step 3: Access the Raw REST API (15 min)
kubectl proxy &
sleep 2
curl http://localhost:8001/api/
curl http://localhost:8001/api/v1/pods
kill %1
Everything in Kubernetes is a REST resource with a URL — this is what kubectl talks to under the hood.
Step 4: View Control Plane Logs (10 min)
gcloud logging read "protoPayload.serviceName=\"container.googleapis.com\"" --limit=5 --format="table(timestamp,protoPayload.methodName)"
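These are audit logs for the GKE API (container.googleapis.com): they show who created or changed clusters and node pools, and when. Audit logs for Kubernetes resources themselves, such as Deployments and Pods, are recorded under the service name k8s.io. Both are essential for security auditing.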
Step 5: Deploy and Observe Self-Healing — The "Aha" Moment (15 min)
cat << 'EOF' | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-demo
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx-demo
  template:
    metadata:
      labels:
        app: nginx-demo
    spec:
      containers:
      - name: nginx
        image: nginx:1.25
        resources:
          requests:
            cpu: "50m"
            memory: "64Mi"
EOF
kubectl wait --for=condition=available --timeout=60s deployment/nginx-demo
POD=$(kubectl get pods -l app=nginx-demo -o jsonpath='{.items[0].metadata.name}')
kubectl delete pod $POD --wait=false
kubectl get pods -l app=nginx-demo -w
What you should see: The deleted pod enters Terminating. Within seconds, a new pod appears with a different name. The ReplicaSet Controller noticed actual state (2 pods) didn't match desired state (3) and created a replacement — the control loop in action.
Press Ctrl+C, then clean up: kubectl delete deployment nginx-demo.
Chapter Summary
The Control Plane — API Server, etcd, Scheduler, and Controller Manager — is the cluster's brain. The API Server is the exclusive front door; etcd is the single source of truth; the Scheduler decides placement through filter-then-score; and the Controller Manager's control loops make Kubernetes self-healing. Worker Nodes execute workloads through kubelet, kube-proxy, and containerd.
On GKE, Google manages the entire Control Plane, and regional clusters get a multi-zonal control plane without operational burden. VPC-native networking, node auto-provisioning, control plane logs, and master authorized networks extend core Kubernetes with cloud-native capabilities.
The deepest insight is the control loop: Kubernetes continuously reconciles actual state with desired state. This is why deleting a pod triggers automatic replacement, why scaling is declarative, and why the system is resilient. Understanding this pattern separates administrators who can troubleshoot from those who can only copy-paste.
📇 KEY CONCEPT CARDS
- Q: What are the four Control Plane components and their roles?
A: (1) API Server — front door for all communication, exclusive etcd access; (2) etcd — distributed key-value store for all cluster state, uses Raft consensus; (3) Scheduler — filter-then-score algorithm for Pod-to-Node placement; (4) Controller Manager — runs control loops that reconcile actual state with desired state.
- Q: Why is etcd the most critical component to protect?
A: etcd contains the entire desired state — every Pod, Service, Deployment, Secret. Losing it without a backup means losing all cluster configuration. Running Pods may continue, but Kubernetes cannot recover or reconcile.
- Q: What is the control loop pattern, and why is it fundamental?
A: A continuous cycle: observe current state → compare with desired state → act to reduce the gap → repeat. Controllers never stop reconciling, making the cluster self-healing: it constantly converges reality toward your declared intentions.
- Q: What are the three Worker Node components and their functions?
A: (1) kubelet — node agent that receives Pod specs, manages containers, reports status; (2) kube-proxy — maintains network rules (iptables/IPVS) implementing Service load balancing; (3) containerd — pulls images and executes containers.
- Q: On GKE, what does Google manage vs. what do you manage?
A: Google manages the entire Control Plane, container runtime, CNI, system DaemonSets, control plane upgrades, and etcd backups. On Standard, you manage node pools. On Autopilot, Google manages everything — you only manage Pods and workloads.