Module 5 Storage, Configuration, and Secrets

StatefulSets and Stateful Applications

Chapter 13 of 22

You now understand PersistentVolumes, ConfigMaps, and Secrets: the raw materials of storage and configuration. But here is a puzzle: what happens when you deploy a database as a Deployment with three replicas? Each pod gets a random suffix like postgres-7d9f4b8c5-x2kpv, starts in no particular order, and has no stable identity. When the pod acting as primary dies and a randomly named replacement appears, how does the cluster know that the replacement should inherit that member's role and data? It cannot. Deployments treat pods as interchangeable cattle. Stateful applications need pets with name tags and permanent lockers.

Analogy: Hotel Room Assignments

At a hostel (the Deployment), the desk says "Any bed is fine." You get bed 3 tonight, bed 7 tomorrow. Your locker? Someone else is using it. The hostel guarantees a bed, never the same bed.

At a hotel (the StatefulSet), you book Room 101. Every time, you get Room 101. Your belongings are still in the closet. Dial 101 and you reach Room 101. Rooms are numbered sequentially — they do not skip. If Room 101 is renovated, Room 102 waits until 101 is ready. The StatefulSet promise: stable identity, stable storage, and ordered operations.

Why StatefulSets Exist

Deployments create pods with random suffixes, start them simultaneously, and attach storage through shared PVCs. For a stateless web server, this is ideal. But for a MongoDB replica set, each member must know its role, reach peers by predictable names, own dedicated storage, and start in the correct order. Three MongoDB replicas with random names racing to initialize — potentially clobbering the same data directory — is a recipe for corruption. StatefulSets eliminate this chaos.

StatefulSet Characteristics: The Four Pillars

Stable Network Identity. Every pod receives a predictable name: <statefulset-name>-<ordinal>. A StatefulSet named web with three replicas produces web-0, web-1, and web-2. If web-1 crashes, the replacement is also named web-1. Each pod gets a stable DNS entry: web-1.web.default.svc.cluster.local.
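The naming rule can be sketched in a few lines of Python. This is only an illustrative model, not a Kubernetes API call: the function name pod_dns_names is made up for this example, and it assumes the default namespace and the common cluster domain cluster.local (a cluster may be configured with a different domain).

```python
def pod_dns_names(statefulset: str, service: str, replicas: int,
                  namespace: str = "default",
                  cluster_domain: str = "cluster.local") -> list[str]:
    """Return the fully qualified DNS name of each pod, in ordinal order.

    Pattern: <statefulset>-<ordinal>.<service>.<namespace>.svc.<cluster-domain>
    The namespace and cluster_domain defaults are assumptions for this sketch.
    """
    return [
        f"{statefulset}-{ordinal}.{service}.{namespace}.svc.{cluster_domain}"
        for ordinal in range(replicas)
    ]


for name in pod_dns_names("web", "web", 3):
    print(name)
# web-0.web.default.svc.cluster.local
# web-1.web.default.svc.cluster.local
# web-2.web.default.svc.cluster.local
```

Because the names depend only on the StatefulSet name, the Service name, and the ordinal, a peer list like this can be written into application configuration before any pod exists.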

Stable Storage. StatefulSets use volumeClaimTemplates to request one PVC per pod, named <template-name>-<pod-name>. With a template named data, web-0 gets data-web-0 and web-1 gets data-web-1. These PVCs survive pod deletions and reattach to the matching ordinal. Your data stays in your closet.
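A toy Python model may help show why the data follows the ordinal. Everything here is hypothetical scaffolding (the claims dictionary stands in for real volumes, and start_pod is not a Kubernetes function); the only rule taken from Kubernetes is the claim name pattern <template>-<statefulset>-<ordinal>.

```python
# PVC name -> contents. This dictionary stands in for real volumes.
claims: dict[str, str] = {}


def claim_name(template: str, statefulset: str, ordinal: int) -> str:
    """PVC naming pattern: <template>-<statefulset>-<ordinal>."""
    return f"{template}-{statefulset}-{ordinal}"


def start_pod(statefulset: str, ordinal: int, template: str = "data") -> str:
    """Bind a pod to its claim, creating an empty claim only if none exists."""
    name = claim_name(template, statefulset, ordinal)
    claims.setdefault(name, "")
    return name


pvc = start_pod("web", 1)          # first start of web-1 creates data-web-1
claims[pvc] = "Hello from web-1"   # the application writes some data

# web-1 is deleted here. Nothing touches the claims dictionary.

replacement = start_pod("web", 1)  # the replacement pod has the same ordinal
print(replacement)                 # data-web-1
print(claims[replacement])         # Hello from web-1
```

The replacement finds its data because the claim is looked up by name, and the name is derived from the ordinal rather than from anything random.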

Ordered Operations. By default (podManagementPolicy: OrderedReady), pods are created sequentially: web-0 must be Running and Ready before web-1 starts. Scaling down reverses the order: web-2 terminates first, then web-1.
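The ordering rule is simple enough to model directly. The sketch below is an illustration of the default ordered behavior, not the controller's actual code; scale_plan is a hypothetical helper, and in a real cluster each step also waits for the previous pod to become Ready (on the way up) or to terminate fully (on the way down).

```python
def scale_plan(current: int, desired: int, name: str = "web") -> list[str]:
    """List the pod operations, in order, to go from current to desired replicas."""
    if desired >= current:
        # Scale up: lowest missing ordinal first.
        return [f"create {name}-{i}" for i in range(current, desired)]
    # Scale down: highest ordinal first.
    return [f"delete {name}-{i}" for i in range(current - 1, desired - 1, -1)]


print(scale_plan(0, 3))  # ['create web-0', 'create web-1', 'create web-2']
print(scale_plan(3, 1))  # ['delete web-2', 'delete web-1']
print(scale_plan(3, 5))  # ['create web-3', 'create web-4']
```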

Headless Service Requirement. A StatefulSet requires a headless Service (clusterIP: None) to give each pod its own DNS A-record. Unlike a normal Service returning a single virtual IP, a headless Service returns actual pod IPs — one per pod — enabling direct pod-to-pod discovery. This is the internal phone system connecting rooms by name.
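One way to observe the difference from inside a pod is to count the addresses a name resolves to. The helper below is a minimal sketch using only Python's standard socket module; resolve_ips is a name chosen for this example, and the Service and pod names in the comments assume a headless Service and StatefulSet both called web in the caller's namespace.

```python
import socket


def resolve_ips(hostname: str) -> list[str]:
    """Return the sorted, de-duplicated addresses that DNS reports for hostname."""
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


# Run inside a pod in the same namespace (names are assumptions, see above):
#   resolve_ips("web")        headless Service: one address per ready pod
#   resolve_ips("web-1.web")  a single pod: exactly one address
# For a normal ClusterIP Service, the Service name would instead
# resolve to one virtual IP regardless of how many pods back it.
```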

Visual Description: Hotel Floor Plan

Picture a hotel floor with three sequentially numbered rooms, each with its own dedicated closet, connected through an internal telephone exchange.

graph LR
  subgraph "Headless Service: Phone System"
    HS["web<br/>(clusterIP: None)"]
  end
  subgraph "StatefulSet Floor: web"
    subgraph "Room 101"
      POD0["web-0"]
      PVC0[("PVC: data-web-0")]
      POD0 --- PVC0
    end
    subgraph "Room 102"
      POD1["web-1"]
      PVC1[("PVC: data-web-1")]
      POD1 --- PVC1
    end
    subgraph "Room 103"
      POD2["web-2"]
      PVC2[("PVC: data-web-2")]
      POD2 --- PVC2
    end
  end
  HS -->|"web-0.web"| POD0
  HS -->|"web-1.web"| POD1
  HS -->|"web-2.web"| POD2
  style HS fill:#a5d6a7,stroke:#2e7d32
  style POD0 fill:#90caf9,stroke:#1565c0
  style POD1 fill:#90caf9,stroke:#1565c0
  style POD2 fill:#90caf9,stroke:#1565c0
  style PVC0 fill:#ffcc80,stroke:#ef6c00
  style PVC1 fill:#ffcc80,stroke:#ef6c00
  style PVC2 fill:#ffcc80,stroke:#ef6c00

🛑 PAUSE & RECALL — 2 minutes

  1. Why does a Deployment with random pod names fail for a MongoDB cluster? (Hint: identity and order)
  2. What are the four guarantees a StatefulSet provides that a Deployment does not?
  3. What Kubernetes Service type does a StatefulSet require, and why can a normal ClusterIP Service not do the job?

Rate your confidence (0–4), then continue.

StatefulSet vs Deployment

| Aspect | Deployment | StatefulSet |
| --- | --- | --- |
| Pod naming | Random: web-7d9f4b8c5-x2kpv | Ordinal: web-0, web-1, web-2 |
| Storage | Shared PVC (all pods mount same claim) | One PVC per pod via volumeClaimTemplates |
| Creation order | All pods start simultaneously | Sequential: 0, then 1, then 2 |
| Scale-down order | No guaranteed order | Reverse: 2, then 1, then 0 |
| Service type | Any Service works | Requires headless Service (clusterIP: None) |
| Pod DNS | No stable per-pod hostname | <pod>.<service>.<ns>.svc.cluster.local |
| Best for | Web servers, APIs, stateless apps | Databases, message queues, distributed systems |

graph TD
  subgraph "Deployment: The Hostel"
    D["Deployment: web"]
    D --> P1["web-7d9f4-x2kpv"]
    D --> P2["web-7d9f4-b3mna"]
    D --> P3["web-7d9f4-k9pqc"]
    PVC_SHARED[("Shared PVC")]
    P1 -.-> PVC_SHARED
    P2 -.-> PVC_SHARED
    P3 -.-> PVC_SHARED
  end
  subgraph "StatefulSet: The Hotel"
    S["StatefulSet: web"]
    S --> S1["web-0"]
    S --> S2["web-1"]
    S --> S3["web-2"]
    PVC0[("data-web-0")]
    PVC1[("data-web-1")]
    PVC2[("data-web-2")]
    S1 --- PVC0
    S2 --- PVC1
    S3 --- PVC2
  end
  style D fill:#ef9a9a,stroke:#c62828
  style S fill:#a5d6a7,stroke:#2e7d32
  style P1 fill:#ffcdd2,stroke:#c62828
  style P2 fill:#ffcdd2,stroke:#c62828
  style P3 fill:#ffcdd2,stroke:#c62828
  style S1 fill:#c8e6c9,stroke:#2e7d32
  style S2 fill:#c8e6c9,stroke:#2e7d32
  style S3 fill:#c8e6c9,stroke:#2e7d32
  style PVC_SHARED fill:#ffcc80,stroke:#ef6c00
  style PVC0 fill:#ffcc80,stroke:#ef6c00
  style PVC1 fill:#ffcc80,stroke:#ef6c00
  style PVC2 fill:#ffcc80,stroke:#ef6c00

⚠️ Common Misconception: Many learners think StatefulSets are "just Deployments with PVCs." This misses the essential difference: the ordered identity system. Without a headless Service and ordered naming, a StatefulSet is like a hotel where rooms have no numbers — guests cannot find each other, and nobody gets their original room back.

Common Stateful Patterns

Databases. MySQL replication, PostgreSQL with Patroni, MongoDB replica sets, and Cassandra clusters depend on pod identity for node discovery and role assignment. Each node must know: "Am I node 0 (the primary) or node 1 (a replica)?"

Message Queues. Kafka brokers are identified by broker ID and each owns specific topic partitions on local storage. RabbitMQ clustering uses node names for peer discovery.

Distributed Coordination. ZooKeeper and etcd form consensus clusters where each member votes in leader election. Changing a member's name mid-flight breaks quorum.

The Operator Pattern. Operators are custom controllers that automate stateful application lifecycle — handling backups, failover, and upgrades. Think of an Operator as an automated hotel management system: it monitors rooms and reassigns guests when needed. The CloudNativePG and Strimzi operators are excellent examples.

Running Databases on Kubernetes: The Honest Discussion

Should you run your database on Kubernetes or use a managed service? The answer: it depends.

Use a managed database (Cloud SQL, Cloud Spanner, AlloyDB) when you want automated backups, patching, failover, and SLAs without operational overhead. For most applications, this is the right choice.

Run self-hosted on Kubernetes when you need specific engine versions, very low latency that requires data to sit close to the application, or multi-tenant isolation. StatefulSets provide the foundation, but you carry the operational burden: backup verification, failover testing, and disaster recovery drills.

If you choose self-hosted: use the fastest StorageClass for IOPS, implement both VolumeSnapshots and logical backups with regular restore testing, use application-native replication rather than relying on Kubernetes alone, and run databases on dedicated node pools to prevent noisy-neighbor interference.

GKE in Practice

GKE Note: Google Kubernetes Engine provides several capabilities designed specifically for stateful workloads.

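Hyperdisk is Google Cloud's next-generation block storage, available to GKE workloads through a StorageClass. It offers higher IOPS and throughput than standard Persistent Disks and lets you provision performance separately from capacity. For production databases, consider a StorageClass backed by the hyperdisk-balanced disk type, on machine types that support it. Think of it as upgrading from a standard closet to a climate-controlled vault.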

Filestore provides managed NFS-based shared storage with ReadWriteMany (RWX) access, essential when multiple pods need simultaneous read-write access to the same filesystem.

Backup for GKE is Google's application-aware backup service. Unlike VolumeSnapshots alone, it captures both persistent volume data and Kubernetes resource definitions (StatefulSets, Services, ConfigMaps, Secrets) in a single backup. A PVC backup without the StatefulSet definition is like having a guest's luggage with no record of which room they stayed in.

🤔 TRY BEFORE YOU SEE

Design a three-node ZooKeeper ensemble. ZooKeeper requires: each node knows its own ID (1, 2, 3); each node reaches peers by stable hostname; each node writes transaction logs to dedicated storage; node 1 must start before nodes 2 and 3 join.

Write your answers before reading the solution:

  1. Which Kubernetes workload controller should you use, and why?
  2. What additional Kubernetes object is required for pod-to-pod discovery?
  3. Sketch roughly what the pod names and DNS entries would look like.

Reveal: Use a StatefulSet for ordered identity and storage. A headless Service (clusterIP: None) provides DNS entries like zk-0.zk.default.svc.cluster.local. The volumeClaimTemplates section gives each pod its own PVC. Ordered startup guarantees zk-0 initializes before zk-1 and zk-2. ZooKeeper references stable DNS names for peer discovery, and each node derives its myid from its ordinal. Because ordinals start at 0 and the IDs here start at 1, use ordinal + 1: zk-0 is node 1, zk-1 is node 2, and zk-2 is node 3.
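Because the exercise numbers the nodes 1 to 3 while StatefulSet ordinals start at 0, one common approach is to compute the ZooKeeper ID as ordinal + 1 at startup. The sketch below assumes that mapping. The helper names, the zk StatefulSet and Service names, the default namespace, and the ports 2888 and 3888 (ZooKeeper's conventional peer and election ports) are all assumptions for illustration.

```python
import re


def zookeeper_myid(hostname: str) -> int:
    """Map a pod hostname such as 'zk-0' to a ZooKeeper ID (ordinal + 1)."""
    match = re.fullmatch(r"(.+)-(\d+)", hostname)
    if match is None:
        raise ValueError(f"{hostname!r} does not end in an ordinal")
    return int(match.group(2)) + 1


def server_lines(name: str = "zk", service: str = "zk", replicas: int = 3,
                 namespace: str = "default") -> list[str]:
    """Build zoo.cfg 'server.N=host:peerPort:electionPort' entries."""
    lines = []
    for ordinal in range(replicas):
        host = f"{name}-{ordinal}.{service}.{namespace}.svc.cluster.local"
        lines.append(f"server.{ordinal + 1}={host}:2888:3888")
    return lines


print(zookeeper_myid("zk-0"))  # 1
print(zookeeper_myid("zk-2"))  # 3
for line in server_lines():
    print(line)
# server.1=zk-0.zk.default.svc.cluster.local:2888:3888
# server.2=zk-1.zk.default.svc.cluster.local:2888:3888
# server.3=zk-2.zk.default.svc.cluster.local:2888:3888
```

In a pod, the hostname passed to zookeeper_myid would typically come from socket.gethostname(), since a StatefulSet pod's hostname matches its pod name.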

sequenceDiagram
  participant U as "User: scale --replicas=3"
  participant CP as "Control Plane"
  participant P0 as "Pod web-0"
  participant P1 as "Pod web-1"
  participant P2 as "Pod web-2"
  U->>CP: Scale to 3
  CP->>P0: Create web-0
  P0-->>CP: Running & Ready
  CP->>P1: Create web-1
  P1-->>CP: Running & Ready
  CP->>P2: Create web-2
  P2-->>CP: Running & Ready

🛑 PAUSE & RECALL — 2 minutes

  1. A StatefulSet named redis with 3 replicas is scaled down to 1. Which pods terminate, and in what order?
  2. If redis-1 crashes and is replaced, what is the new pod's name? Does it get a new PVC or reuse the old one?
  3. In the hotel analogy, what does the headless Service represent? What would happen with a normal ClusterIP Service instead?

Rate your confidence (0–4), then continue.

Lab: LAB-5.3 — Stateful Applications (60 min)

This lab demonstrates the magic of StatefulSets: ordered creation, stable identity, and persistent storage. Watch pods appear one by one with predictable names — then prove data survives pod deletion.

Step 1: Create the Headless Service

Create web-service.yaml:

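apiVersion: v1
kind: Service
metadata:
  name: web            # must match serviceName in the StatefulSet (Step 2)
spec:
  selector:
    app: web           # selects the pods created by the StatefulSet
  ports:
    - port: 80
      name: http
  clusterIP: None      # headless: DNS returns pod IPs, not one virtual IP

kubectl apply -f web-service.yaml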

Step 2: Create the StatefulSet

Create web-statefulset.yaml:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: "web"   # the headless Service from Step 1
  replicas: 1          # start with a single pod: web-0
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: nginx
        image: nginx:1.25
        ports:
        - containerPort: 80
          name: http
        volumeMounts:
        - name: data   # must match the volumeClaimTemplates name below
          mountPath: /usr/share/nginx/html
  volumeClaimTemplates:  # one PVC per pod: data-web-0, data-web-1, ...
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 1Gi

kubectl apply -f web-statefulset.yaml
kubectl get pods -w

Expected: Pod web-0 reaches Running with a predictable ordinal name, not a random hash. Press Ctrl+C to stop the watch before continuing.

Step 3: Write Data to web-0

kubectl exec -it web-0 -- /bin/bash -c \
  'echo "Hello from web-0" > /usr/share/nginx/html/index.html'

Step 4: Scale Up and Watch the Magic (0 → 1 → 2)

kubectl scale statefulset web --replicas=3
kubectl get pods -w

Expected sequence: web-0 Ready → web-1 starts → web-1 Ready → web-2 starts. Sequential. Ordered. Named.

Step 5: Verify Dedicated PVCs

kubectl get pvc

Expected: data-web-0, data-web-1, data-web-2 — one PVC per pod.

Step 6: Verify Stable DNS

kubectl run -it --rm debug --image=busybox:1.36 --restart=Never -- nslookup web-0.web

Expected: web-0.web resolves to the pod's IP. Try web-1.web and web-2.web too.

Step 7: Write Unique Data to Each Pod

kubectl exec -it web-1 -- /bin/bash -c \
  'echo "Hello from web-1" > /usr/share/nginx/html/index.html'
kubectl exec -it web-2 -- /bin/bash -c \
  'echo "Hello from web-2" > /usr/share/nginx/html/index.html'

Step 8: Test Ordered Scale-Down (web-2, then web-1)

kubectl scale statefulset web --replicas=1
kubectl get pods -w

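Expected: web-2 terminates first, then web-1. Only web-0 remains. The PVCs for web-1 and web-2 survive: by default a StatefulSet retains its PVCs on scale-down and on deletion (the persistentVolumeClaimRetentionPolicy field can change this).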
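What remains after a scale-down depends on the StatefulSet's PVC retention setting. The following Python model is a sketch of the two behaviors, assuming the whenScaled values Retain (the default) and Delete from the persistentVolumeClaimRetentionPolicy field. The function remaining_pvcs is hypothetical, and whether the Delete behavior is available depends on your Kubernetes version.

```python
def remaining_pvcs(replicas_before: int, replicas_after: int,
                   when_scaled: str = "Retain",
                   template: str = "data", statefulset: str = "web") -> list[str]:
    """Return the PVC names expected to exist after a scale-down."""
    if when_scaled not in ("Retain", "Delete"):
        raise ValueError("when_scaled must be 'Retain' or 'Delete'")
    # Retain keeps every claim ever created; Delete keeps only claims
    # whose ordinal is still below the new replica count.
    keep = replicas_before if when_scaled == "Retain" else replicas_after
    return [f"{template}-{statefulset}-{i}" for i in range(keep)]


print(remaining_pvcs(3, 1))            # ['data-web-0', 'data-web-1', 'data-web-2']
print(remaining_pvcs(3, 1, "Delete"))  # ['data-web-0']
```

With the default, running kubectl get pvc after this step should still list all three claims, which is why the cleanup step deletes them explicitly.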

Step 9: Prove Data Persistence

kubectl delete pod web-0
kubectl get pods -w

A new web-0 appears with the same name. Verify data survived:

kubectl exec -it web-0 -- cat /usr/share/nginx/html/index.html

Expected: Hello from web-0 — PVC data-web-0 reattached to the replacement pod.

Step 10: Cleanup

kubectl delete statefulset web
kubectl delete service web
kubectl delete pvc data-web-0 data-web-1 data-web-2

Chapter Summary

StatefulSets solve a problem Deployments were never designed to address: identity. When your application needs to know who it is, reach peers by name, and keep data across restarts, StatefulSets provide stable network identity, dedicated persistent storage, and ordered lifecycle management. The hotel analogy captures it: numbered rooms, personal closets, and an internal phone system. On GKE, pair StatefulSets with Hyperdisk for performance, Filestore for shared storage, and Backup for GKE for application-aware disaster recovery.

📇 KEY CONCEPT CARDS

  1. Q: What are the four guarantees a StatefulSet provides that a Deployment does not?
    A: (1) Stable network identity with predictable pod names using ordinal indices, (2) Stable persistent storage with one dedicated PVC per pod via volumeClaimTemplates, (3) Ordered deployment and scaling where pods start and terminate sequentially, (4) Individual pod DNS resolution through a required headless Service.
  2. Q: Why does a StatefulSet require a headless Service instead of a normal ClusterIP Service?
    A: A normal ClusterIP returns a single virtual IP load-balanced across pods. A headless Service (clusterIP: None) returns DNS A-records for each individual pod, enabling direct pod-to-pod communication by stable hostname like web-1.web.default.svc.cluster.local.
  3. Q: When scaling a StatefulSet named db from 3 to 5 replicas, what pod names are created and in what order?
    A: db-3 is created first and starts only after db-2 is Ready; db-4 starts only after db-3 is Ready. Names use the StatefulSet name plus the ordinal index and are never random.
  4. Q: What happens to a StatefulSet pod's PVC when the pod is deleted? What about when the StatefulSet is scaled down?
    A: When the pod is deleted, the PVC persists and reattaches to the replacement pod with the same ordinal name. When the StatefulSet is scaled down, PVCs are NOT deleted by default. They remain to protect data and must be cleaned up manually, unless persistentVolumeClaimRetentionPolicy is set to delete them.