Articulet System Design, Made Clear Chapter 11 · Scaling Writes
Part 3 · Designing at Scale Chapter 11

Scaling writes.

How to accept more writes without losing control of correctness.

Learning objective
Identify the real write bottleneck and choose the right move, such as queueing, batching, partitioning, or safer ID generation, without losing sight of contention and correctness.
Before you read

Make a prediction first.

Predict

Answer before the explanation.

What breaks first when writes grow: the database CPU, hot keys, ordering, durability, or downstream consumers?

Commit

Write a rough answer.

Before reading, name the exact write being accepted and the promise made after acceptance.

Connect

Notice where it returns.

Write scaling returns in analytics, messages, notifications, feed fan-out, and video processing.

Concrete first

Writes are harder because they change shared truth.

A read asks for a copy of the truth. A write tries to change the truth. That difference is why write scaling usually hurts earlier and more sharply than people expect.

When too many writes hit the same path, the problem is usually not just volume. The real issue is contention, coordination, burstiness, or a hidden central bottleneck like one hot counter or one overloaded partition.

Mental model

Too many writers, one pen.

Write scaling is usually about reducing the fight over one narrow path.
[Figure: four writers (W) contend for one pen, the hot path or hotspot. Four moves relieve it: queue, batch, partition, ID strategy. Reduce contention before adding complexity.]
The first question is not "how do I shard it?" The first question is "what are the writers fighting over?"
First principles

Four common reasons writes back up.

Diagnostic map

Why are writes backing up?

  • Traffic is bursty: queue bursts (steady downstream writes).
  • Many small writes create high overhead: batch writes (fewer storage operations).
  • One partition or key is too hot: repartition (spread ownership better).
  • One central allocator controls every write: distributed IDs (avoid one hot counter).

The decision rule: reduce contention first. Then add structure.
Queueing, batching, repartitioning, and safer ID generation solve different write problems. The chapter is about telling them apart.
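
The diagnostic map can also be read as a lookup from symptom to first move. The sketch below is only an illustration: the `Symptom` names and the `first_moves` helper are hypothetical labels invented here, not part of any library.

```python
from enum import Enum


class Symptom(Enum):
    """Hypothetical labels for the four causes in the diagnostic map."""
    BURSTY_TRAFFIC = "traffic is bursty"
    MANY_SMALL_WRITES = "many small writes create high overhead"
    HOT_PARTITION = "one partition or key is too hot"
    CENTRAL_ALLOCATOR = "one central allocator controls every write"


# Each symptom maps to the smallest move that targets it.
FIRST_MOVE = {
    Symptom.BURSTY_TRAFFIC: "queue bursts",
    Symptom.MANY_SMALL_WRITES: "batch writes",
    Symptom.HOT_PARTITION: "repartition",
    Symptom.CENTRAL_ALLOCATOR: "distributed IDs",
}


def first_moves(observed):
    """Return one move per observed symptom, in the order observed."""
    return [FIRST_MOVE[symptom] for symptom in observed]


print(first_moves([Symptom.BURSTY_TRAFFIC, Symptom.HOT_PARTITION]))
```

The point of the lookup is that the diagnosis comes first. A system can show more than one symptom, in which case each one gets its own targeted move.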
Why it matters in interviews

Write-scaling answers should sound contention-aware, not slogan-driven.

Weak
We can shard the database.
Strong
If writes are spiky, I would first buffer them with a queue. If one partition is getting too many writes, I would spread ownership across partitions. If every server is generating IDs, I need a scheme that guarantees uniqueness without a single bottleneck.

The stronger answer connects the kind of pressure to the smallest justified fix.

Key ideas

Eight anchors.

Speaking script

Lines for the write-scaling conversation.

Opening
I want to scale writes by identifying where contention is happening, not by assuming every system needs early sharding.
Sketching
If writes arrive in bursts, my first move may be to buffer them with a queue so the storage layer can absorb them steadily.
Deep dive
If many small writes can be combined safely, batching reduces overhead. If one partition is overloaded, I need to spread writes by a better partition key.
Trade-off
The gain is higher write capacity. The cost is usually more complexity, more eventual consistency, or harder debugging.
Extending
If every server must create unique IDs, I need a scheme that avoids collisions without turning one allocator into the bottleneck.
Defending
I would rather fix the specific contention point first than jump directly to full sharding.
Common mistakes

How candidates make write scaling sound easier than it is.

Misconception check

Correct the wrong model before it sticks.

Wrong intuition

What feels tempting

Scaling writes mainly means sharding the database.

Better model

What to replace it with

Write scaling means accepting writes safely, spreading pressure across partitions, buffering bursts, and preserving the correctness promise.

Interview move

What to do in the room

Separate acceptance from processing, then decide what must be durable before returning success.

Trade-offs

Five write-scaling choices.

Keep a simple single write path
  • Good when: Write volume is still manageable and correctness is easier with one primary path.
  • Weak when: The write path is saturated or one node is the obvious bottleneck.
  • Interview line: I would keep the write path simple until contention or saturation actually shows up.

Queue and absorb bursts (default)
  • Good when: Writes arrive unevenly and some completion can happen shortly after the request.
  • Weak when: The user must know the final durable result immediately for every step.
  • Interview line: A queue helps if the main problem is bursty write load rather than strict immediate completion.

Batch writes
  • Good when: Many small writes can be grouped without harming the product requirement.
  • Weak when: Each write must be visible or durable immediately and independently.
  • Interview line: Batching helps when the overhead per write is the problem and slight delay is acceptable.

Partition or shard writes
  • Good when: Writes can be spread naturally by key, tenant, region, or ownership.
  • Weak when: The partition key creates hotspots or queries become much harder than before.
  • Interview line: Partitioning helps only if the chosen key actually spreads write ownership well.

Coordinated or distributed ID generation
  • Good when: Many servers create records concurrently and uniqueness must hold.
  • Weak when: One central allocator becomes the write bottleneck or collision handling is weak.
  • Interview line: I need an ID strategy that guarantees uniqueness without forcing every write through one hot counter.
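
The queueing and batching choices above can be sketched together. This is a minimal illustration under stated assumptions: `FakeStore` and `drain_in_batches` are hypothetical names, the buffer is an in-memory `deque` (a real queue would need to be durable), and the batch size of 100 is arbitrary.

```python
from collections import deque


class FakeStore:
    """Stand-in for a storage layer; counts how many write calls it receives."""

    def __init__(self):
        self.rows = []
        self.write_calls = 0

    def write_many(self, batch):
        # One call represents one storage operation, however many rows it holds.
        self.write_calls += 1
        self.rows.extend(batch)


def drain_in_batches(buffer, store, batch_size):
    """Move everything from the buffer into the store, up to batch_size rows per call."""
    while buffer:
        take = min(batch_size, len(buffer))
        batch = [buffer.popleft() for _ in range(take)]
        store.write_many(batch)


# A burst of 1,000 small writes is accepted into the buffer immediately.
buffer = deque(f"event-{i}" for i in range(1000))
store = FakeStore()

# A worker then drains the buffer at its own pace, 100 rows per storage call.
drain_in_batches(buffer, store, batch_size=100)
print(store.write_calls, len(store.rows))  # 10 1000
```

The buffer absorbs the burst and the batching cuts 1,000 storage operations down to 10. The cost is the one named above: a row is not visible in the store until its batch is written.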
Mini case study

URL shortener — the write path is small, but uniqueness is not free.

Core write

  • Client submits a long URL.
  • Service creates a short code.
  • Service stores the mapping.

Where it gets hard

  • Multiple servers create codes concurrently.
  • Collisions must not happen.
  • One global allocator can become hot.

Clean progression

  • Small system: one generator is fine.
  • Larger system: give writers safe ID ranges or machine-aware IDs.
  • Hot tenants: partition ownership better.

Separate side writes

  • Analytics writes can often be queued or batched.
  • The mapping write needs strict uniqueness immediately.
  • Do not mix those two lanes casually.
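
One way to picture "safe ID ranges" is a central allocator that hands out whole blocks of IDs, so each writer contacts it only once per block. The sketch below is an assumption-laden illustration, not the chapter's prescribed design: `RangeAllocator`, `Writer`, the block size of 1,000, and the base-62 encoding are all hypothetical choices made for the example.

```python
import string
import threading

ALPHABET = string.digits + string.ascii_letters  # 62 symbols


def to_base62(n):
    """Encode a non-negative integer as a base-62 short code."""
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n > 0:
        n, rem = divmod(n, 62)
        chars.append(ALPHABET[rem])
    return "".join(reversed(chars))


class RangeAllocator:
    """Central allocator that hands out whole blocks, not single IDs."""

    def __init__(self, block_size):
        self.block_size = block_size
        self.next_start = 0
        self.calls = 0
        self._lock = threading.Lock()

    def next_block(self):
        # The lock is held only long enough to reserve a block.
        with self._lock:
            start = self.next_start
            self.next_start += self.block_size
            self.calls += 1
        return start, start + self.block_size


class Writer:
    """One server: draws IDs from its local block and refills only when it runs out."""

    def __init__(self, allocator):
        self.allocator = allocator
        self.current, self.end = allocator.next_block()

    def new_code(self):
        if self.current == self.end:
            self.current, self.end = self.allocator.next_block()
        code = to_base62(self.current)
        self.current += 1
        return code


allocator = RangeAllocator(block_size=1000)
writers = [Writer(allocator) for _ in range(3)]
codes = [w.new_code() for w in writers for _ in range(2500)]
print(len(set(codes)), allocator.calls)  # 7500 9
```

Three writers produce 7,500 unique codes with only 9 trips to the allocator instead of 7,500. The trade-offs in this sketch are that a writer that crashes loses the unused part of its block, and that sequential codes are easy to guess.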
Worked example to solo answer

Fade the support before the real practice.

Do not jump straight from reading to a full answer. First see the shape, then complete part of it, then answer alone.

I do

Study the model move.

I would accept click events quickly into a durable log, then process aggregates asynchronously.
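
The model move separates acceptance from processing. A minimal sketch, assuming a local append-only file stands in for the durable log; the function names `accept_click` and `aggregate_clicks` and the `ad_id` field are hypothetical.

```python
import json
import os
import tempfile
from collections import Counter


def accept_click(log_file, event):
    """Append one event to the log and force it to disk before acknowledging."""
    log_file.write(json.dumps(event) + "\n")
    log_file.flush()
    os.fsync(log_file.fileno())
    return "accepted"


def aggregate_clicks(path):
    """Later, and separately from acceptance, count clicks per ad from the log."""
    counts = Counter()
    with open(path, encoding="utf-8") as f:
        for line in f:
            counts[json.loads(line)["ad_id"]] += 1
    return counts


log_path = os.path.join(tempfile.mkdtemp(), "clicks.log")

# Acceptance lane: the only promise made here is "your click is recorded".
with open(log_path, "a", encoding="utf-8") as log_file:
    for ad_id in ["ad-1", "ad-2", "ad-1"]:
        accept_click(log_file, {"ad_id": ad_id})

# Processing lane: aggregates are computed afterwards from the log.
totals = aggregate_clicks(log_path)
print(totals["ad-1"], totals["ad-2"])  # 2 1
```

Calling `os.fsync` on every event keeps the acceptance promise simple but is expensive. A real log would likely sync groups of events together, which is the batching trade-off again.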

We do

Complete the missing piece.

For ad clicks, identify the hot partition risk and one way to spread writes.

You do

Answer without notes.

Answer the practice prompt with acceptance, buffering, processing, and failure handling as separate steps.

Practice

Try it before you read the model answer.

Prompt
Design a system that records large volumes of click events from ads.
  • What is the likely write bottleneck?
  • Would you batch, queue, partition, or change ID generation?
  • What trade-off are you accepting?
Strong model answer
I would expect bursty high-volume writes, so my first concern would be protecting the storage layer from direct spikes. I would likely use a queue to absorb bursts and workers to write steadily downstream. If many small events are being persisted, batching would reduce per-write overhead. I would partition by a key that spreads traffic well, such as time plus source or tenant, so one hot partition does not take all writes. The main trade-off is better write throughput and resilience versus more delayed visibility and more operational complexity.
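
The partition-key choice in the model answer can be checked with a small experiment. This sketch is illustrative only: the event fields (`tenant`, `source`, `ts`), the count of 8 partitions, and the helper names are assumptions, and SHA-256 is used simply as a stable hash.

```python
import hashlib


def partition_for(key, num_partitions):
    """Map a string key to a partition with a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def key_by_tenant(event):
    """Naive key: every event from one tenant lands on the same partition."""
    return event["tenant"]


def key_by_tenant_source_minute(event):
    """Composite key: tenant plus source plus a one-minute time bucket."""
    minute = event["ts"] // 60
    return f'{event["tenant"]}:{event["source"]}:{minute}'


# One hot tenant sending clicks from 50 sources at the same moment.
events = [
    {"tenant": "big-advertiser", "source": f"site-{i}", "ts": 1_700_000_000}
    for i in range(50)
]

NUM_PARTITIONS = 8
tenant_only = {partition_for(key_by_tenant(e), NUM_PARTITIONS) for e in events}
composite = {
    partition_for(key_by_tenant_source_minute(e), NUM_PARTITIONS) for e in events
}

print("partitions used, tenant key:", len(tenant_only))
print("partitions used, composite key:", len(composite))
```

With the tenant-only key, the hot tenant's writes all hit one partition. The composite key spreads them, but reading one tenant's clicks now means querying several partitions.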
Training loop

Make this chapter stick.

Before moving on, turn recognition into production. Close the model answer, answer from memory, then retry one small slice.

Recall

Say the chapter's core idea without looking. Then name one related idea from an earlier chapter.

Vary

Change one constraint in the practice prompt and answer again in half the time.

Score

Use the rubric to pick one dimension below 3, then retry only that dimension.

Memory hook
Too many writers, one pen.
Recap

Three things to take into the room.

1

Name what writers are fighting over.

A hot key, a hot partition, a bursty path, or a central allocator.

2

Reduce contention before redesigning everything.

Queue, batch, repartition, or fix ID generation.

3

Separate critical writes from side-effect writes.

They rarely deserve the same latency and correctness contract.

Reusable interview line
"I would first name where the write contention lives, then choose the smallest justified move: queue bursts, batch tiny writes, repartition hotspots, or change the ID strategy so one allocator does not control every write."