Articulet System Design, Made Clear Chapter 10 · Scaling Reads
Part 3 · Designing at Scale Chapter 10

Scaling reads.

How to serve more readers without making one system do all the work.

Learning objective
Identify the actual read bottleneck and choose the right fix, such as caching, replicas, precomputed views, or edge delivery, using language tied directly to the access pattern.
Before you read

Make a prediction first.

Predict

Answer before the explanation.

If many users read the same product data, what are three ways to avoid hitting one database repeatedly?

Commit

Write a rough answer.

Before reading, choose whether this is a cache problem, replica problem, CDN problem, or precompute problem.

Connect

Notice where it returns.

Read scaling returns in feeds, catalogs, URL redirects, profile pages, and media playback.

Concrete first

Read-heavy systems usually break before write-heavy ones do.

Many products are opened far more often than they are changed. One profile gets edited occasionally but viewed constantly. One short URL is created once and clicked thousands of times.

When reads dominate, the problem is not always "the database is slow." Sometimes the same object is fetched repeatedly. Sometimes the query itself is expensive. Sometimes the real pain is assembling the response. Sometimes the content is just too far from the user. The fix should come from the type of slowness.

Mental model

More readers means more copies, simpler paths, or closer answers.

When reads grow, you usually make more copies, make the path simpler, or move the answer closer to the user.
  • More copies: read replicas, for more capacity.
  • Simpler paths: cache or precompute, for less work per read.
  • Closer answers: CDN edge delivery, for lower latency.
This is the whole framework. Before naming a tool, ask: do I need more copies, a simpler path, or a closer answer?
First principles

Four common reasons the read path becomes slow.

Diagnostic map

Name why reads are slow before naming how to fix them.

What is actually slow?

  • Repeated lookup of the same hot objects → Cache hot data (same answer, many times)
  • Too many reads still need the database → Read replicas (more DB read capacity)
  • Building the response is expensive → Precomputed view (simpler response assembly)
  • Static/blob content is too far from users → CDN / edge cache (closer blob delivery)

The decision rule: Do not start with the tool. Start with the read bottleneck.
Same symptom category, different fix. The chapter is about matching the move to the bottleneck.
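
The diagnostic map can be written down as a lookup table. This is a minimal sketch, not a tool: the `Bottleneck` enum and `first_fix` function are hypothetical names invented here, and the mapping is just the four pairs from the map above.

```python
from enum import Enum


class Bottleneck(Enum):
    """The four read bottlenecks named in the diagnostic map."""
    REPEATED_HOT_LOOKUPS = "repeated lookup of the same hot objects"
    DATABASE_READ_PRESSURE = "too many reads still need the database"
    EXPENSIVE_ASSEMBLY = "building the response is expensive"
    DISTANT_STATIC_CONTENT = "static or blob content is too far from users"


# Bottleneck first, tool second: the key is the diagnosis, the value is the move.
FIRST_FIX = {
    Bottleneck.REPEATED_HOT_LOOKUPS: "cache hot data",
    Bottleneck.DATABASE_READ_PRESSURE: "read replicas",
    Bottleneck.EXPENSIVE_ASSEMBLY: "precomputed view",
    Bottleneck.DISTANT_STATIC_CONTENT: "CDN / edge cache",
}


def first_fix(bottleneck: Bottleneck) -> str:
    """Return the first read-scaling move for a diagnosed bottleneck."""
    return FIRST_FIX[bottleneck]
```

The point of the shape is that there is no way to call `first_fix` without naming a bottleneck first.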
Why it matters in interviews

Bottleneck-driven answers sound like engineering. Tool-driven answers sound like guessing.

Weak
We can scale reads with Redis.
Strong
If the issue is repeated lookup of the same hot objects, caching is the first fix. If the issue is too many reads that still have to hit the database, I would add read replicas and keep strictly fresh reads on the primary, since replicas can lag. If building the response is the expensive part, I would precompute the read model. If the content is static or blob-heavy and globally accessed, I would use a CDN.

The strong answer chooses the technique from the shape of the problem, not from habit.

Key ideas

Seven anchors.

Speaking script

Lines for the read-scaling conversation.

Opening
I want to scale reads based on where the read path is actually getting expensive.
Sketching
If many users ask the same question repeatedly, caching is my first move.
Deep dive
If reads still need the database but the primary is overloaded, I would add read replicas. If the expensive part is building the response, I would precompute the read model instead of reconstructing it on every request.
Trade-off
The gain is better latency and more read capacity. The cost is usually more staleness, more complexity, or both.
Extending
If content is static or blob-heavy and globally accessed, I would push it closer to users with a CDN.
Defending
I would rather add the smallest justified read-scaling move first than throw cache, replicas, and CDN at the system all at once.
Common mistakes

How candidates flatten very different read problems into one answer.

Misconception check

Correct the wrong model before it sticks.

Wrong intuition

What feels tempting

Scaling reads means adding Redis.

Better model

What to replace it with

Read scaling is about serving copies safely: caches, replicas, CDNs, indexes, and precomputed views all help different read patterns.

Interview move

What to do in the room

Name the read path, freshness tolerance, and source of truth before picking the read-scaling tool.

Trade-offs

Five read-scaling choices.

Keep reads on one primary store
  • Good when: Traffic is still small and the read path is simple.
  • Weak when: The primary becomes the read bottleneck or latency is too high.
  • Interview line: I would keep the first version simple until reads clearly pressure the primary path.

Cache hot data (default)
  • Good when: The same objects are requested repeatedly and slight staleness is acceptable.
  • Weak when: Reads are highly dynamic or invalidation is harder than the saved latency is worth.
  • Interview line: Caching helps if the same hot objects are being fetched over and over.

Read replicas
  • Good when: Many reads still need database queries and the primary should stop serving all of them.
  • Weak when: The main pain is stale-sensitive reads, expensive joins, or response assembly rather than raw read volume.
  • Interview line: Read replicas help when I need more database read capacity without sending every read to the primary.

Precomputed read view
  • Good when: Building the response is expensive, such as feeds, rankings, or aggregates.
  • Weak when: The read is simple enough that precomputation adds unnecessary complexity.
  • Interview line: If the cost is assembling the view, I would precompute that view instead of rebuilding it on every request.

CDN or edge cache
  • Good when: Static or blob content is globally requested and repeat reads are common.
  • Weak when: Content is highly private, rarely read, or must always be fetched from origin.
  • Interview line: A CDN helps when the read problem is global content delivery, not just database pressure.
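
The read-replica choice can be sketched as a routing decision. This is an illustrative sketch under one assumption: replicas may lag the primary, as is common with asynchronous replication, so stale-sensitive reads stay on the primary. `ReadRouter`, `pick`, and the node names are hypothetical.

```python
import itertools
from typing import List


class ReadRouter:
    """Send stale-tolerant reads to replicas and fresh reads to the primary."""

    def __init__(self, primary: str, replicas: List[str]):
        self.primary = primary
        # Round-robin over replicas; None means there are no replicas yet.
        self._replicas = itertools.cycle(replicas) if replicas else None

    def pick(self, needs_fresh: bool) -> str:
        """Return the name of the node that should serve this read."""
        if needs_fresh or self._replicas is None:
            return self.primary
        return next(self._replicas)


router = ReadRouter("primary", ["replica-1", "replica-2"])
print(router.pick(needs_fresh=False))  # a replica
print(router.pick(needs_fresh=True))   # the primary
```

The router adds capacity for stale-tolerant reads only. It does nothing for expensive joins or response assembly, which is why the table lists those under "Weak when".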
Mini case study

News feed — not one read problem, but three.

Simple read path

  • Read recent posts from storage.
  • Fine when the system is still small.
  • No extra machinery yet.

Repeated hot reads

  • Users reopen the same feed often.
  • Cache feed fragments or session reads.
  • Good when slight staleness is acceptable.
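
A rough sketch of the repeated-hot-reads move, assuming a cache-aside pattern with a time to live. `TTLCache` and `get_or_load` are hypothetical names, the store is an in-process dictionary standing in for a real cache, and the TTL is the explicit bound on how stale a cached feed can be.

```python
import time
from typing import Callable, Dict, Tuple


class TTLCache:
    """Tiny in-process cache-aside helper with a per-entry time to live."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock  # injectable so the behaviour can be tested
        self._entries: Dict[str, Tuple[float, object]] = {}

    def get_or_load(self, key: str, load: Callable[[str], object]) -> object:
        """Return the cached value, or call load(key) on a miss or expiry."""
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # hit: entry has not expired yet
        value = load(key)  # miss or expired: go back to the source of truth
        self._entries[key] = (now + self.ttl_seconds, value)
        return value
```

Within the TTL, a user who reopens the same feed is served from memory and the loader is not called again. After the TTL, the next read pays the full cost once and refreshes the entry.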

Expensive assembly

  • The feed is expensive to build live.
  • Precompute or partially materialize it.
  • That attacks computation, not just storage.
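
One way to precompute a feed is fan-out on write: do the assembly when a post is published, so the read is a plain lookup. The sketch below is a toy in-memory version under that assumption. `PrecomputedFeeds` and its methods are hypothetical, and a real system would also need ranking, persistence, and a plan for authors with very large follower counts.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, List, Set


class PrecomputedFeeds:
    """Keep a ready-to-serve list of post ids per user."""

    def __init__(self, max_items: int = 100):
        self.followers: Dict[str, Set[str]] = defaultdict(set)
        # Each feed is bounded; the oldest post falls off when it is full.
        self.feeds: Dict[str, Deque[str]] = defaultdict(lambda: deque(maxlen=max_items))

    def follow(self, follower: str, author: str) -> None:
        self.followers[author].add(follower)

    def publish(self, author: str, post_id: str) -> None:
        # The write path pays the assembly cost once per follower.
        for follower in self.followers[author]:
            self.feeds[follower].appendleft(post_id)

    def read_feed(self, user: str, limit: int = 20) -> List[str]:
        # The read path is a lookup: newest first, no joins at request time.
        return list(self.feeds[user])[:limit]
```

The trade-off moves rather than disappears: reads get cheaper, writes get more expensive, and the stored feeds are one more copy to keep correct.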

Media delivery

  • Images and video should not always come from origin.
  • Serve blobs through a CDN.
  • That solves distance, not ranking logic.
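
Whether an edge cache may keep a copy is usually signalled with the HTTP `Cache-Control` response header. The sketch below picks a header value for three cases. The function name, the two flags, and the specific `max-age` values are illustrative assumptions, not recommendations; the directives themselves (`public`, `private`, `no-store`, `max-age`, `immutable`) are standard.

```python
def cache_control_for(is_public: bool, is_versioned_asset: bool) -> str:
    """Choose a Cache-Control value for a response (illustrative policy)."""
    if not is_public:
        # Private content: shared caches such as a CDN must not store it.
        return "private, no-store"
    if is_versioned_asset:
        # Content-addressed or versioned blobs never change under the same URL,
        # so the edge can keep them for a long time (one year here).
        return "public, max-age=31536000, immutable"
    # Public but changeable content: allow a short shared-cache lifetime.
    return "public, max-age=60"


print(cache_control_for(is_public=True, is_versioned_asset=True))
```

This only addresses distance for blobs. The ranking logic behind the feed still runs wherever it ran before.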
Worked example to solo answer

Fade the support before the real practice.

Do not jump straight from reading to a full answer. First see the shape, then complete part of it, then answer alone.

I do

Study the model move.

I would cache product details that change rarely, but keep inventory or price freshness rules explicit.
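
"Explicit freshness rules" can be as simple as a table that says how stale each field is allowed to be. The sketch below is one hypothetical way to write that down: the field names and TTL values are made up for illustration, and a TTL of 0 is taken to mean "always read from the source of truth".

```python
from typing import Dict, List, Tuple

# Hypothetical per-field staleness budget in seconds.
# 0 means the field must be read from the source of truth every time.
FRESHNESS_RULES: Dict[str, int] = {
    "description": 3600,
    "images": 3600,
    "price": 0,
    "inventory": 0,
}


def split_by_freshness(fields: List[str]) -> Tuple[List[str], List[str]]:
    """Split requested fields into (cacheable, must_be_fresh).

    Fields with no rule default to must_be_fresh, which is the safe side.
    """
    cacheable = [f for f in fields if FRESHNESS_RULES.get(f, 0) > 0]
    must_be_fresh = [f for f in fields if FRESHNESS_RULES.get(f, 0) == 0]
    return cacheable, must_be_fresh


print(split_by_freshness(["description", "price", "inventory"]))
```

Writing the rule per field makes the trade-off visible: anything that affects money or trust has to be named before it is allowed into the cache.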

We do

Complete the missing piece.

For the catalog prompt, mark which reads can be stale and which ones affect money or trust.

You do

Answer without notes.

Design the read path with one copy layer and one fallback path.

Practice

Try it before you read the model answer.

Prompt
Design a public product catalog with millions of views and relatively few updates.
  • What is the main read bottleneck likely to be?
  • Would you use cache, replicas, CDN, or precomputed views?
  • What trade-off are you accepting?
Strong model answer
I would expect the system to be read-heavy, with many users viewing the same product pages far more often than products are updated. My first move would be caching product page data because repeated reads are likely the main pattern. If the application still needs many database reads, I would add read replicas to take pressure off the primary. If product images are a major part of the traffic, I would serve them through a CDN. The main trade-off is better read latency and capacity in exchange for more staleness management and more moving parts.
Training loop

Make this chapter stick.

Before moving on, turn recognition into production. Close the model answer, answer from memory, then retry one small slice.

Recall

Say the chapter's core idea without looking. Then name one related idea from an earlier chapter.

Vary

Change one constraint in the practice prompt and answer again in half the time.

Score

Use the rubric to pick one dimension below 3, then retry only that dimension.

Memory hook
More readers means more copies, simpler paths, or closer answers.
Recap

Three things to take into the room.

1. Name why reads are slow.

Repeated lookup, DB pressure, expensive assembly, or distant content.

2. Pick the smallest justified fix.

Cache, replicas, precompute, or CDN.

3. Do not confuse capacity with computation.

Replicas add read capacity. They do not make bad queries cheap.
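
A back-of-the-envelope sketch of that last point, using made-up numbers and an idealized model: load is assumed to spread evenly, and queueing effects are ignored. `replica_load` is a hypothetical helper.

```python
from typing import Dict


def replica_load(total_read_qps: float, replica_count: int, query_ms: float) -> Dict[str, float]:
    """Idealized view of what adding replicas changes and what it does not."""
    return {
        # Volume per node falls as replicas are added.
        "qps_per_replica": total_read_qps / replica_count,
        # The cost of one query is untouched by the number of replicas.
        "latency_ms_per_query": query_ms,
    }


print(replica_load(total_read_qps=9000, replica_count=1, query_ms=800))
print(replica_load(total_read_qps=9000, replica_count=3, query_ms=800))
```

Going from one replica to three cuts the per-node volume to a third, but each slow query is still slow. That part needs a better query, an index, or a precomputed view.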

Reusable interview line
"I would first name why the read path is slow, then choose the smallest justified move: cache for repeated hot reads, replicas for database read pressure, precomputed views for expensive assembly, and CDN for global blob delivery."