Articulet · System Design, Made Clear
Part 2 · Core Building Blocks · Chapter 8

Queues, streams, and async processing.

Why some work should leave the request path and happen later.

Learning objective
Explain clearly when work should stay synchronous, when it should move off the critical path, when a queue is enough, and when an event stream is the better fit.
Before you read

Make a prediction first.

Predict

Answer before the explanation.

Which part of checkout must finish before the user gets a response, and which part can happen later?

Commit

Write a rough answer.

Before reading, list one task that belongs on the request path and one task that should leave it.

Connect

Notice where it returns.

Async thinking returns in video processing, notifications, analytics, feeds, and write-heavy systems.

Concrete first

Not every piece of work should happen while the user is waiting.

If one user action also sends email, generates thumbnails, writes analytics, triggers notifications, and calls three slow services, the clean move is usually not "do it all now."

The more side work you keep inline with the request, the more fragile the user-facing path becomes. A good design usually finishes the essential user action first, then hands the rest to background systems that can retry, buffer, and scale separately.
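The fragility claim can be made concrete with a little arithmetic. The sketch below uses made-up numbers: the latencies, the 99% per-step success rate, and the 5 ms enqueue cost are all assumptions chosen for illustration, and the steps are assumed to run one after another and fail independently.

```python
# Hypothetical numbers, chosen only to illustrate the arithmetic.
ESSENTIAL_WRITE_MS = 40
SIDE_WORK_MS = {
    "send_email": 300,
    "generate_thumbnails": 800,
    "write_analytics": 50,
    "trigger_notifications": 120,
}
STEP_SUCCESS_RATE = 0.99  # assumed per-step success rate, independent failures


def inline_latency_ms() -> int:
    """Latency when every step runs sequentially inside the request."""
    return ESSENTIAL_WRITE_MS + sum(SIDE_WORK_MS.values())


def inline_success_rate() -> float:
    """Chance the request succeeds if any single failing step fails it."""
    steps = 1 + len(SIDE_WORK_MS)  # essential write plus each side task
    return STEP_SUCCESS_RATE ** steps


def handoff_latency_ms(enqueue_ms: int = 5) -> int:
    """Latency when side work is handed off; only the enqueue stays inline."""
    return ESSENTIAL_WRITE_MS + enqueue_ms


print(inline_latency_ms())              # 1310
print(round(inline_success_rate(), 3))  # 0.951
print(handoff_latency_ms())             # 45
```

Under these assumptions, keeping everything inline makes the user wait about 1.3 seconds and fails roughly one request in twenty. Handing the side work off leaves the user waiting only for the essential write and the enqueue.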

Mental model

Front desk first. Back room later.

Do what the user is waiting for at the front desk. Hand the rest to the back room.
[Diagram: Front desk (critical path): accept the request, save the essential record, and reply; the user waits here. Hand off to the back room (background path): queue, then workers, then email, thumbnails, and analytics; the rest can finish later.]
The chapter is really about protecting the front desk. If the user is not waiting for it, it probably should not block the request.
First principles

Why async exists.

Core diagram

One split architecture: critical path on top, background path below.

[Diagram: Critical path: user, app / API, primary DB, response. After the essential write, the app emits async work to the background path: a queue or stream (durable handoff or event publish) feeding Worker A, Worker B, and Consumer C, which handle email, analytics, and notifications.]
The main design question is simple: what absolutely must finish before the response, and what can be handed off safely afterward?
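A minimal sketch of that split, using only Python's standard library. Everything named here is hypothetical: `db` stands in for the primary DB, `handle_request` for the API handler, and the in-memory `queue.Queue` for a real broker. An in-memory queue is not durable, so treat this as the shape of the handoff, not a production design.

```python
import queue
import threading

db: dict = {}                        # stand-in for the primary DB
tasks: queue.Queue = queue.Queue()   # stand-in for a durable queue
completed: list = []                 # records what the worker finished


def handle_request(record_id: str) -> dict:
    """Critical path: essential write, hand off side work, reply."""
    db[record_id] = {"status": "accepted"}
    for task_name in ("send_email", "write_analytics"):
        tasks.put((task_name, record_id))  # cheap handoff, no slow work here
    return {"id": record_id, "status": "accepted"}


def worker() -> None:
    """Background path: drain tasks until a None sentinel arrives."""
    while True:
        item = tasks.get()
        if item is None:
            tasks.task_done()
            break
        completed.append(item)  # a real worker would do the slow work here
        tasks.task_done()


def run_demo() -> dict:
    """Start one worker, serve one request, then shut the worker down."""
    thread = threading.Thread(target=worker)
    thread.start()
    response = handle_request("order-1")
    tasks.put(None)  # sentinel: tell the worker to stop
    thread.join()
    return response
```

The response is built before any side task runs. The worker can be slow, restarted, or scaled out without changing what the user waited for.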
Why it matters in interviews

Interviewers want to know whether you can separate essential work from side effects.

Weak
We can use Kafka so everything is scalable.
Strong
I want to keep the user-facing write short, so I'd move email sending, analytics, and other non-urgent follow-up work into background processing. If this is mainly task handoff, a queue is enough. If several downstream systems must react independently, an event stream is more useful.

The strong answer starts with the product and the critical path, then chooses the delivery model.

Key ideas

Seven anchors.

Queue vs stream

Same family. Different purpose.

[Diagram: Queue (one task, later worker execution): app, queue (task buffer), worker. Stream (one event, many independent reactions): app, stream (event log), then analytics, notifications, and recommendations.]
Queue: "do this task later." Stream: "this event happened; whoever cares may react."
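The difference is easiest to see in how reads behave. The toy classes below are illustrative only; `TaskQueue` and `EventStream` are hypothetical names, not any broker's API. Taking a task from the queue removes it, while reading from the stream only advances that one consumer's position in a shared log.

```python
from collections import deque


class TaskQueue:
    """Each task is handed to exactly one worker and then removed."""

    def __init__(self) -> None:
        self._tasks: deque = deque()

    def put(self, task: str) -> None:
        self._tasks.append(task)

    def take(self):
        """Return the oldest task, or None if the queue is empty."""
        return self._tasks.popleft() if self._tasks else None


class EventStream:
    """Append-only log; each consumer tracks its own read position."""

    def __init__(self) -> None:
        self._log: list = []
        self._offsets: dict = {}

    def publish(self, event: str) -> None:
        self._log.append(event)

    def poll(self, consumer: str) -> list:
        """Return events this consumer has not seen yet, then advance it."""
        start = self._offsets.get(consumer, 0)
        events = self._log[start:]
        self._offsets[consumer] = len(self._log)
        return events
```

With one task in the queue, the second `take()` returns `None`: the work is gone once a worker claims it. With one event in the stream, `poll("analytics")` and `poll("notifications")` both return it, because neither consumer's read affects the other.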
Retries and duplicates

Background success often means doing the same thing twice safely.

A worker picks up a task, times out while writing, and retries. If the first write actually landed before the timeout, the system has now processed the same message twice.

That is why idempotency matters. Good background processing assumes retries will happen. The real design question is whether running the task twice causes harm. If it does, you need idempotency keys, deduplication, or carefully structured writes.
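One common shape for this is a deduplication check keyed on an idempotency key. The sketch below is a simplification: `processed_keys` is an in-memory set standing in for a durable store, and `send_receipt` and the message fields are hypothetical names. In a real system the check, the side effect, and the key write would need to be atomic, or a crash between them can still produce a duplicate.

```python
processed_keys: set = set()  # stand-in for a durable dedup store
emails_sent: list = []       # stand-in for the real side effect


def send_receipt(message: dict) -> bool:
    """Run the side effect at most once per idempotency key.

    Returns True if the side effect ran, False if this was a duplicate.
    """
    key = message["idempotency_key"]
    if key in processed_keys:
        return False  # already handled: a retry is now harmless
    emails_sent.append(message["order_id"])
    processed_keys.add(key)
    return True
```

Delivering the same message twice sends one email: the first call returns `True`, the retry returns `False`.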

Speaking script

Lines for the async conversation.

Opening
I want to keep the critical path short, so I would move non-urgent side work into async processing.
Sketching
The synchronous path should do the essential user-facing write and return. Background workers can handle the rest.
Deep dive
If the job is basically task handoff, a queue is usually enough. If the same event should feed several systems, a stream is often the better fit.
Trade-off
The gain is lower latency and better decoupling. The cost is eventual completion, retries, and more moving parts.
Defending
I only keep work synchronous if the user truly needs the result before the response can return.
Recovery
If I am unsure, my fallback is simple: keep the essential write inline, and move email, analytics, and other side effects off-path first.
Common mistakes

How candidates make async sound like a free upgrade.

Misconception check

Correct the wrong model before it sticks.

Wrong intuition

What feels tempting

Queues and streams are added when the system needs to sound scalable.

Better model

What to replace it with

Async processing is for work that is slow, retryable, bursty, or not required for the immediate user response.

Interview move

What to do in the room

Draw the user path first, then move side effects behind a queue with retry and idempotency.
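The retry half of that move can be sketched as a bounded loop with exponential backoff. The names `run_with_retries` and `make_flaky`, the attempt limit, and the delay values are all assumptions for illustration; real queue systems usually provide redelivery themselves.

```python
import time


def run_with_retries(task, max_attempts: int = 3, base_delay_s: float = 0.0):
    """Call task() until it succeeds or attempts run out.

    Returns (result, attempts_used). Re-raises the last error on exhaustion.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return task(), attempt
        except Exception:
            if attempt == max_attempts:
                raise
            # Exponential backoff: base, 2x base, 4x base, ...
            time.sleep(base_delay_s * 2 ** (attempt - 1))


def make_flaky(failures: int):
    """Build a task that fails `failures` times, then succeeds."""
    state = {"calls": 0}

    def task() -> str:
        state["calls"] += 1
        if state["calls"] <= failures:
            raise TimeoutError("simulated slow dependency")
        return "sent"

    return task
```

A task that fails twice succeeds on the third attempt. A task that keeps failing raises after the limit, which is the point where a real system would park the message for inspection instead of retrying forever.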

Trade-offs

Four processing choices.

Fully synchronous
Good when: The user must know the outcome immediately and the work is small enough to keep inline.
Weak when: Side work is slow, bursty, or failure-prone and makes the request path fragile.
Interview line: I would keep only the truly user-critical work synchronous.

Queue + workers (default)
Good when: One task should be processed later, with retries and buffering.
Weak when: Several independent systems need the same event for different reasons.
Interview line: A queue fits because the main need is durable task handoff to workers.

Stream + consumers
Good when: One event should be consumed by several downstream systems independently.
Weak when: You only need a simple handoff and do not benefit from broader fan-out.
Interview line: A stream fits when the same event should feed several consumers without tight coupling.

Async everywhere
Good when: Most side work is naturally decoupled and delay is acceptable.
Weak when: The product needs immediate consistency or confirmation for that work.
Interview line: I would only move this off-path if the product can tolerate eventual completion.
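As a rough rule of thumb, the first three choices can be compressed into a tiny decision function. This is a deliberate oversimplification: `choose_processing` and its two inputs are hypothetical, and real decisions also weigh ordering, durability, and operational cost.

```python
def choose_processing(user_needs_result_now: bool, independent_consumers: int) -> str:
    """Pick a processing style for one piece of work.

    user_needs_result_now: the response cannot return without this result.
    independent_consumers: how many downstream systems react to it separately.
    """
    if user_needs_result_now:
        return "synchronous"
    if independent_consumers > 1:
        return "stream + consumers"
    return "queue + workers"


print(choose_processing(True, 1))   # synchronous
print(choose_processing(False, 1))  # queue + workers
print(choose_processing(False, 3))  # stream + consumers
```

The order of the checks mirrors the chapter: decide what the user is waiting for first, and only then pick a delivery model.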
Mini case study

Video upload — accept now, process later.

What the user needs immediately

  • Upload accepted.
  • Metadata recorded.
  • File stored durably.

What can happen later

  • Transcoding into several resolutions.
  • Thumbnail generation.
  • Moderation checks.

Why a queue helps

  • Transcoding is slow.
  • Upload volume can spike.
  • Workers can scale independently.

Where a stream helps

  • Analytics may react.
  • Notifications may react.
  • Recommendation systems may react.
Worked example to solo answer

Fade the support before the real practice.

Do not jump straight from reading to a full answer. First see the shape, then complete part of it, then answer alone.

I do

Study the model move.

I would keep payment authorization on the critical path, then queue email receipts, inventory sync, and analytics.

We do

Complete the missing piece.

For checkout, classify each task as blocking, async, or batchable.

You do

Answer without notes.

Answer the practice prompt and explicitly say what leaves the request path.

Practice

Try it before you read the model answer.

Prompt
Design a checkout system for an online store.
  • What should stay synchronous?
  • What should move to async processing?
  • Does a queue or a stream fit better first?
Strong model answer
I would keep payment authorization and order creation on the synchronous path because the user needs immediate confirmation that the purchase succeeded. I would move email sending, analytics updates, and some downstream fulfillment notifications into async processing so the checkout path stays fast and resilient. If the main async need is handing off specific follow-up jobs, a queue is enough. If several independent systems need to react to the order-created event, a stream becomes more useful.
Training loop

Make this chapter stick.

Before moving on, turn recognition into production. Close the model answer, answer from memory, then retry one small slice.

Recall

Say the chapter's core idea without looking. Then name one related idea from an earlier chapter.

Vary

Change one constraint in the practice prompt and answer again in half the time.

Score

Use the rubric to pick one dimension below 3, then retry only that dimension.

Memory hook
Front desk first. Back room later.
Recap

Three things to take into the room.

1

Protect the critical path.

Do the essential user-facing work first and return.

2

Queue for tasks. Stream for fan-out.

That distinction is enough for most interviews.

3

Retries imply duplicates.

Idempotency is the adult answer.

Reusable interview line
"I would keep the user-critical write synchronous, then hand non-urgent side effects to background systems so the request path stays fast and resilient."