
Four limiters, one stream

The same request pattern fed into Token Bucket, Leaky Bucket, Sliding Window, and GCRA — running locally in your browser. GCRA is the one Stripe / Cloudflare / Discord actually use; this shows why.

Same input, four answers.

One traffic generator at the top emits a deterministic stream of requests. Every algorithm sees the identical stream — they just disagree on what to do with it. Token Bucket and GCRA admit the same total when traffic stays within burst credits; Leaky Bucket smooths into a steady output; Sliding Window admits in groups bounded by its window edge.

On a steady pattern, all four look identical. The differences only emerge when traffic gets bursty — that's why it's an interesting comparison.
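A deterministic generator of the kind described above could be sketched as follows. This is only an illustration: the function names, the millisecond time unit, and the 2000 ms gap between spikes are assumptions, not details taken from the page. Only the 12-in-100ms burst shape comes from the demo.

```python
def steady_ms(rate_rps: int, duration_ms: int) -> list[int]:
    """Evenly spaced arrival times in milliseconds, rate_rps per second."""
    count = rate_rps * duration_ms // 1000
    return [round(i * 1000 / rate_rps) for i in range(count)]


def spike_ms(burst_size: int, width_ms: int, period_ms: int,
             duration_ms: int) -> list[int]:
    """burst_size arrivals spread over width_ms, repeated every period_ms.

    period_ms is a hypothetical parameter; the page does not say how
    far apart its spikes are.
    """
    times = []
    for start in range(0, duration_ms, period_ms):
        for i in range(burst_size):
            times.append(start + i * width_ms // burst_size)
    return times


# The same list can be replayed into every limiter, so each algorithm
# sees an identical stream.
steady = steady_ms(rate_rps=10, duration_ms=5000)
spikes = spike_ms(burst_size=12, width_ms=100, period_ms=2000, duration_ms=4000)
```

Because the output is a plain list of timestamps with no randomness, replaying it gives every limiter exactly the same input.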

Try this in order:

1. Leave it on Steady for ~5 seconds. All four cards stay green, with no denials anywhere.
2. Switch to Spike. Now you see the divergence: each algorithm has its own answer to the same 12-in-100ms burst.
3. Drag BURST down to 1. Token Bucket and GCRA admit one per spike; Leaky Bucket gradually drains; Sliding Window's denial bars cluster at window edges.
4. Watch the STATE readouts: tokens dripping, queue draining, sample count rising, and GCRA's tat-now. That single number is the entire state for GCRA.
5. Bump RATE high. Steady patterns still pass clean; bursty ones reveal each algorithm's smoothing characteristic.

When each one wins:

Token Bucket — natural fit when you want to allow bursts up to a cap. AWS API Gateway, GitHub's secondary limit.
Leaky Bucket — output shaping, not just admission. Network gear, message brokers.
Sliding Window — accuracy at boundary; what most "N requests per window" specs mean. Common at the API edge.
GCRA — single timestamp per key. With a million users, you store one number each — no array, no atomic decrement. Stripe, Cloudflare, Discord.

Read the war story: a Redis-backed sliding-window limiter I shipped to production at 50M+ req/day, including the bug I found at the window boundary and the Lua script that fixed it.

Controls: RATE 10 rps · BURST 5 · SPEED 1× · PATTERN

Token Bucket (AWS API GW · GitHub)

Refill at rate; capacity = burst.

ALLOW 0 · DENY 0 · DENY % 0.0% · STATE tokens: 5.00 / 5

[Chart: allow / deny over the last 12 seconds, −12s to now]
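The "refill at rate; capacity = burst" rule can be sketched in a few lines. This is a minimal illustration, not the page's own code: the class name, the `allow` method, and the millisecond clock are assumptions.

```python
class TokenBucket:
    def __init__(self, rate_rps: float, burst: float):
        self.rate = rate_rps      # tokens added per second
        self.burst = burst        # bucket capacity
        self.tokens = burst       # start full, like the "5.00 / 5" readout
        self.last_ms = 0

    def allow(self, now_ms: int) -> bool:
        # Refill lazily for the time elapsed since the last call, capped at burst.
        elapsed_ms = now_ms - self.last_ms
        self.tokens = min(self.burst, self.tokens + elapsed_ms * self.rate / 1000)
        self.last_ms = now_ms
        if self.tokens >= 1.0:
            self.tokens -= 1.0    # spend one token per admitted request
            return True
        return False


# A 12-in-100ms spike against RATE 10, BURST 5.
spike = [i * 100 // 12 for i in range(12)]
bucket = TokenBucket(rate_rps=10, burst=5)
admitted = sum(bucket.allow(t) for t in spike)  # 5 of the 12 are admitted
```

The five stored tokens absorb the front of the spike. Less than one token refills during the 100 ms burst, so the remaining seven requests are denied.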

Leaky Bucket (Network shaping: ATM, MQTT)

Queue drains at rate; reject when full.

ALLOW 0 · DENY 0 · DENY % 0.0% · STATE queue: 0.00 / 5

[Chart: allow / deny over the last 12 seconds, −12s to now]
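A sketch of the admission side of this rule, assuming the "meter" form of the leaky bucket, where the queue is tracked as a fractional level like the `queue: 0.00 / 5` readout. The class and method names are hypothetical. A real shaper would also hold admitted requests and release them at the drain rate, which this sketch leaves out.

```python
class LeakyBucket:
    def __init__(self, rate_rps: float, capacity: float):
        self.rate = rate_rps      # drain rate, requests per second
        self.capacity = capacity  # maximum queue level
        self.level = 0.0          # start empty
        self.last_ms = 0

    def allow(self, now_ms: int) -> bool:
        # Drain lazily for the time elapsed since the last call, floored at zero.
        elapsed_ms = now_ms - self.last_ms
        self.level = max(0.0, self.level - elapsed_ms * self.rate / 1000)
        self.last_ms = now_ms
        if self.level + 1 <= self.capacity:
            self.level += 1       # the request takes one slot in the queue
            return True
        return False              # queue full: reject


# A 12-in-100ms spike against RATE 10, capacity 5.
spike = [i * 100 // 12 for i in range(12)]
bucket = LeakyBucket(rate_rps=10, capacity=5)
admitted = sum(bucket.allow(t) for t in spike)  # 5 of the 12 are admitted
```

In this form the admit/deny decision mirrors a token bucket (a level filling up instead of tokens running out). The smoothing the page describes comes from the drain side, which paces when queued requests leave.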

Sliding Window (Shopify · Cloudflare WAF)

Per-request timestamps inside last window.

ALLOW 0 · DENY 0 · DENY % 0.0% · STATE samples: 0 / 5

[Chart: allow / deny over the last 12 seconds, −12s to now]
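One way to keep "per-request timestamps inside the last window" is a sliding-window log, sketched below. The names are hypothetical, and the 500 ms window is an assumption: the page shows a limit of 5 samples but does not state the window length.

```python
from collections import deque


class SlidingWindowLog:
    def __init__(self, limit: int, window_ms: int):
        self.limit = limit            # max admitted requests per window
        self.window_ms = window_ms    # assumed window length
        self.samples = deque()        # timestamps of admitted requests

    def allow(self, now_ms: int) -> bool:
        # Evict timestamps that have slid out of the window.
        while self.samples and self.samples[0] <= now_ms - self.window_ms:
            self.samples.popleft()
        if len(self.samples) < self.limit:
            self.samples.append(now_ms)
            return True
        return False


# A 12-in-100ms spike against a limit of 5 per 500 ms.
spike = [i * 100 // 12 for i in range(12)]
window = SlidingWindowLog(limit=5, window_ms=500)
admitted = sum(window.allow(t) for t in spike)  # 5 of the 12 are admitted
```

Capacity only comes back when the oldest timestamp leaves the window, which is why denials bunch up just before a window edge. The cost is one stored timestamp per admitted request, per key.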

GCRA (Stripe · Cloudflare redis-cell · Discord)

Single TAT + math. O(1) state per key.

ALLOW 0 · DENY 0 · DENY % 0.0% · STATE tat-now: 0ms / τ 500ms

[Chart: allow / deny over the last 12 seconds, −12s to now]
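The "single TAT + math" idea can be sketched as below. With RATE 10 and BURST 5 the emission interval is T = 100 ms and the tolerance is τ = 5 × 100 = 500 ms, matching the readout. The exact form of the admission test (new TAT minus now must not exceed τ) is an assumption, chosen because it reproduces the "one per spike at BURST 1" behaviour the page describes. Other formulations of GCRA exist, and the class name is hypothetical.

```python
class GCRA:
    def __init__(self, rate_rps: float, burst: int):
        self.t_ms = 1000 / rate_rps       # emission interval T
        self.tau_ms = burst * self.t_ms   # burst tolerance τ
        self.tat_ms = 0.0                 # theoretical arrival time: the only state

    def allow(self, now_ms: float) -> bool:
        # Where the TAT would move if this request were admitted.
        new_tat = max(self.tat_ms, now_ms) + self.t_ms
        if new_tat - now_ms > self.tau_ms:
            return False                  # too far ahead of schedule: deny
        self.tat_ms = new_tat             # admit and advance the schedule
        return True


# A 12-in-100ms spike against RATE 10, BURST 5.
spike = [i * 100 // 12 for i in range(12)]
limiter = GCRA(rate_rps=10, burst=5)
admitted = sum(limiter.allow(t) for t in spike)  # 5 of the 12 are admitted
```

Each admitted request pushes the TAT forward by T. Once the TAT runs more than τ ahead of the clock, requests are denied until real time catches up. Nothing else needs to be stored per key.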

→ Drag BURST to 1 with the Spike pattern to see the difference clearly: Token Bucket and GCRA admit a single request and reject the rest of each spike; Leaky Bucket smooths but still drops; Sliding Window's denials cluster at window edges.