Building a Rate Limiter in TypeScript
A practical walkthrough of implementing a sliding window rate limiter using Cloudflare Workers KV.
Rate limiting is one of those things that sounds simple until you actually implement it. A naive counter resets at fixed intervals, which means a burst at the end of one window plus a burst at the start of the next can double your intended limit. A sliding window fixes that.
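To make the boundary-burst problem concrete, here is a small simulation (the numbers and helper names are illustrative, not part of the implementation below): with a limit of 100 per minute, a fixed window admits 100 requests just before the minute boundary and another 100 just after it, while a sliding window caps any 60-second span at 100.

```typescript
// Illustrative simulation of the boundary-burst problem.
const LIMIT = 100;
const WINDOW_MS = 60_000;

// Fixed window: a counter per window index, reset every minute.
function fixedWindowAllowed(counts: Map<number, number>, t: number): boolean {
  const bucket = Math.floor(t / WINDOW_MS);
  const n = counts.get(bucket) ?? 0;
  if (n >= LIMIT) return false;
  counts.set(bucket, n + 1);
  return true;
}

// Sliding window: count only timestamps from the last 60 seconds.
function slidingAllowed(timestamps: number[], t: number): boolean {
  const recent = timestamps.filter((x) => x > t - WINDOW_MS);
  if (recent.length >= LIMIT) return false;
  timestamps.push(t);
  return true;
}

// 100 requests just before the minute boundary, 100 just after.
const burst = [
  ...Array.from({ length: 100 }, () => 59_500),
  ...Array.from({ length: 100 }, () => 60_500),
];

const counts = new Map<number, number>();
const stamps: number[] = [];
const fixedAdmitted = burst.filter((t) => fixedWindowAllowed(counts, t)).length;
const slidingAdmitted = burst.filter((t) => slidingAllowed(stamps, t)).length;

console.log(fixedAdmitted); // 200 — double the intended limit
console.log(slidingAdmitted); // 100
```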
The sliding window algorithm
Instead of resetting a counter every minute, we record a timestamp for each request and count only those within the last 60 seconds.
```typescript
type RateLimitResult = {
  allowed: boolean;
  remaining: number;
  resetIn: number; // ms until oldest request falls outside the window
};
```
```typescript
function slidingWindow(
  timestamps: number[],
  windowMs: number,
  limit: number,
): RateLimitResult {
  const now = Date.now();
  const windowStart = now - windowMs;

  // Drop timestamps outside the window
  const recent = timestamps.filter((t) => t > windowStart);

  return {
    allowed: recent.length < limit,
    remaining: Math.max(0, limit - recent.length),
    resetIn: recent.length > 0 ? recent[0]! - windowStart : 0,
  };
}
```

Persisting state with Cloudflare KV
For a distributed system you need shared state. KV is eventually consistent, but for rate limiting that’s an acceptable trade-off — a few extra requests slipping through beats adding a Redis dependency.
```typescript
async function checkRateLimit(
  kv: KVNamespace,
  key: string,
  limit: number,
  windowMs: number,
): Promise<RateLimitResult> {
  const stored = (await kv.get(key, "json")) as number[] | null;
  const timestamps = stored ?? [];

  const result = slidingWindow(timestamps, windowMs, limit);

  if (result.allowed) {
    const now = Date.now();
    const windowStart = now - windowMs;

    // Persist only the timestamps still in the window, plus this request
    const updated = [...timestamps.filter((t) => t > windowStart), now];
    await kv.put(key, JSON.stringify(updated), {
      expirationTtl: Math.ceil(windowMs / 1000),
    });
  }

  return result;
}
```

Wiring it up in a Worker
```typescript
export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const ip = request.headers.get("cf-connecting-ip") ?? "unknown";
    const key = `rate:${ip}`;

    const { allowed, remaining, resetIn } = await checkRateLimit(
      env.RATE_LIMIT_KV,
      key,
      100, // 100 requests
      60 * 1000, // per 60 seconds
    );

    if (!allowed) {
      return new Response("Too Many Requests", {
        status: 429,
        headers: {
          "Retry-After": String(Math.ceil(resetIn / 1000)),
          "X-RateLimit-Remaining": "0",
        },
      });
    }

    return new Response("OK", {
      headers: {
        "X-RateLimit-Remaining": String(remaining),
      },
    });
  },
};
```

A note on atomicity
This implementation has a race condition: two requests can read the same stale list simultaneously and both pass the limit check. For most APIs this is fine — you’re adding friction, not building a vault. If you need hard guarantees, use Durable Objects instead, which give you single-threaded execution per key.
```typescript
// Durable Object approach — no race condition
export class RateLimiter implements DurableObject {
  private timestamps: number[] = [];

  async fetch(request: Request): Promise<Response> {
    const { allowed, remaining } = slidingWindow(this.timestamps, 60_000, 100);

    if (allowed) {
      this.timestamps.push(Date.now());
    }

    // Prune expired entries so the array doesn't grow without bound
    this.timestamps = this.timestamps.filter((t) => t > Date.now() - 60_000);

    return Response.json({ allowed, remaining });
  }
}
```

The trade-off is cost and latency — each Durable Object request routes to a specific edge location, which can add a round-trip. For most use cases, the KV approach is good enough.
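The single-threaded per-key logic can be exercised locally with a plain in-memory stand-in, no Cloudflare runtime required. This is a sketch for experimentation: the `InMemoryRateLimiter` class is illustrative, and `slidingWindow` is repeated here so the snippet is self-contained.

```typescript
// Self-contained local sketch of the Durable Object's logic.
type RateLimitResult = { allowed: boolean; remaining: number; resetIn: number };

function slidingWindow(
  timestamps: number[],
  windowMs: number,
  limit: number,
): RateLimitResult {
  const now = Date.now();
  const windowStart = now - windowMs;
  const recent = timestamps.filter((t) => t > windowStart);
  return {
    allowed: recent.length < limit,
    remaining: Math.max(0, limit - recent.length),
    resetIn: recent.length > 0 ? recent[0]! - windowStart : 0,
  };
}

// Illustrative stand-in: same check-then-record flow as the DO above.
class InMemoryRateLimiter {
  private timestamps: number[] = [];
  constructor(private limit: number, private windowMs: number) {}

  check(): RateLimitResult {
    const result = slidingWindow(this.timestamps, this.windowMs, this.limit);
    if (result.allowed) this.timestamps.push(Date.now());
    return result;
  }
}

const limiter = new InMemoryRateLimiter(3, 60_000);
const results = [1, 2, 3, 4].map(() => limiter.check().allowed);
console.log(results); // [true, true, true, false]
```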