Rate Limit Service

Database-backed rate limiting with pluggable storage backends for the Next.js Prisma SaaS Kit

The rate limit service provides sliding-window rate limiting with database, Redis, or memory backends. Use createRateLimitService() to create a singleton, then call limit(key, {windowSeconds, max}) to check if a request should be allowed.

This guide is part of the Database Configuration documentation.

Sliding window rate limiting is an algorithm that limits requests within a moving time window, providing smoother traffic control than fixed windows. It tracks request counts per key and resets the counter when the window expires.
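
The counting logic described above can be sketched in a few lines. This is an in-memory illustration only; the actual service delegates this state to a storage backend, and the names here are not part of the kit.

```typescript
// Minimal sketch of the counting logic: reset the counter when the
// window has expired, otherwise increment until the max is reached.
type Window = { count: number; windowStart: number };

const windows = new Map<string, Window>();

function check(key: string, windowMs: number, max: number, now: number): boolean {
  const w = windows.get(key);
  if (!w || w.windowStart + windowMs <= now) {
    // Window expired (or first request for this key): reset the counter.
    windows.set(key, { count: 1, windowStart: now });
    return true;
  }
  if (w.count < max) {
    w.count += 1;
    return true;
  }
  return false; // Over the limit for the current window.
}
```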


Overview

The rate limit service provides configurable rate limiting using a sliding window algorithm. It supports multiple storage backends:

  • Database (default, recommended) - Persistent, works across server instances
  • Secondary Storage - Redis/KV stores with TTL support for high-traffic APIs
  • Memory - Instance-local only; not recommended for production (limits reset on restart and don't sync across instances)

Usage

Basic Usage

import { createRateLimitService } from '@kit/database';
const rateLimitService = createRateLimitService();
const result = await rateLimitService.limit('upload:image:user123', {
  windowSeconds: 60,
  max: 10,
});

if (!result.success) {
  return new Response('Rate limited', {
    status: 429,
    headers: { 'X-Retry-After': String(result.retryAfter) },
  });
}

Response Shape

interface RateLimitDecision {
  success: boolean;          // Whether the request is allowed
  remaining: number;         // Requests remaining in the window
  limit: number;             // Max requests per window
  resetAt: number;           // Timestamp when the window resets
  retryAfter: number | null; // Seconds until retry (if blocked)
  key: string;               // The rate limit key
}

Storage Backends

Database (Default)

Uses PostgreSQL via Prisma. Schema auto-generated by Better Auth.

const service = createRateLimitService(); // Uses default db
// or
const service = createRateLimitService({ database: customDb });

Secondary Storage (Redis/KV)

For external stores like Redis, Upstash, or Vercel KV:

import { createRateLimitService, createSecondaryRateLimitStorageFactory } from '@kit/database';
const storage = createSecondaryRateLimitStorageFactory({
  get: (key) => redis.get(key),
  set: (key, value, ttlSeconds) => redis.set(key, value, { ex: ttlSeconds }),
});
const service = createRateLimitService({ storage });
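
The factory only needs get and set with TTL semantics, so for tests or local development a Map-backed stand-in is enough. The createMemoryKV helper below is hypothetical, not part of the kit; it just matches the get/set shape shown above.

```typescript
// Map-backed stand-in for a Redis/KV client, matching the get/set
// shape shown above. Illustrative only, intended for tests.
function createMemoryKV() {
  const store = new Map<string, { value: string; expiresAt: number }>();
  return {
    get: async (key: string): Promise<string | null> => {
      const entry = store.get(key);
      if (!entry || entry.expiresAt <= Date.now()) {
        // Expired or missing: behave like a KV store with TTL.
        store.delete(key);
        return null;
      }
      return entry.value;
    },
    set: async (key: string, value: string, ttlSeconds: number): Promise<void> => {
      store.set(key, { value, expiresAt: Date.now() + ttlSeconds * 1000 });
    },
  };
}
```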

Key Naming Conventions

Use namespaced keys to separate rate limits:

| Use Case     | Key Format                 | Example                     |
| ------------ | -------------------------- | --------------------------- |
| API endpoint | api:{endpoint}:{userId}    | api:upload:user_123         |
| Auth action  | auth:{action}:{identifier} | auth:login:user@example.com |
| Feature      | feature:{name}:{userId}    | feature:export:user_123     |
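
A small helper can enforce this convention in one place (rateLimitKey is illustrative, not part of the kit):

```typescript
// Builds a namespaced rate limit key following the convention above.
function rateLimitKey(
  namespace: "api" | "auth" | "feature",
  name: string,
  identifier: string,
): string {
  return `${namespace}:${name}:${identifier}`;
}
```

For example, `rateLimitKey("api", "upload", "user_123")` produces `api:upload:user_123`.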

Configuration

Environment Variables

| Variable                       | Description                                          | Default  |
| ------------------------------ | ---------------------------------------------------- | -------- |
| BETTER_AUTH_RATE_LIMIT_STORAGE | Storage backend: database, memory, secondary-storage | database |
| UPLOAD_RATE_LIMIT_MAX          | Max uploads per window                               | 10       |
| UPLOAD_RATE_LIMIT_WINDOW       | Window in seconds                                    | 60       |
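
These values can be parsed once at startup, falling back to the documented defaults. A sketch; your config layer may differ:

```typescript
// Read upload limits from the environment, with the documented
// defaults (10 uploads per 60-second window) as fallbacks.
const uploadLimits = {
  max: Number(process.env.UPLOAD_RATE_LIMIT_MAX ?? 10),
  windowSeconds: Number(process.env.UPLOAD_RATE_LIMIT_WINDOW ?? 60),
};
```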

Implementation Details

Atomic Operations

The database service uses a single upsert query:

INSERT INTO rate_limit (id, key, count, last_request)
VALUES ($id, $key, 1, $now)
ON CONFLICT (id) DO UPDATE SET
  count = CASE
    WHEN rate_limit.last_request + $windowMs <= $now THEN 1
    ELSE rate_limit.count + 1
  END,
  last_request = $now
WHERE rate_limit.last_request + $windowMs <= $now
   OR rate_limit.count < $max
RETURNING count, last_request;

This ensures:

  • No race conditions
  • Single database round-trip
  • Automatic window reset
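
For illustration, the upsert result can be mapped onto the RateLimitDecision shape shown earlier. The toDecision function and the row shape here are assumptions, not the kit's internals; in particular, the blocked-case resetAt is an approximation, since the WHERE clause means no row is returned when the limit is exceeded.

```typescript
// Sketch: map the upsert result to a decision. A null row means the
// WHERE clause excluded the update, i.e. the limit was exceeded.
type Row = { count: number; last_request: number } | null;

function toDecision(key: string, row: Row, max: number, windowMs: number, now: number) {
  if (row === null) {
    return {
      success: false,
      remaining: 0,
      limit: max,
      resetAt: now + windowMs, // Approximation: the real reset time needs the stored last_request.
      retryAfter: Math.ceil(windowMs / 1000),
      key,
    };
  }
  return {
    success: true,
    remaining: Math.max(0, max - row.count),
    limit: max,
    resetAt: row.last_request + windowMs,
    retryAfter: null,
    key,
  };
}
```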

Schema

CREATE TABLE "rate_limit" (
  "id" text PRIMARY KEY NOT NULL,
  "key" text,
  "count" integer,
  "last_request" bigint
);

The id column IS the rate limit key (e.g., upload:image:user123).

Best Practices

Create singleton at module level - Avoid creating service per request

// Good: one instance shared across requests
const rateLimitService = createRateLimitService();

export async function handler() { /* ... */ }

// Bad: new instance per request
export async function handler() {
  const service = createRateLimitService();
}

Rate limit after auth - Don't consume rate limit for unauthenticated requests

const session = await auth.api.getSession({ headers });

if (!session) return unauthorized();

const rateLimit = await rateLimitService.limit(`api:${session.user.id}`, {
  windowSeconds: 60,
  max: 100,
});

Include standard headers in 429 responses

return new Response('Rate limited', {
  status: 429,
  headers: {
    'X-Retry-After': String(result.retryAfter),
    'X-RateLimit-Limit': String(result.limit),
    'X-RateLimit-Remaining': String(result.remaining),
    'X-RateLimit-Reset': String(result.resetAt),
  },
});

Use appropriate windows based on use case:

  • Auth endpoints: 5 attempts / 15 minutes
  • API endpoints: 100 requests / minute
  • File uploads: 10 / minute
  • Expensive operations: 5 / hour
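
These guidelines can be centralized as presets and passed straight to limit() (the preset names are illustrative):

```typescript
// Window presets matching the guidelines above.
const RATE_LIMIT_PRESETS = {
  auth: { windowSeconds: 15 * 60, max: 5 },   // 5 attempts / 15 minutes
  api: { windowSeconds: 60, max: 100 },       // 100 requests / minute
  upload: { windowSeconds: 60, max: 10 },     // 10 uploads / minute
  expensive: { windowSeconds: 60 * 60, max: 5 }, // 5 / hour
} as const;
```

Then a handler can call, for example, `rateLimitService.limit(key, RATE_LIMIT_PRESETS.upload)`.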

Performance Considerations

| Storage  | Latency | Coordination   | Use Case                  |
| -------- | ------- | -------------- | ------------------------- |
| Database | ~5-10ms | Cross-instance | Default, most deployments |
| Redis    | ~1-2ms  | Cross-instance | High-traffic APIs         |
| Memory   | <1ms    | Instance-local | Development/testing only  |

For most SaaS applications, database storage is sufficient. Consider Redis only if:

  • Rate limiting adds measurable latency
  • You need sub-millisecond response times
  • Database is already under heavy load

Decision Rules

Use database storage when:

  • Running multiple server instances that must share rate limit state
  • Rate limit windows are 1 minute or longer
  • You don't want additional infrastructure complexity

Use Redis/secondary storage when:

  • You need <5ms rate limit checks
  • Database is already CPU-constrained
  • You're handling >1000 requests/second

Avoid memory storage:

  • Rate limits don't persist across server restarts or deployments
  • Each server instance maintains separate counters, allowing users to bypass limits by hitting different instances
  • Serverless functions create new instances frequently, resetting counters unpredictably
  • Only useful for development/testing, never for production rate limiting

If unsure: start with database storage. In production with roughly 50k daily active users, database-backed rate limiting adds approximately 5-10ms per request; in our experience, Redis only became necessary beyond roughly 500 concurrent requests/second.

Common Pitfalls

  • Creating service per request - Instantiating createRateLimitService() inside request handlers creates connection overhead; use a module-level singleton
  • Rate limiting before auth - Unauthenticated requests consume rate limit quota; validate authentication before checking rate limits
  • Missing retry headers - Clients don't know when to retry; always include X-Retry-After, X-RateLimit-Remaining, and X-RateLimit-Reset headers
  • Keys too broad - Using just api:upload rate limits all users together; include user/org ID in keys for per-user limits
  • Keys too narrow - Creating unique keys per endpoint variant fragments limits; group related endpoints under common keys
  • Forgetting cleanup - The rate_limit table grows indefinitely; schedule periodic cleanup of expired entries

Cleanup

The database table grows over time. Better Auth handles cleanup, but for custom implementations:

// Delete entries older than 1 hour
await db.rateLimit.deleteMany({
  where: {
    lastRequest: { lt: Date.now() - 3600000 },
  },
});

Run periodically via cron or scheduled job.
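
Without extra dependencies, a timer is one way to run the cleanup on a schedule. This is a sketch (scheduleRateLimitCleanup is hypothetical); a cron job or your platform's scheduler is equally valid and survives process restarts.

```typescript
// Schedule a periodic cleanup of expired rate limit entries.
// The caller supplies the actual delete query (e.g. the Prisma
// deleteMany shown above) as a callback.
const ONE_HOUR_MS = 60 * 60 * 1000;

function scheduleRateLimitCleanup(deleteOlderThan: (cutoff: number) => Promise<void>) {
  return setInterval(() => {
    // Delete entries whose last_request is more than an hour old.
    void deleteOlderThan(Date.now() - ONE_HOUR_MS);
  }, ONE_HOUR_MS);
}
```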