Rate Limit Service

Database-backed rate limiting with pluggable storage backends for the Next.js Prisma SaaS Kit

The rate limit service provides sliding-window rate limiting with database, Redis, or memory backends. Use createRateLimitService() to create a singleton, then call limit(key, {windowSeconds, max}) to check if a request should be allowed.

This guide is part of the Database Configuration documentation.

Sliding window rate limiting is an algorithm that limits requests within a moving time window, providing smoother traffic control than fixed windows. It tracks request counts per key and resets the counter when the window expires.
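
The counting logic described above can be sketched in a few lines. This is an in-memory illustration only; the actual service delegates this state to a storage backend, and the names here are not part of the kit.

```typescript
// Minimal sketch of the counting logic: reset the counter when the
// window has expired, otherwise increment until the max is reached.
type Window = { count: number; windowStart: number };

const windows = new Map<string, Window>();

function check(key: string, windowMs: number, max: number, now: number): boolean {
  const w = windows.get(key);
  if (!w || w.windowStart + windowMs <= now) {
    // Window expired (or first request for this key): reset the counter.
    windows.set(key, { count: 1, windowStart: now });
    return true;
  }
  if (w.count < max) {
    w.count += 1;
    return true;
  }
  return false; // Over the limit for the current window.
}
```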


Overview

The rate limit service provides configurable rate limiting using a sliding window algorithm. It supports multiple storage backends:

  • Database (default, recommended) - Persistent, works across server instances
  • Secondary Storage - Redis/KV stores with TTL support for high-traffic APIs
  • Memory - Instance-local only; not recommended for production (limits reset on restart and don't sync across instances)

Usage

Basic Usage

import { createRateLimitService } from '@kit/database';
const rateLimitService = createRateLimitService();
const result = await rateLimitService.limit('upload:image:user123', {
  windowSeconds: 60,
  max: 10,
});

if (!result.success) {
  return new Response('Rate limited', {
    status: 429,
    headers: { 'X-Retry-After': String(result.retryAfter) },
  });
}

Response Shape

interface RateLimitDecision {
  success: boolean;          // Whether the request is allowed
  remaining: number;         // Requests remaining in the window
  limit: number;             // Max requests per window
  resetAt: number;           // Timestamp when the window resets
  retryAfter: number | null; // Seconds until retry (if blocked)
  key: string;               // The rate limit key
}

Storage Backends

Database (Default)

Uses PostgreSQL via Prisma. Schema auto-generated by Better Auth.

const service = createRateLimitService(); // Uses default db
// or
const service = createRateLimitService({ database: customDb });

Secondary Storage (Redis/KV)

For external stores like Redis, Upstash, or Vercel KV:

import { createRateLimitService, createSecondaryRateLimitStorageFactory } from '@kit/database';
const storage = createSecondaryRateLimitStorageFactory({
  get: (key) => redis.get(key),
  set: (key, value, ttlSeconds) => redis.set(key, value, { ex: ttlSeconds }),
});
const service = createRateLimitService({ storage });
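
The factory only needs get and set with TTL semantics, so for tests or local development a Map-backed stand-in is enough. The createMemoryKV helper below is hypothetical, not part of the kit; it just matches the get/set shape shown above.

```typescript
// Map-backed stand-in for a Redis/KV client, matching the get/set
// shape shown above. Illustrative only, intended for tests.
function createMemoryKV() {
  const store = new Map<string, { value: string; expiresAt: number }>();
  return {
    get: async (key: string): Promise<string | null> => {
      const entry = store.get(key);
      if (!entry || entry.expiresAt <= Date.now()) {
        // Expired or missing: behave like a KV store with TTL.
        store.delete(key);
        return null;
      }
      return entry.value;
    },
    set: async (key: string, value: string, ttlSeconds: number): Promise<void> => {
      store.set(key, { value, expiresAt: Date.now() + ttlSeconds * 1000 });
    },
  };
}
```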

Key Naming Conventions

Use namespaced keys to separate rate limits:

| Use Case     | Key Format                 | Example                     |
| ------------ | -------------------------- | --------------------------- |
| API endpoint | api:{endpoint}:{userId}    | api:upload:user_123         |
| Auth action  | auth:{action}:{identifier} | auth:login:user@example.com |
| Feature      | feature:{name}:{userId}    | feature:export:user_123     |
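
A small helper can enforce this convention in one place (rateLimitKey is illustrative, not part of the kit):

```typescript
// Builds a namespaced rate limit key following the convention above.
function rateLimitKey(
  namespace: "api" | "auth" | "feature",
  name: string,
  identifier: string,
): string {
  return `${namespace}:${name}:${identifier}`;
}
```

For example, `rateLimitKey("api", "upload", "user_123")` produces `api:upload:user_123`.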

Configuration

Environment Variables

| Variable                       | Description                                          | Default  |
| ------------------------------ | ---------------------------------------------------- | -------- |
| BETTER_AUTH_RATE_LIMIT_STORAGE | Storage backend: database, memory, secondary-storage | database |
| UPLOAD_RATE_LIMIT_MAX          | Max uploads per window                               | 10       |
| UPLOAD_RATE_LIMIT_WINDOW       | Window in seconds                                    | 60       |
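
These values can be parsed once at startup, falling back to the documented defaults. A sketch; your config layer may differ:

```typescript
// Read upload limits from the environment, with the documented
// defaults (10 uploads per 60-second window) as fallbacks.
const uploadLimits = {
  max: Number(process.env.UPLOAD_RATE_LIMIT_MAX ?? 10),
  windowSeconds: Number(process.env.UPLOAD_RATE_LIMIT_WINDOW ?? 60),
};
```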

Implementation Details

Atomic Operations

The database service uses a single upsert query:

INSERT INTO rate_limit (id, key, count, last_request)
VALUES ($id, $key, 1, $now)
ON CONFLICT (id) DO UPDATE SET
  count = CASE
    WHEN rate_limit.last_request + $windowMs <= $now THEN 1
    ELSE rate_limit.count + 1
  END,
  last_request = $now
WHERE rate_limit.last_request + $windowMs <= $now
   OR rate_limit.count < $max
RETURNING count, last_request;

This ensures:

  • No race conditions
  • Single database round-trip
  • Automatic window reset
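
For illustration, the upsert result can be mapped onto the RateLimitDecision shape shown earlier. The toDecision function and the row shape here are assumptions, not the kit's internals; in particular, the blocked-case resetAt is an approximation, since the WHERE clause means no row is returned when the limit is exceeded.

```typescript
// Sketch: map the upsert result to a decision. A null row means the
// WHERE clause excluded the update, i.e. the limit was exceeded.
type Row = { count: number; last_request: number } | null;

function toDecision(key: string, row: Row, max: number, windowMs: number, now: number) {
  if (row === null) {
    return {
      success: false,
      remaining: 0,
      limit: max,
      resetAt: now + windowMs, // Approximation: the real reset time needs the stored last_request.
      retryAfter: Math.ceil(windowMs / 1000),
      key,
    };
  }
  return {
    success: true,
    remaining: Math.max(0, max - row.count),
    limit: max,
    resetAt: row.last_request + windowMs,
    retryAfter: null,
    key,
  };
}
```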

Schema

CREATE TABLE "rate_limit" (
  "id" text PRIMARY KEY NOT NULL,
  "key" text,
  "count" integer,
  "last_request" bigint
);

The id column IS the rate limit key (e.g., upload:image:user123).

Best Practices

Create singleton at module level - Avoid creating service per request

// Good: one instance shared across requests
const rateLimitService = createRateLimitService();

export async function handler() { /* ... */ }

// Bad: new instance per request
export async function handler() {
  const service = createRateLimitService();
}

Rate limit after auth - Don't consume rate limit for unauthenticated requests

const session = await auth.api.getSession({ headers });

if (!session) return unauthorized();

const rateLimit = await rateLimitService.limit(`api:${session.user.id}`, {
  windowSeconds: 60,
  max: 100,
});

Include standard headers in 429 responses

return new Response('Rate limited', {
  status: 429,
  headers: {
    'X-Retry-After': String(result.retryAfter),
    'X-RateLimit-Limit': String(result.limit),
    'X-RateLimit-Remaining': String(result.remaining),
    'X-RateLimit-Reset': String(result.resetAt),
  },
});

Use appropriate windows based on use case:

  • Auth endpoints: 5 attempts / 15 minutes
  • API endpoints: 100 requests / minute
  • File uploads: 10 / minute
  • Expensive operations: 5 / hour
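
These guidelines can be centralized as presets and passed straight to limit() (the preset names are illustrative):

```typescript
// Window presets matching the guidelines above.
const RATE_LIMIT_PRESETS = {
  auth: { windowSeconds: 15 * 60, max: 5 },   // 5 attempts / 15 minutes
  api: { windowSeconds: 60, max: 100 },       // 100 requests / minute
  upload: { windowSeconds: 60, max: 10 },     // 10 uploads / minute
  expensive: { windowSeconds: 60 * 60, max: 5 }, // 5 / hour
} as const;
```

Then a handler can call, for example, `rateLimitService.limit(key, RATE_LIMIT_PRESETS.upload)`.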

Performance Considerations

| Storage  | Latency | Coordination   | Use Case                  |
| -------- | ------- | -------------- | ------------------------- |
| Database | ~5-10ms | Cross-instance | Default, most deployments |
| Redis    | ~1-2ms  | Cross-instance | High-traffic APIs         |
| Memory   | <1ms    | Instance-local | Development/testing only  |

For most SaaS applications, database storage is sufficient. Consider Redis only if:

  • Rate limiting adds measurable latency
  • You need sub-millisecond response times
  • Database is already under heavy load

Decision Rules

Use database storage when:

  • Running multiple server instances that must share rate limit state
  • Rate limit windows are 1 minute or longer
  • You don't want additional infrastructure complexity

Use Redis/secondary storage when:

  • You need <5ms rate limit checks
  • Database is already CPU-constrained
  • You're handling >1000 requests/second

Avoid memory storage:

  • Rate limits don't persist across server restarts or deployments
  • Each server instance maintains separate counters, allowing users to bypass limits by hitting different instances
  • Serverless functions create new instances frequently, resetting counters unpredictably
  • Only useful for development/testing, never for production rate limiting

If unsure: start with database storage. In production with roughly 50k daily active users, database-backed rate limiting adds approximately 5-10ms per request; in our experience, Redis only became necessary beyond roughly 500 concurrent requests/second.

Common Pitfalls

  • Creating service per request - Instantiating createRateLimitService() inside request handlers creates connection overhead; use a module-level singleton
  • Rate limiting before auth - Unauthenticated requests consume rate limit quota; validate authentication before checking rate limits
  • Missing retry headers - Clients don't know when to retry; always include X-Retry-After, X-RateLimit-Remaining, and X-RateLimit-Reset headers
  • Keys too broad - Using just api:upload rate limits all users together; include user/org ID in keys for per-user limits
  • Keys too narrow - Creating unique keys per endpoint variant fragments limits; group related endpoints under common keys
  • Forgetting cleanup - The rate_limit table grows indefinitely; schedule periodic cleanup of expired entries

Cleanup

The database table grows over time. Better Auth handles cleanup, but for custom implementations:

// Delete entries older than 1 hour
await db.rateLimit.deleteMany({
  where: {
    lastRequest: { lt: Date.now() - 3600000 },
  },
});

Run periodically via cron or scheduled job.
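
Without extra dependencies, a timer is one way to run the cleanup on a schedule. This is a sketch (scheduleRateLimitCleanup is hypothetical); a cron job or your platform's scheduler is equally valid and survives process restarts.

```typescript
// Schedule a periodic cleanup of expired rate limit entries.
// The caller supplies the actual delete query (e.g. the Prisma
// deleteMany shown above) as a callback.
const ONE_HOUR_MS = 60 * 60 * 1000;

function scheduleRateLimitCleanup(deleteOlderThan: (cutoff: number) => Promise<void>) {
  return setInterval(() => {
    // Delete entries whose last_request is more than an hour old.
    void deleteOlderThan(Date.now() - ONE_HOUR_MS);
  }, ONE_HOUR_MS);
}
```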