Rate Limiting

Psychic does not bundle rate-limiting, and neither do Koa ("Koa does not bundle any middleware within its core") or socket.io. This is a deliberate choice. Rate-limiting depends heavily on deployment topology (multi-node Redis versus single-process memory versus an upstream edge), and shipping a specific implementation inside the framework would either constrain those choices or be the first thing every serious deployment replaces. The right mental model is defense in depth: limits enforced at more than one layer.

This guide covers both layers: the edge-tier protection almost every production app should put first, and the app-tier middleware pattern for per-route and per-user limits that edge devices cannot express.

Layer 1: Edge and infrastructure (primary defense)

This is where almost every production app should put its first line of defense. Traffic you reject at the edge never reaches your Node process. Connection-exhaustion attacks such as Slowloris and SYN floods can only be mitigated at the edge: by the time the packets reach Node, the damage is already done. Opportunistic bot sweeps are also best rejected here.

The categories, in the order most teams encounter them:

  • Reverse proxy. nginx limit_req_zone and limit_conn_zone directives cap requests-per-second and concurrent connections per client IP. HAProxy has stick-table with equivalent semantics. This is the default tool if you terminate TLS yourself.
  • CDN-tier rate limits. Cloudflare Rate Limiting, Fastly Rate Limiting, and Vercel WAF rate-limiting rules reject abusive traffic before it hits your origin. If you are already behind one of these, configure the rules: you are already paying for this capability.
  • Cloud WAF. AWS WAF rate-based rules, Google Cloud Armor, and Azure WAF let you write per-path, per-method, per-header rules at L7 without touching your app.
  • API gateway throttling. AWS API Gateway throttling, Kong's rate-limiting plugin, and similar gateway tools express per-API-key quotas and burst caps. Useful when you are fronting a public API with plan tiers.

# nginx example: 10 req/s per IP with a 20-request burst, plus 50 concurrent connections per IP.
limit_req_zone $binary_remote_addr zone=api_rps:10m rate=10r/s;
limit_conn_zone $binary_remote_addr zone=api_conn:10m;

server {
  location / {
    limit_req zone=api_rps burst=20 nodelay;
    limit_conn api_conn 50;
    proxy_pass http://psychic_upstream;
  }
}

Be honest about your deployment. Apps behind Cloudflare or Fastly get a lot of this for free and just need the rules turned on. Apps running on a bare ALB + ECS or an equivalent cloud-VM-plus-load-balancer setup with no WAF get none of it; consider the WAF add-on before relying on app-layer middleware alone.

Layer 2: App-tier middleware (fine-grained limits)

App-tier rate limits are for the rules that need application knowledge: "login: 5 attempts per 15 minutes per IP", "send-verification-email: 1 per 60 seconds per user", "bulk-export: 3 per hour per account". Edge layers cannot express these without leaking app logic upstream.
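The three example rules above can be written down as data, which keeps thresholds reviewable in one place. This is a minimal sketch; `RateLimitRule`, `RULES`, and `findRule` are hypothetical names chosen for illustration and are not part of Psychic or of any rate-limiting package.

```typescript
// What the limit is keyed on. Hypothetical type for this sketch.
type KeyScope = 'ip' | 'user' | 'account'

interface RateLimitRule {
  name: string
  maxRequests: number
  windowSeconds: number
  scope: KeyScope
}

// The example rules from the text, expressed as data.
const RULES: RateLimitRule[] = [
  { name: 'login', maxRequests: 5, windowSeconds: 15 * 60, scope: 'ip' },
  { name: 'send-verification-email', maxRequests: 1, windowSeconds: 60, scope: 'user' },
  { name: 'bulk-export', maxRequests: 3, windowSeconds: 60 * 60, scope: 'account' },
]

// Looks up a rule by name; returns undefined for unknown names.
function findRule(name: string): RateLimitRule | undefined {
  return RULES.find(rule => rule.name === name)
}
```

Whatever package you pick, a table like this is usually translated into one limiter instance per rule.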

Do not hand-roll a Map-based in-process limiter. In-process counters do not coordinate across pods — any multi-node deployment needs a shared backend (Redis). Consult current best practices to pick a well-maintained package with a Redis backend; the ecosystem moves fast and the right choice today may not be the right choice in a year.
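To show why the backend matters, here is a minimal fixed-window sketch written against a hypothetical `CounterStore` interface. The interface, `FixedWindowLimiter`, and `InMemoryStore` are illustrative names, not Psychic or package APIs. In a multi-node deployment the store would be backed by Redis (for example INCR plus an expiry, applied atomically); the in-memory store exists only so the sketch runs standalone and is not something to deploy.

```typescript
// Hypothetical interface: any shared backend that can increment a counter
// atomically and expire it after a TTL.
interface CounterStore {
  increment(key: string, ttlMs: number, nowMs: number): Promise<{ count: number; expiresAtMs: number }>
}

// Demo-only stand-in so the sketch runs without Redis. Each process would
// keep its own counts, which is exactly the problem described above.
class InMemoryStore implements CounterStore {
  private entries = new Map<string, { count: number; expiresAtMs: number }>()

  async increment(key: string, ttlMs: number, nowMs: number) {
    const existing = this.entries.get(key)
    const entry =
      existing && existing.expiresAtMs > nowMs
        ? { count: existing.count + 1, expiresAtMs: existing.expiresAtMs }
        : { count: 1, expiresAtMs: nowMs + ttlMs } // first hit opens a new window
    this.entries.set(key, entry)
    return entry
  }
}

interface LimitResult {
  allowed: boolean
  retryAfterSeconds: number
}

class FixedWindowLimiter {
  private readonly store: CounterStore
  private readonly maxRequests: number
  private readonly windowMs: number

  constructor(store: CounterStore, maxRequests: number, windowMs: number) {
    this.store = store
    this.maxRequests = maxRequests
    this.windowMs = windowMs
  }

  // Counts one request against `key` and reports whether it is within the limit.
  async consume(key: string, nowMs: number = Date.now()): Promise<LimitResult> {
    const { count, expiresAtMs } = await this.store.increment(key, this.windowMs, nowMs)
    const allowed = count <= this.maxRequests
    const retryAfterSeconds = allowed ? 0 : Math.ceil((expiresAtMs - nowMs) / 1000)
    return { allowed, retryAfterSeconds }
  }
}
```

Because the limiter only talks to the interface, every pod that points at the same store shares one count per key.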

Mounting in Psychic

Mount a rate-limiting middleware app-wide through psy.use(...) for a baseline per-IP cap, typically in an initializer:

import { PsychicApp } from '@rvoh/psychic'
import { rateLimit } from './middleware/rateLimit.js'

export default (psy: PsychicApp) => {
  psy.use(rateLimit)
}

Or apply it per-endpoint with @BeforeAction on a specific controller when you need a tighter rule on a sensitive path:

import { BeforeAction } from '@rvoh/psychic'
import { HttpStatusTooManyRequests } from '@rvoh/psychic/errors'
import AuthedController from './AuthedController.js'

export default class SessionsController extends AuthedController {
  @BeforeAction({ only: ['create'] })
  public async throttleLogin() {
    // consume from your chosen rate limiter, keyed on e.g. `${this.ctx.ip}:${email}`
    // on limit-reached: set Retry-After header, then throw:
    // throw new HttpStatusTooManyRequests({ error: 'too many login attempts', retryAfter })
  }

  public async create() {
    // ... normal login flow
  }
}

AuthedController here is the scaffold-local base class generated by create-psychic, not a framework export — import it from your own controllers directory. HttpStatusTooManyRequests lives in @rvoh/psychic/errors. Always set a Retry-After header before throwing.
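The limiter key and the Retry-After value are the two details that are easy to get subtly wrong. The sketch below assumes your limiter reports when the current window resets as a millisecond timestamp; `loginThrottleKey` and `retryAfterSeconds` are hypothetical helper names, not Psychic exports.

```typescript
// Builds a limiter key from the client IP and the submitted email.
// Normalizing the email avoids trivial bypasses via casing or whitespace.
function loginThrottleKey(ip: string, email: string): string {
  return `login:${ip}:${email.trim().toLowerCase()}`
}

// Converts a window-reset timestamp into whole seconds for Retry-After.
// Rounds up and never returns less than 1, so a blocked client is never
// told it may retry immediately.
function retryAfterSeconds(resetAtMs: number, nowMs: number = Date.now()): number {
  return Math.max(1, Math.ceil((resetAtMs - nowMs) / 1000))
}
```

Retry-After accepts a delay in whole seconds, so the header value would be `String(retryAfterSeconds(resetAtMs))`.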

Rate-limiting WebSockets

socket.io exposes two hook points for applying rate limits:

  • io.engine.use(middleware) — standard Express-style middleware on the HTTP upgrade request. This is the right place to rate-limit handshakes per IP, before a WebSocket connection is ever established.
  • namespace.use((socket, next) => ...): per-connection middleware that runs once, when a socket connects to the namespace. Use it to reject connections or to attach a per-user limiter to the socket, which your event handlers then consume from on each message.

Apply your chosen rate-limiter at io.engine.use() for handshake-level protection. For per-message limits, attach a per-user limiter in the namespace middleware and consume inside the event handler.
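For the per-message case, a token bucket is one common shape for the limiter attached to each socket. The class below is a minimal sketch, not a socket.io or Psychic API; `TokenBucket` and its parameters are hypothetical, and the wiring is shown only as comments because it depends on your setup. A per-socket bucket bounds a single connection; limits that must span several connections or nodes still need a shared backend.

```typescript
// Token bucket: holds up to `capacity` tokens and refills continuously at
// `refillPerSecond`. Each message consumes one or more tokens.
class TokenBucket {
  private tokens: number
  private lastRefillMs: number
  private readonly capacity: number
  private readonly refillPerSecond: number

  constructor(capacity: number, refillPerSecond: number, nowMs: number = Date.now()) {
    this.capacity = capacity
    this.refillPerSecond = refillPerSecond
    this.tokens = capacity // start full so short bursts are allowed
    this.lastRefillMs = nowMs
  }

  // Returns true and deducts `cost` tokens if enough are available.
  tryConsume(cost: number = 1, nowMs: number = Date.now()): boolean {
    const elapsedSeconds = Math.max(0, nowMs - this.lastRefillMs) / 1000
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillPerSecond)
    this.lastRefillMs = nowMs
    if (this.tokens < cost) return false
    this.tokens -= cost
    return true
  }
}

// Hypothetical wiring:
//   namespace.use((socket, next) => { socket.data.bucket = new TokenBucket(20, 5); next() })
//   socket.on('message', payload => {
//     if (!socket.data.bucket.tryConsume()) return // drop, or emit an error event
//     // ... handle payload
//   })
```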

socket.io CORS caveat

Adjacent but distinct from rate-limiting, and worth flagging because readers looking for "WebSocket abuse prevention" often land here: socket.io's cors.origin option only applies to HTTP long-polling. Native WebSocket upgrade requests are not subject to browser CORS and cors.origin does not block them. Cross-transport origin rejection happens in socket.io's allowRequest hook, where you inspect req.headers.origin and decide whether to accept the handshake. If you only configured cors.origin and expected it to cover WebSocket traffic, you are not getting what you thought you were getting.
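An origin check in allowRequest might look like the sketch below. The local `HandshakeRequest` type and the `ALLOWED_ORIGINS` allowlist are assumptions made so the example is self-contained; in a real server the request is Node's `http.IncomingMessage` and the origins are your own.

```typescript
// Minimal local types so the sketch runs standalone.
interface HandshakeRequest {
  headers: { origin?: string }
}
type AllowRequestCallback = (err: string | null, success: boolean) => void

// Hypothetical allowlist; replace with your own origins.
const ALLOWED_ORIGINS = new Set(['https://app.example.com'])

// Accepts the handshake only when the Origin header exactly matches an
// allowlisted origin. A missing Origin is rejected here; non-browser
// clients often omit it, so relax this if you serve them.
function allowRequest(req: HandshakeRequest, callback: AllowRequestCallback): void {
  const origin = req.headers.origin
  callback(null, origin !== undefined && ALLOWED_ORIGINS.has(origin))
}

// Usage sketch: new Server(httpServer, { allowRequest })
```

Because allowRequest runs on the handshake regardless of transport, the same check covers both long-polling and native WebSocket upgrades.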

What signals to rate-limit

  • IP address. Cheap default. Weak against NAT and CGNAT (mobile carriers often share a single egress IP across thousands of users), and trivially bypassed by an attacker with a proxy pool. Still worth having as a baseline.
  • Authenticated user identity. Stronger for post-login abuse — "one user cannot call this endpoint more than X times". Combine with IP for defense in depth.
  • Route + method + identity tuple. Best blast-radius scoping for sensitive endpoints: login, password reset, 2FA verify, payment, outbound email, invite sending.
  • Cost-based limits. Some endpoints deserve higher weights — pagination with a large limit costs more than a point GET, a bulk export costs more than a single write. Many packages support this via a "consume N points" signature, so one heavy request can drain the same budget as multiple cheap ones.
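The signals above combine naturally into a limiter key plus a cost. The sketch below is illustrative only: `LimitedRequest`, `limiterKey`, `requestCost`, the `/exports` route, and the specific weights are all assumptions, and the returned cost is what you would pass to a package's "consume N points" call.

```typescript
interface LimitedRequest {
  method: string
  route: string // route pattern such as '/exports', not the raw URL
  ip: string
  userId?: string
  pageSize?: number
}

// Route + method + identity tuple. Prefers the authenticated user and falls
// back to the IP for anonymous traffic.
function limiterKey(req: LimitedRequest): string {
  const identity = req.userId ? `user:${req.userId}` : `ip:${req.ip}`
  return `${req.method.toUpperCase()}:${req.route}:${identity}`
}

// Hypothetical weights: a bulk export costs 10 points; other requests cost
// 1 point per 50 rows requested, with a minimum of 1.
function requestCost(req: LimitedRequest): number {
  if (req.route === '/exports') return 10
  return Math.max(1, Math.ceil((req.pageSize ?? 1) / 50))
}
```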

What NOT to try to solve at the app layer

  • Connection-exhaustion DoS, Slowloris, amplification, packet-level floods. By the time these reach the Node event loop the damage is done. Edge and L4 are the only answers.
  • Rate-limiting the health-check endpoint. Let your orchestrator (ECS, Kubernetes, Nomad) and monitoring hit /health_check freely. Excluding specific routes from app-tier limits is a normal and expected pattern — the limiter should be opt-in per route or wrapped in a skip list.
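The skip-list pattern can be sketched independently of any framework. `RateLimitCheck` and `withSkipList` below are hypothetical names; the check function stands in for whatever limiter you adopt and returns true when the request is allowed.

```typescript
// Stand-in for a limiter: returns true when the request is within its limit.
type RateLimitCheck = (path: string, key: string) => boolean

// Wraps a check so that skipped paths are always allowed and never count
// against any budget.
function withSkipList(check: RateLimitCheck, skippedPaths: ReadonlySet<string>): RateLimitCheck {
  return (path, key) => skippedPaths.has(path) || check(path, key)
}

// Usage sketch:
//   const guarded = withSkipList(myCheck, new Set(['/health_check']))
//   guarded('/health_check', ip) // always true, myCheck is not called
```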

Deployment checklist

  • Edge-tier rate limit configured (WAF, CDN, or reverse proxy)?
  • Login, password-reset, and 2FA endpoints protected at the app tier with per-IP and per-account limits?
  • Heavy or costly endpoints (bulk operations, large pagination, outbound email) covered with cost-weighted limits?
  • App-tier rate limiter using a Redis backend for multi-node deployments (not in-process memory)?
  • Health-check endpoint excluded from app-tier limits?
  • /ws (or wherever socket.io is mounted) handshake rate-limited separately from HTTP routes via io.engine.use()?
  • socket.io allowRequest configured if you need cross-transport origin enforcement (see the CORS caveat above)?

For related configuration context, see psychic config and the deployment guides.