How to Implement Rate Limiting for Better Performance

Dec 23, 2025

Introduction: What is Rate Limiting and Why You Need It

Imagine you run a popular website. One day, you notice your server is slowing down dramatically. Traffic is normal, but something's wrong. Investigation reveals that one bot is making 10,000 requests per second to your login page, trying to guess user passwords. Or perhaps a legitimate partner's integration is buggy and accidentally sends 1,000 requests per minute instead of 10.

Rate limiting is a security technique that controls how many requests a single client (identified by IP address, user ID, or other criteria) can make to your application within a specific time window. It's like a bouncer at a club who says, "You can enter 10 times per hour, but not 10 times per minute."

Without rate limiting:

  • Brute-force attacks succeed because attackers can try millions of password combinations

  • DDoS attacks overwhelm your server by flooding it with requests

  • Buggy integrations accidentally consume all your server resources

  • Web scrapers drain your database and bandwidth

  • API abuse prevents legitimate users from accessing your service

With rate limiting:

  • Attackers can only make limited attempts before being blocked

  • Accidental traffic spikes are handled gracefully

  • Your application remains available to legitimate users

  • You can identify and block abusive clients

Caddy makes implementing rate limiting simple with the rate_limit handler. Note that rate_limit is not part of the standard Caddy distribution: it is provided by the caddy-ratelimit module (github.com/mholt/caddy-ratelimit), so you need a Caddy binary built with it, for example via xcaddy build --with github.com/mholt/caddy-ratelimit. Because it is a third-party directive, you should also give it a position in the handler order, e.g. by adding order rate_limit before basic_auth to your global options.


Understanding Rate Limiting Concepts

Before diving into configuration, let's understand the key concepts:

Rate Limit Components

A rate limit rule has three essential parts:

1. Zone (Name of the Rate Limit Rule)

zone login {
    ...
}

The zone is just a name you give to the rate limit rule. You might have multiple zones: one for login attempts, one for API calls, one for downloads.

2. Key (What to Count By)

key {remote_host}

The "key" determines what identifies a client. Common options:

Key                                      Meaning                                  Example
{remote_host}                            Client's IP address                      203.0.113.10
{http.request.header.X-Forwarded-For}    IP from proxy header                     192.168.1.100
{http.request.header.Authorization}      API token (for authenticated users)      token-abc-123
{http.request.header.User-Agent}         Browser type (less reliable)             Mozilla/5.0...
{http.request.cookie.sessionid}          Session cookie (per-user tracking)       session-xyz-789

Example: If you rate limit by IP address (remote_host), each unique IP has its own counter. If you rate limit by user token, each API user has their own counter.

3. Events and Window (How Much and How Often)

events 100
window 1m

  • events = Number of requests allowed

  • window = Time period (1m = 1 minute, 1h = 1 hour)

Example: events 100, window 1m means "allow 100 requests per minute."

Rate Limit Behavior

When a client exceeds the rate limit, Caddy responds with HTTP 429 Too Many Requests, along with a Retry-After header telling the client how long to wait before trying again.
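To build intuition for how events and window interact, here is a minimal, hypothetical Python sketch of a per-key counter. It uses a simple fixed window for clarity; the actual rate_limit module implements a more precise sliding window, so treat this as an approximation:

```python
import time
from collections import defaultdict

class FixedWindowLimiter:
    """Allow `events` requests per `window` seconds, counted per key."""
    def __init__(self, events, window):
        self.events = events
        self.window = window
        self.counters = defaultdict(lambda: [0, 0.0])  # key -> [count, window_start]

    def allow(self, key, now=None):
        now = time.monotonic() if now is None else now
        count, start = self.counters[key]
        if now - start >= self.window:      # window expired: start a fresh one
            self.counters[key] = [1, now]
            return True
        if count < self.events:             # still under the limit
            self.counters[key][0] += 1
            return True
        return False                        # over the limit -> caller sends HTTP 429

limiter = FixedWindowLimiter(events=3, window=60)
results = [limiter.allow("203.0.113.10", now=t) for t in range(4)]
print(results)  # [True, True, True, False]
```

Each key (here, an IP address) gets its own counter, which is exactly why the choice of key matters so much.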


Real-World Attacks and How Rate Limiting Helps

Attack 1: Brute-Force Password Attack

An attacker tries to guess a user's password using automated attempts.

Without rate limiting:

Attacker attempts 1000 login tries per minute
Application processes all 1000 attempts
One of them might be correct
User account compromised

With rate limiting (10 attempts per minute):

Attempt 1: ✓ Allowed
Attempt 2: ✓ Allowed
...
Attempt 10: ✓ Allowed
Attempt 11: ✗ Blocked (429 Too Many Requests)
...
Result: Attacker can only try 10 times/minute instead of 1000
After 100 minutes of attempts, still haven't found password

Attack 2: DDoS (Distributed Denial of Service)

Attacker floods your server with requests from many IP addresses, making it unavailable.

With rate limiting:

Each IP address is limited to X requests per minute
If any IP exceeds limit, it's automatically blocked
Attacker needs 100x more resources to have the same effect
Cost of attack becomes prohibitive

Attack 3: Accidental Abuse

A partner's API integration has a bug and loops, sending 1,000 requests per second instead of 10.

Without rate limiting:

All 1,000 requests hit your database
Database slows down
Other users experience latency

With rate limiting:

Requests 1-100 processed
Request 101: Blocked with 429 error
Partner sees error, fixes bug
Everyone else continues using the app normally

Attack 4: Web Scraping

Someone writes a bot to scrape all your product data.

Without rate limiting:

Bot makes 100,000 requests per hour
Your bandwidth bills skyrocket
Database is stressed
Legitimate users can't access the site

With rate limiting:

Bot can make max 1,000 requests per hour (if limit is set to 1,000)
Scraping takes 100 hours instead of 1 hour
Bot probably gives up

Pattern 1: Basic Rate Limiting for Entire Domain

Apply the same rate limit to all requests to your domain.

Simple Rate Limit: 100 Requests Per Minute

example.com {
    rate_limit {
        zone general {
            key {remote_host}
            events 100
            window 1m
        }
    }

    reverse_proxy localhost:3000
}

How it works:

  1. Each client IP address has its own counter

  2. Every request increments that IP's counter

  3. Counter resets every minute

  4. If counter exceeds 100 in a minute, requests are blocked with HTTP 429

Real-world example:

Client A (203.0.113.10) makes 50 requests → Allowed (under 100)
Client B (198.51.100.20) makes 100 requests → Allowed (exactly 100)
Client C (192.0.2.30) makes 120 requests → First 100 allowed, next 20 blocked (429)
After 1 minute: All counters reset

Conservative Rate Limit: 50 Per Hour

For static content or read-heavy operations:

example.com {
    rate_limit {
        zone general {
            key {remote_host}
            events 50
            window 1h
        }
    }

    reverse_proxy localhost:3000
}

This allows 50 requests per hour per IP (on average, less than one request per minute).

Generous Rate Limit: 1000 Per Minute

For high-traffic applications or APIs:

api.example.com {
    rate_limit {
        zone api {
            key {remote_host}
            events 1000
            window 1m
        }
    }

    reverse_proxy localhost:3000
}

Different Limits Per Time Window

Combine multiple rate limit rules (short burst + long sustained):

example.com {
    rate_limit {
        zone burst {
            key {remote_host}
            events 50
            window 10s
        }
    }

    rate_limit {
        zone sustained {
            key {remote_host}
            events 1000
            window 1h
        }
    }

    reverse_proxy localhost:3000
}

This allows:

  • 50 requests per 10 seconds (prevents sudden bursts)

  • 1,000 requests per hour (reasonable daily usage)

A normal user can make their daily requests, but a bot trying to scrape everything at once is blocked.
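The interplay of stacked zones can be simulated with a rough Python sketch (hypothetical numbers; the real module tracks each zone's sliding window internally, here we just count recent timestamps):

```python
def admit(request_times, now, zones):
    """Admit a request at `now` only if every zone's recent count is under its limit.
    zones: list of (events, window_seconds); request_times: admitted timestamps."""
    for events, window in zones:
        recent = sum(1 for t in request_times if now - t < window)
        if recent >= events:
            return False          # any single zone over its limit blocks the request
    request_times.append(now)
    return True

zones = [(50, 10), (1000, 3600)]  # burst: 50/10s, sustained: 1000/1h
history = []

# A scraper firing 60 requests in under a second trips the burst zone first:
admitted = sum(admit(history, t * 0.01, zones) for t in range(60))
print(admitted)  # 50
```

A patient, well-behaved client never trips the burst zone, while a runaway bot is capped almost immediately.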


Pattern 2: Rate Limit Specific Routes Only

Don't apply rate limiting to your entire application. Instead, protect only sensitive routes. One caveat: use handle rather than handle_path for these routes. handle_path strips the matched prefix before the request reaches your backend, so a backend expecting /login would receive / instead.

Rate Limit Login Endpoint

The /login endpoint is vulnerable to brute-force attacks. Protect it aggressively:

example.com {
    handle /login* {
        rate_limit {
            zone login {
                key {remote_host}
                events 10
                window 1m
            }
        }
        reverse_proxy localhost:3000
    }

    handle {
        reverse_proxy localhost:3000
    }
}

Logic:

  1. Requests to /login* (matches /login, /login/, /login/forgot-password, etc.):

    • Limited to 10 attempts per minute per IP

    • After 10 attempts in a minute, receive 429 error

  2. All other routes:

    • No rate limiting applied

    • Users can load the homepage, browse products, etc. normally

Real-world protection:

Attacker trying to brute-force password:
  Attempt 1-10: Processing login attempts
  Attempt 11: Blocked (429 Too Many Requests)
  Attempt 12-20: Still blocked
  Result: Can only try 10 passwords per minute

Rate Limit Password Reset Endpoint

Prevent abuse of password reset functionality:

example.com {
    handle /api/auth/password-reset {
        rate_limit {
            zone password_reset {
                key {remote_host}
                events 5
                window 1h
            }
        }
        reverse_proxy localhost:3000
    }

    handle {
        reverse_proxy localhost:3000
    }
}

This allows only 5 password reset requests per hour per IP, preventing:

  • Spam (someone resetting passwords for accounts they don't own)

  • Account enumeration (discovering which emails are registered)

Rate Limit API Endpoints

Protect your API from abuse while allowing normal usage:

api.example.com {
    handle /api/v1/* {
        rate_limit {
            zone api_general {
                key {remote_host}
                events 1000
                window 1m
            }
        }
        reverse_proxy localhost:3001
    }

    handle {
        reverse_proxy localhost:3001
    }
}

This limits API clients to 1,000 requests per minute (about 16 per second), which is reasonable for most APIs.

Rate Limit File Downloads

Prevent users from downloading massive amounts of files:

example.com {
    handle /download/* {
        rate_limit {
            zone downloads {
                key {remote_host}
                events 20
                window 1h
            }
        }
        reverse_proxy localhost:3000
    }

    handle {
        reverse_proxy localhost:3000
    }
}

Allows 20 downloads per hour per IP (about 1 every 3 minutes).


Pattern 3: Rate Limiting by Different Keys

Instead of identifying clients by IP, identify them by user account, API token, or other criteria.

Rate Limit by API Token (Per User)

When users authenticate with an API token, rate limit them individually:

api.example.com {
    rate_limit {
        zone api_user {
            key {http.request.header.Authorization}
            events 1000
            window 1m
        }
    }

    reverse_proxy localhost:3001
}

Benefits:

  • Each API user has their own quota

  • Even if one user's IP makes 1,000 requests, they only get their allocation

  • Multiple users from the same office (same IP) don't interfere with each other

  • Users on mobile networks (changing IPs) aren't penalized

Rate Limit by Session Cookie (Per User)

For web applications where users are logged in via session cookies:

example.com {
    handle /api/* {
        rate_limit {
            zone api_session {
                key {http.request.cookie.sessionid}
                events 500
                window 1m
            }
        }
        reverse_proxy localhost:3000
    }

    handle {
        reverse_proxy localhost:3000
    }
}

Benefit: Anonymous users share a limit, but authenticated users have individual limits.
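The caveat is that requests without the cookie all produce the same empty key and therefore share one bucket. If your backend computes the rate limit key itself, a hypothetical fallback strategy looks like:

```python
def rate_limit_key(cookies, remote_addr):
    """Key authenticated users by their session id; fall back to the client IP
    for anonymous users so they don't all collapse into one shared bucket."""
    session = cookies.get("sessionid")
    return session if session else f"ip:{remote_addr}"

print(rate_limit_key({"sessionid": "xyz-789"}, "203.0.113.10"))  # xyz-789
print(rate_limit_key({}, "203.0.113.10"))                        # ip:203.0.113.10
```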

Rate Limit by Custom Header

If clients send a custom identifier header:

api.example.com {
    rate_limit {
        zone api_client {
            key {http.request.header.X-Client-ID}
            events 500
            window 1m
        }
    }

    reverse_proxy localhost:3001
}

Pattern 4: Tiered Rate Limiting

Different rules for different types of clients.

Premium vs. Free Users

Premium users get higher limits:

api.example.com {
    # Premium users (identified by special token)
    @premium {
        header X-Account-Type premium
    }

    handle @premium {
        rate_limit {
            zone premium {
                key {http.request.header.X-API-Key}
                events 10000
                window 1m
            }
        }
        reverse_proxy localhost:3001
    }

    # Free users (default)
    handle {
        rate_limit {
            zone free {
                key {http.request.header.X-Client-ID}
                events 100
                window 1m
            }
        }
        reverse_proxy localhost:3001
    }
}

Tiers:

  • Premium users: 10,000 requests/minute

  • Free users: 100 requests/minute

Admin vs. Regular Users

Admins bypass rate limiting:

example.com {
    @admin {
        header X-User-Role admin
    }

    handle @admin {
        # No rate limiting for admins
        reverse_proxy localhost:3000
    }

    # Regular users
    handle {
        rate_limit {
            zone general {
                key {remote_host}
                events 500
                window 1m
            }
        }
        reverse_proxy localhost:3000
    }
}

Real-World Scenarios

Scenario 1: E-Commerce Application

shop.example.com {
    # Aggressive rate limit on login (prevent credential stuffing)
    handle /login {
        rate_limit {
            zone login {
                key {remote_host}
                events 5
                window 1m
            }
        }
        reverse_proxy localhost:3000
    }

    # Aggressive on password reset (prevent enumeration)
    handle /forgot-password {
        rate_limit {
            zone password_reset {
                key {remote_host}
                events 3
                window 1h
            }
        }
        reverse_proxy localhost:3000
    }

    # Moderate on API (prevent scraping)
    handle /api/* {
        rate_limit {
            zone api {
                key {remote_host}
                events 500
                window 1m
            }
        }
        reverse_proxy localhost:3000
    }

    # Normal browsing unrestricted
    handle {
        reverse_proxy localhost:3000
    }
}

Scenario 2: Public API Service

api.example.com {
    # Unauthenticated requests (strict limit)
    handle /api/v1/* {
        @unauthenticated {
            not header Authorization *
        }

        handle @unauthenticated {
            rate_limit {
                zone public {
                    key {remote_host}
                    events 100
                    window 1m
                }
            }
            reverse_proxy localhost:3001
        }

        # Authenticated requests (generous limit)
        handle {
            rate_limit {
                zone authenticated {
                    key {http.request.header.Authorization}
                    events 5000
                    window 1m
                }
            }
            reverse_proxy localhost:3001
        }
    }

    handle {
        reverse_proxy localhost:3001
    }
}

Scenario 3: Multi-Tenant SaaS

app.example.com {
    # Each tenant gets their own rate limit based on their API key
    rate_limit {
        zone tenant {
            key {http.request.header.X-Tenant-Key}
            events 10000
            window 1h
        }
    }

    # Burst protection (no more than 500/minute from one tenant)
    rate_limit {
        zone tenant_burst {
            key {http.request.header.X-Tenant-Key}
            events 500
            window 1m
        }
    }

    reverse_proxy localhost:3000
}

Customizing Error Responses

When a client exceeds the rate limit, they get a 429 error. You can customize the message:

Returning a JSON Error Response

For APIs, return JSON. When the limit is exceeded, the rate_limit handler triggers an HTTP 429 error, which you can catch with handle_errors:

api.example.com {
    rate_limit {
        zone api {
            key {remote_host}
            events 100
            window 1m
        }
    }

    handle_errors {
        @ratelimited expression {err.status_code} == 429
        handle @ratelimited {
            header Content-Type application/json
            respond `{"error":"Rate limit exceeded. Maximum 100 requests per minute."}` 429
        }
    }

    reverse_proxy localhost:3001
}

Custom HTML Page

For websites, return HTML:

example.com {
    rate_limit {
        zone general {
            key {remote_host}
            events 100
            window 1m
        }
    }

    handle_errors {
        @ratelimited expression {err.status_code} == 429
        handle @ratelimited {
            header Content-Type "text/html; charset=utf-8"
            respond `<!DOCTYPE html>
<html>
<head>
    <title>Rate Limit Exceeded</title>
</head>
<body>
    <h1>Too Many Requests</h1>
    <p>You've exceeded the rate limit. Please wait before making more requests.</p>
</body>
</html>` 429
        }
    }

    reverse_proxy localhost:3000
}

Handling Legitimate Traffic Spikes

Sometimes legitimate traffic exceeds your rate limit (news article, social media viral moment). Handle this gracefully:

Burst-Friendly Limits

Allow short bursts while enforcing sustained limits:

example.com {
    # Allow bursts (1000 in 10 seconds)
    rate_limit {
        zone burst {
            key {remote_host}
            events 1000
            window 10s
        }
    }

    # Enforce sustained rate (2000 per hour)
    rate_limit {
        zone sustained {
            key {remote_host}
            events 2000
            window 1h
        }
    }

    reverse_proxy localhost:3000
}

A normal user doing legitimate activity (fast page loads, form submissions) won't hit the burst limit. A bot scraping continuously will hit the sustained limit.

Increased Limits During High-Traffic Events

Temporarily increase limits during anticipated high-traffic periods:

# Normal operation
example.com {
    rate_limit {
        zone general {
            key {remote_host}
            events 500
            window 1m
        }
    }

    reverse_proxy localhost:3000
}

# During Black Friday / special events, increase limits
# (You'd temporarily change this)
example.com {
    rate_limit {
        zone general {
            key {remote_host}
            events 2000        # 4x increase
            window 1m
        }
    }

    reverse_proxy localhost:3000
}

Common Mistakes to Avoid

Mistake 1: Rate Limiting by IP When Behind a Proxy

If your traffic goes through a load balancer, reverse proxy, or CDN, all traffic appears to come from the proxy's IP, not the user's real IP.

# ❌ WRONG - Behind Cloudflare, all users appear as Cloudflare's IP
rate_limit {
    zone api {
        key {remote_host}
        events 100
        window 1m
    }
}

Result: All users share one limit. After one user makes 100 requests, everyone else is blocked.

Solution: Key on the client address that the proxy passes along. The X-Forwarded-For header is the common choice, but remember that clients can forge it unless your proxy overwrites it; if you configure Caddy's trusted_proxies option, the {client_ip} placeholder resolves to the verified client address and makes a safer key.

# ✅ CORRECT
rate_limit {
    zone api {
        key {http.request.header.X-Forwarded-For}
        events 100
        window 1m
    }
}

Or for Cloudflare:

# ✅ CORRECT for Cloudflare
rate_limit {
    zone api {
        key {http.request.header.CF-Connecting-IP}
        events 100
        window 1m
    }
}
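If you enforce limits in your own application instead, parsing the header yourself is straightforward. The sketch below takes the left-most X-Forwarded-For entry and, as above, is only safe when the header was set by a proxy you trust:

```python
def client_ip(xff_header, remote_addr):
    """Return the left-most X-Forwarded-For entry, falling back to the socket
    address. Only trustworthy when a proxy you control sets the header."""
    if xff_header:
        return xff_header.split(",")[0].strip()
    return remote_addr

print(client_ip("203.0.113.10, 198.51.100.1", "10.0.0.5"))  # 203.0.113.10
print(client_ip("", "10.0.0.5"))                            # 10.0.0.5
```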

Mistake 2: Setting Limits Too Strict

If your limits are too aggressive, legitimate users get blocked:

# ❌ WRONG - Too strict
rate_limit {
    zone api {
        key {remote_host}
        events 5          # Only 5 requests per minute
        window 1m
    }
}

A user paginating through results (5 clicks) instantly hits the limit.

Solution: Set reasonable limits based on expected usage:

# ✅ CORRECT
rate_limit {
    zone api {
        key {remote_host}
        events 1000       # Much more reasonable
        window 1m
    }
}

Mistake 3: Forgetting to Rate Limit Sensitive Endpoints

Leaving login, password reset, or API endpoints unprotected:

# ❌ WRONG - No rate limit on sensitive routes
example.com {
    handle /login {
        reverse_proxy localhost:3000
    }

    handle {
        reverse_proxy localhost:3000
    }
}

Solution: Explicitly add rate limits to sensitive routes:

# ✅ CORRECT
example.com {
    handle /login {
        rate_limit {
            zone login {
                key {remote_host}
                events 10
                window 1m
            }
        }
        reverse_proxy localhost:3000
    }

    handle {
        reverse_proxy localhost:3000
    }
}

Mistake 4: Using Inconsistent Time Windows

Mixing different time windows makes rules confusing:

# ❌ CONFUSING
rate_limit {
    zone quick {
        key {remote_host}
        events 50
        window 30s
    }
}

rate_limit {
    zone medium {
        key {remote_host}
        events 100
        window 2m
    }
}

rate_limit {
    zone long {
        key {remote_host}
        events 1000
        window 3h
    }
}

Solution: Use standard time windows (10s, 1m, 1h, 1d):

# ✅ CLEAR AND UNDERSTANDABLE
rate_limit {
    zone burst {
        key {remote_host}
        events 100
        window 10s       # Burst limit
    }
}

rate_limit {
    zone sustained {
        key {remote_host}
        events 1000
        window 1m        # Sustained rate
    }
}

Mistake 5: Not Testing Before Deploying

Deploy rate limiting in production without testing, and you might accidentally block all legitimate traffic:

# ❌ WRONG - Deploy without testing limits
rate_limit {
    zone api {
        key {remote_host}
        events 10
        window 1m
    }
}

Solution: Test thoroughly before deploying:

# Send 15 rapid requests from one client
for i in {1..15}; do
    code=$(curl -s -o /dev/null -w "%{http_code}" https://api.example.com/endpoint)
    echo "Request $i: $code"
done

# With a limit of 10 per minute, requests 11-15 should return 429

Monitoring and Debugging

Check Rate Limit Status

Enable Caddy logging to see rate limit behavior:

example.com {
    log {
        output stdout
        format json
    }

    rate_limit {
        zone api {
            key {remote_host}
            events 100
            window 1m
        }
    }

    reverse_proxy localhost:3000
}

View logs:

# If running as service
sudo journalctl -u caddy -f

# If running in Docker
docker logs -f caddy-container
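Once JSON access logs are flowing, you can tally 429s per client offline to spot abusers. A small sketch, assuming the field names of Caddy's JSON access log format (status at the top level, the client address under request.remote_ip in recent versions):

```python
import json

# Tally 429 responses per client IP from Caddy JSON access log lines.
log_lines = [
    '{"request":{"remote_ip":"203.0.113.10"},"status":429}',
    '{"request":{"remote_ip":"203.0.113.10"},"status":429}',
    '{"request":{"remote_ip":"198.51.100.20"},"status":200}',
]

hits = {}
for line in log_lines:
    entry = json.loads(line)
    if entry["status"] == 429:
        ip = entry["request"]["remote_ip"]
        hits[ip] = hits.get(ip, 0) + 1

print(hits)  # {'203.0.113.10': 2}
```

An IP that racks up hundreds of 429s is a strong candidate for an outright block.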

Track Rate Limit Headers

The rate_limit module automatically adds a Retry-After header to 429 responses, telling clients how many seconds to wait. It does not expose placeholders for the remaining quota, so dynamic X-RateLimit-Remaining-style headers would have to come from your backend. You can still advertise the static policy:

example.com {
    header X-RateLimit-Limit "100"

    rate_limit {
        zone api {
            key {remote_host}
            events 100
            window 1m
        }
    }

    reverse_proxy localhost:3000
}

Clients can then read these headers and pace themselves before hitting the limit.
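On the client side, a well-behaved consumer should honor Retry-After when it receives a 429 and otherwise back off exponentially. A hypothetical helper:

```python
def backoff_delay(retry_after, attempt, base=1.0, cap=60.0):
    """Prefer the server's Retry-After value (in seconds) when present;
    otherwise fall back to capped exponential backoff."""
    if retry_after is not None:
        return min(float(retry_after), cap)
    return min(base * (2 ** attempt), cap)

print(backoff_delay("5", attempt=0))   # 5.0 - the server told us how long to wait
print(backoff_delay(None, attempt=3))  # 8.0 - 1 * 2**3, capped at 60
```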


Complete Production-Ready Examples

Full-Featured E-Commerce Setup

shop.example.com {
    # Aggressive rate limit on login (prevent credential stuffing)
    handle /login {
        rate_limit {
            zone login {
                key {remote_host}
                events 10
                window 1m
            }
        }
        reverse_proxy localhost:3000
    }

    # Strict on password reset (prevent enumeration)
    handle /api/auth/password-reset {
        rate_limit {
            zone password_reset {
                key {remote_host}
                events 5
                window 1h
            }
        }
        reverse_proxy localhost:3000
    }

    # Moderate on checkout (prevent fraud)
    handle /api/checkout {
        rate_limit {
            zone checkout {
                key {remote_host}
                events 20
                window 1h
            }
        }
        reverse_proxy localhost:3000
    }

    # Reasonable API limit (prevent scraping)
    handle /api/* {
        rate_limit {
            zone api {
                key {remote_host}
                events 500
                window 1m
            }
        }
        reverse_proxy localhost:3000
    }

    # Browsing unrestricted
    handle {
        reverse_proxy localhost:3000
    }
}

Tiered API Service

api.example.com {
    # Block known abusers first
    @attackers {
        remote_ip 192.0.2.0/24
    }

    handle @attackers {
        respond "Access denied" 403
    }

    # Unauthenticated public API (strict)
    handle /api/v1/public/* {
        rate_limit {
            zone public {
                key {remote_host}
                events 100
                window 1m
            }
        }
        reverse_proxy localhost:3001
    }

    # Authenticated API (generous)
    handle /api/v1/* {
        rate_limit {
            zone authenticated {
                key {http.request.header.X-API-Key}
                events 5000
                window 1m
            }
        }
        reverse_proxy localhost:3001
    }

    # Admin endpoints (no limit, but must authenticate)
    handle /api/admin/* {
        @admin {
            header X-User-Role admin
        }

        handle @admin {
            reverse_proxy localhost:3002
        }

        respond "Admin access required" 401
    }

    respond "Not found" 404
}

Key Takeaways

  1. Rate limiting is essential for preventing brute-force attacks, DDoS, and accidental abuse.

  2. Protect sensitive endpoints (login, password reset, API) more aggressively than public routes.

  3. Understand the difference between keys:

    • remote_host (IP address) - Simple but breaks behind proxies

    • API tokens - Better for identifying individual users

    • Session cookies - Good for web apps with user accounts

  4. Test before deploying to avoid blocking legitimate traffic.

  5. Use appropriate time windows:

    • 10-30 seconds for burst limits

    • 1 minute for API rate limits

    • 1 hour for sensitive operations like password reset

  6. Handle proxies correctly - Use X-Forwarded-For or provider-specific headers.

  7. Monitor rate limit behavior via logs to identify legitimate issues vs. actual attacks.

  8. Be generous on legitimate routes - Only be strict on endpoints that are attack targets.

  9. Combine with other security - Rate limiting alone isn't enough; pair with authentication, input validation, and IP filtering.

  10. Document your limits - Keep comments explaining why each limit is set to its value.

