Rate Limits
Per-endpoint rate limits, response headers, and how to handle 429 Too Many Requests correctly.
Kasar applies rate limits on sensitive endpoints to prevent abuse: credential brute-force, email spam, account enumeration, and runaway LLM cost. Limits are enforced with a sliding window stored in Redis and counted per the identity key documented below.
Rate limiting is applied in addition to authentication. An authenticated token is never immune to rate limits.
Response Headers
Every rate-limited response — whether allowed or denied — carries standard headers so you can adapt your client behavior:
| Header | Description |
|---|---|
| `X-RateLimit-Limit` | Maximum number of requests allowed within the window |
| `X-RateLimit-Remaining` | Requests still allowed in the current window |
| `X-RateLimit-Reset` | Epoch seconds (UTC) when the window resets |
| `X-RateLimit-Policy` | Identifier of the policy that was applied (e.g. `auth:magic-link`) |
| `Retry-After` | Only on 429: seconds to wait before retrying |
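As a rough illustration of how a client might read these headers, the sketch below parses them into a small record. The names `RateLimitInfo`, `parse_rate_limit_headers`, and `seconds_until_reset` are hypothetical helpers, not part of any Kasar SDK, and the code assumes all numeric headers carry plain integers as described in the table.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateLimitInfo:
    """Hypothetical container for the rate-limit headers listed above."""
    limit: int
    remaining: int
    reset_epoch: int
    policy: Optional[str]
    retry_after: Optional[int]  # only present on 429 responses


def parse_rate_limit_headers(headers: Mapping[str, str]) -> Optional[RateLimitInfo]:
    """Return the parsed headers, or None if the rate-limit headers are absent."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    h = {k.lower(): v for k, v in headers.items()}
    required = ("x-ratelimit-limit", "x-ratelimit-remaining", "x-ratelimit-reset")
    if any(name not in h for name in required):
        return None
    retry_after = h.get("retry-after")
    return RateLimitInfo(
        limit=int(h["x-ratelimit-limit"]),
        remaining=int(h["x-ratelimit-remaining"]),
        reset_epoch=int(h["x-ratelimit-reset"]),
        policy=h.get("x-ratelimit-policy"),
        retry_after=int(retry_after) if retry_after is not None else None,
    )


def seconds_until_reset(info: RateLimitInfo, now: float) -> float:
    """Seconds left until the window resets, never negative."""
    return max(0.0, info.reset_epoch - now)
```

A client could call `parse_rate_limit_headers(res.headers)` after each response and slow down when `remaining` approaches zero, rather than waiting for a 429.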
429 Response Body
When a request is denied, the API returns HTTP 429 Too Many Requests:
```json
{
  "error": "Too many requests",
  "policy": "auth:magic-link",
  "retryAfterSeconds": 420
}
```

Respect the `Retry-After` header. Implement exponential backoff for automated clients, and surface a clear message (e.g. "Please wait a few minutes before trying again") to end users.
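One way to combine exponential backoff with the server's hint is sketched below. The function name `backoff_delay` and its defaults (`base`, `cap`, `jitter`) are illustrative assumptions, not values prescribed by Kasar. The `retry_after` argument is meant to carry the `Retry-After` header value, or equivalently `retryAfterSeconds` from the body.

```python
import random
from typing import Optional


def backoff_delay(attempt: int, retry_after: Optional[float] = None,
                  base: float = 1.0, cap: float = 600.0, jitter: float = 0.1) -> float:
    """Seconds to wait before retry number `attempt` (0-based).

    All defaults here are assumptions chosen for illustration.
    """
    # Exponential term: base, 2*base, 4*base, ... capped at `cap`.
    exponential = min(cap, base * (2 ** attempt))
    # Never retry sooner than the server asked for.
    delay = max(exponential, retry_after or 0.0)
    # Add up to `jitter` (a fraction of the delay) so clients do not retry in lockstep.
    return delay + random.uniform(0, jitter * delay)
```

With the example body above (`retryAfterSeconds: 420`), the first retry would wait at least 420 seconds, since the server's hint dominates the 1-second exponential term.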
Current Policies
Each policy below applies to a specific endpoint or family of endpoints. The identity key column describes which attributes the limit is scoped to. For example, IP + email means that requests for two different emails from the same IP are counted in separate buckets, and so are requests for the same email from two different IPs.
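The scoping described above can be pictured as building one counter key per combination of attributes. The sketch below is purely illustrative: the function `bucket_key`, the key layout, and the email normalisation are assumptions, and Kasar's actual Redis key format is not documented here.

```python
from typing import Optional


def bucket_key(policy: str, ip: Optional[str] = None, email: Optional[str] = None,
               user_id: Optional[str] = None) -> str:
    """Build a hypothetical bucket key from the policy and its identity attributes."""
    parts = [policy]
    if ip is not None:
        parts.append(f"ip={ip}")
    if email is not None:
        # Trimming and lower-casing the email is an assumption, not documented behaviour.
        parts.append(f"email={email.strip().lower()}")
    if user_id is not None:
        parts.append(f"user={user_id}")
    return "|".join(parts)
```

Under this model, `bucket_key("auth:magic-link", ip=..., email=...)` yields a different key whenever either the IP or the email changes, which is why neither attribute alone shares a bucket.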
Authentication
| Policy | Endpoint | Limit | Window | Identity key |
|---|---|---|---|---|
| `auth:magic-link` | `POST /api/auth/magic-link` | 15 | 10 min | IP + email |
| `auth:signup` | `POST /api/auth/signup` | 10 | 1 hour | IP |
| `auth:verify-token` | `GET /api/auth/verify-token` | 20 | 10 min | IP + email |
| `auth:invite-verify` | `POST /api/auth/invite/verify` | 20 | 10 min | IP |
Why these limits?
- `auth:magic-link`: Prevents an attacker from spamming your users' inboxes or enumerating which emails have an account.
- `auth:signup`: Prevents bulk bot-driven organization creation.
- `auth:verify-token`: Caps brute-force attempts on verification tokens. Legitimate users only consume the link once.
- `auth:invite-verify`: Caps brute-force attempts on invitation tokens.
AI & LLM
| Policy | Endpoint | Limit | Window | Identity key |
|---|---|---|---|---|
| `agent:stream` | `POST /api/agent/stream` | 120 | 1 hour | Authenticated user id |
Why this limit? Each streaming run triggers Claude API calls whose cost scales with token usage. A per-user ceiling prevents a single compromised session from running up a large bill.
Handling 429 in Your Integration
TypeScript example:
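```typescript
// Calls the Kasar API and retries up to three times on 429, waiting as instructed by Retry-After.
async function callKasar(path: string, init: RequestInit, attempt = 0): Promise<Response> {
  const res = await fetch(`https://kasar.app${path}`, init)
  if (res.status === 429 && attempt < 3) {
    // Retry-After is in seconds; fall back to 60 if it is missing or not a positive number.
    const parsed = Number(res.headers.get('Retry-After'))
    const retryAfter = Number.isFinite(parsed) && parsed > 0 ? parsed : 60
    await new Promise((r) => setTimeout(r, retryAfter * 1000))
    return callKasar(path, init, attempt + 1)
  }
  return res
}
```

Python example: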
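```python
import time

import requests


def call_kasar(path, attempt=0, **kwargs):
    """Call the Kasar API, retrying up to three times on 429 (pass `method` in kwargs)."""
    res = requests.request(url=f"https://kasar.app{path}", **kwargs)
    if res.status_code == 429 and attempt < 3:
        # Retry-After is in seconds; fall back to 60 if it is missing or malformed.
        try:
            retry_after = int(res.headers.get("Retry-After", 60))
        except ValueError:
            retry_after = 60
        time.sleep(retry_after)
        return call_kasar(path, attempt + 1, **kwargs)
    return res
```

Operational Notes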
- Fail-open on infrastructure issues. If the Redis backend is unreachable, requests are allowed through rather than blocked. Rate limiting is a defense-in-depth measure and never becomes a single point of failure for authentication.
- Sliding window. The window slides with each request; it is not a fixed calendar bucket. A request made at `t=0` ages out exactly `windowSec` later.
- Policy IDs are stable. The `X-RateLimit-Policy` header value is part of the public contract. If you log it, you can rely on the identifier not changing.
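The sliding-window and fail-open behavior in the notes above can be sketched with an in-memory model. This is not Kasar's Redis implementation: the class `SlidingWindowLimiter`, the helper `allow_fail_open`, the use of `ConnectionError` to represent an unreachable store, and the choice not to record denied requests are all assumptions made for illustration.

```python
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """In-memory sliding-window log: at most `limit` hits per `window_sec` per key."""

    def __init__(self, limit: int, window_sec: float) -> None:
        self.limit = limit
        self.window_sec = window_sec
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, key: str, now: float) -> bool:
        hits = self._hits[key]
        # Drop timestamps that have aged out: a hit at t expires at t + window_sec.
        while hits and hits[0] + self.window_sec <= now:
            hits.popleft()
        if len(hits) >= self.limit:
            return False  # assumption: denied requests are not recorded
        hits.append(now)
        return True


def allow_fail_open(check: Callable[[], bool]) -> bool:
    """Run a limiter check, allowing the request if the backing store is unreachable."""
    try:
        return check()
    except ConnectionError:
        # Fail open: an infrastructure failure must not block the request.
        return True
```

For instance, with `limit=2` and `window_sec=600`, hits at `t=0` and `t=100` fill the window, a request at `t=599` is denied, and a request at `t=600` is allowed again because the hit at `t=0` has aged out.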
Requesting Higher Limits
If your integration has a legitimate need for a higher ceiling (dedicated tenant, enterprise workflow), contact support with:
- The policy name (`X-RateLimit-Policy`)
- The expected peak request rate
- A short description of the integration
We review requests case-by-case.