Rate Limiting
Rate limit headers, per-endpoint limits, and retry strategies.
2 min read
Every API response includes rate limit headers so you can track your usage and back off before hitting 429s.
Headers#
| Header | Description |
|---|---|
X-RateLimit-Limit | Maximum requests in the current window |
X-RateLimit-Remaining | Requests remaining in the current window |
X-RateLimit-Reset | Unix timestamp when the window resets |
Retry-After | Seconds to wait (only on 429 responses) |
Limits by Endpoint#
| Endpoint Group | Limit | Scope |
|---|---|---|
POST /chat | 5 per 30 sec | Per user |
| Conversation creation | 10 per minute | Per user |
| Key management | 10 per minute | Per user |
| Deck mutations | 30 per minute | Per user |
| Deck reads/export | 30 per minute | Per user |
| Card imports | 5 per minute | Per user |
| Progress reads | 30 per minute | Per user |
| Curriculum reads | 30 per minute | Per user |
| Exercise generation | 5 per 5 minutes | Per user |
| Pronunciation scoring | 10 per minute | Per user |
| Webhook operations | 10 per minute | Per user |
Retry Strategy#
When you receive a 429 Too Many Requests response:
- Read the
Retry-Afterheader (seconds until you can retry) - Wait that duration before retrying
- If no
Retry-Afterheader, use exponential backoff starting at 1 second
TypeScript
async function fetchWithRetry(url: string, options: RequestInit, maxRetries = 3) {
for (let attempt = 0; attempt < maxRetries; attempt++) {
const response = await fetch(url, options);
if (response.status === 429) {
const retryAfter = response.headers.get("Retry-After");
const waitMs = retryAfter
? parseInt(retryAfter) * 1000
: Math.pow(2, attempt) * 1000;
await new Promise((r) => setTimeout(r, waitMs));
continue;
}
return response;
}
throw new Error("Max retries exceeded");
}
Tip
Read X-RateLimit-Remaining proactively. If it drops below 2, slow down your requests before hitting the limit.