Rate Limiting
Request limits and quotas.
Rate Limit Headers
Every response includes rate limit information:
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 99
X-RateLimit-Reset: 1702837200
| Header | Description |
|---|---|
| X-RateLimit-Limit | Maximum requests per window |
| X-RateLimit-Remaining | Remaining requests in the current window |
| X-RateLimit-Reset | Unix timestamp (in seconds) when the limit resets |
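As a rough illustration of reading these headers, the sketch below turns them into numbers and derives the seconds left in the window. The function name `parseRateLimitHeaders` and the returned `secondsUntilReset` field are hypothetical, not part of the API; the only assumption about `headers` is that it has a `get(name)` method, as a fetch `Headers` object does.

```javascript
// Sketch: parse the three rate limit headers into numbers.
// `headers` is anything with a get(name) method (e.g. a fetch Headers object).
// parseRateLimitHeaders is a hypothetical helper, not part of the API.
function parseRateLimitHeaders(headers, nowMs = Date.now()) {
  const toInt = (name) => {
    const raw = headers.get(name);
    const value =
      raw === null || raw === undefined ? NaN : Number.parseInt(raw, 10);
    return Number.isNaN(value) ? null : value; // null when missing or malformed
  };

  const limit = toInt('X-RateLimit-Limit');
  const remaining = toInt('X-RateLimit-Remaining');
  const reset = toInt('X-RateLimit-Reset'); // Unix timestamp in seconds

  // Seconds until the window resets, never negative.
  const secondsUntilReset =
    reset === null ? null : Math.max(0, reset - Math.floor(nowMs / 1000));

  return { limit, remaining, reset, secondsUntilReset };
}
```

Passing `nowMs` explicitly keeps the helper deterministic in tests; in normal use it can be omitted.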
Default Limits
| Plan | Requests/min | Requests/day |
|---|---|---|
| Free | 10 | 100 |
| Starter | 60 | 1,000 |
| Pro | 300 | 10,000 |
| Enterprise | Custom | Custom |
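The per-minute and per-day columns interact: sending at the full per-minute rate uses up the daily allowance quickly. The sketch below works through that arithmetic, assuming both limits apply at the same time and that a "day" is a 24-hour period (neither is stated explicitly above). The names `PLAN_LIMITS`, `minutesToExhaustDailyLimit`, and `evenSpacingMs` are illustrative only.

```javascript
// Limits copied from the table above (Enterprise is custom, so it is omitted).
const PLAN_LIMITS = {
  free: { perMinute: 10, perDay: 100 },
  starter: { perMinute: 60, perDay: 1000 },
  pro: { perMinute: 300, perDay: 10000 },
};

// Minutes of sending at the full per-minute rate before the daily limit is hit.
function minutesToExhaustDailyLimit(plan) {
  const { perMinute, perDay } = PLAN_LIMITS[plan];
  return perDay / perMinute;
}

// Average gap between requests (ms) that spreads the daily limit over 24 hours.
// Assumes the daily window is 24 hours long.
function evenSpacingMs(plan) {
  const MS_PER_DAY = 24 * 60 * 60 * 1000;
  return MS_PER_DAY / PLAN_LIMITS[plan].perDay;
}
```

Under these assumptions, the Free plan reaches its daily limit after 10 minutes at full speed, while an evenly spread workload would send one request every 864,000 ms (14.4 minutes).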
Rate Limit Exceeded
When you exceed the limit:
Status: 429 Too Many Requests
{
  "error": "rate_limit_exceeded",
  "message": "Rate limit exceeded. Try again in 60 seconds.",
  "details": {
    "limit": 100,
    "remaining": 0,
    "reset": 1702837200,
    "retry_after": 60
  },
  "status": "error"
}
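One way a client might turn this body into a wait time is sketched below. It assumes `retry_after` is in seconds and `reset` is a Unix timestamp in seconds, as the example values suggest. The function name `retryDelayMs` and the 1-second default are hypothetical.

```javascript
// Sketch: work out how long to wait (in ms) from a 429 response body.
// retryDelayMs is a hypothetical helper, not part of the API.
function retryDelayMs(body, nowMs = Date.now()) {
  const details = (body && body.details) || {};

  // Prefer the explicit retry_after value (assumed to be seconds).
  if (Number.isFinite(details.retry_after)) {
    return Math.max(0, details.retry_after * 1000);
  }
  // Otherwise fall back to the reset timestamp (assumed Unix seconds).
  if (Number.isFinite(details.reset)) {
    return Math.max(0, details.reset * 1000 - nowMs);
  }
  // Hypothetical default when neither field is present.
  return 1000;
}
```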
Best Practices
Implement Exponential Backoff
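// Resolve after `ms` milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function callWithRetry(fn, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    try {
      return await fn();
    } catch (error) {
      // Retry only on 429, and only while attempts remain.
      if (error.status === 429 && i < maxRetries - 1) {
        // Prefer the server's retry_after (seconds); otherwise back off 1s, 2s, 4s, ...
        const retryAfter = error.details?.retry_after || Math.pow(2, i);
        await sleep(retryAfter * 1000);
        continue;
      }
      // Any other error, or the final failed attempt, is passed to the caller.
      throw error;
    }
  }
}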
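When many clients are limited at the same moment, a fixed 1s, 2s, 4s schedule can make them all retry in lockstep. A common refinement, not something the API requires, is to randomize each delay ("full jitter") and cap it. The sketch below shows one way to compute such a delay; `backoffDelayMs`, `baseMs`, `maxMs`, and the injectable `random` parameter are all illustrative assumptions.

```javascript
// Sketch: exponential backoff delay with "full jitter" and an upper cap.
// attempt is 0-based; the delay is drawn uniformly from [0, ceiling).
function backoffDelayMs(
  attempt,
  { baseMs = 1000, maxMs = 30000, random = Math.random } = {}
) {
  // Exponential ceiling: baseMs, 2*baseMs, 4*baseMs, ... capped at maxMs.
  const ceiling = Math.min(maxMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}
```

A value from this function could replace the fallback delay used when the server gives no `retry_after`; a server-provided `retry_after` should still take priority.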
Monitor Rate Limit Headers
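const response = await fetch('/api/v3/llm/call', options);
// Header values are strings (or null if absent), so parse them before comparing.
const remaining = Number.parseInt(response.headers.get('X-RateLimit-Remaining'), 10);
const reset = Number.parseInt(response.headers.get('X-RateLimit-Reset'), 10);
// A missing header parses to NaN, which fails the comparison and skips the warning.
if (remaining < 10) {
  console.warn(`Low rate limit: ${remaining} requests remaining (resets at ${reset})`);
}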
Use Request Queuing
For high-volume applications, implement a queue to smooth out request bursts:
class RequestQueue {
  constructor(maxPerSecond = 10) {
    this.queue = [];
    this.processing = false;
    // Minimum gap between requests, in milliseconds.
    this.interval = 1000 / maxPerSecond;
  }

  // Enqueue a function that returns a promise; resolves with its result.
  async add(request) {
    return new Promise((resolve, reject) => {
      this.queue.push({ request, resolve, reject });
      this.process();
    });
  }

  // Run one queued request, then schedule the next after the interval.
  async process() {
    if (this.processing || this.queue.length === 0) return;
    this.processing = true;
    const { request, resolve, reject } = this.queue.shift();
    try {
      const result = await request();
      resolve(result);
    } catch (error) {
      reject(error);
    }
    setTimeout(() => {
      this.processing = false;
      this.process();
    }, this.interval);
  }
}
Quota Management
Monthly quotas are tracked separately from rate limits.
Check Quota
curl https://llm.zihin.ai/api/quota \
  -H "X-Api-Key: zhn_live_xxxxx"
Response:
{
  "success": true,
  "quota": {
    "limit": 10000,
    "used": 2500,
    "remaining": 7500,
    "reset_date": "2025-02-01T00:00:00.000Z"
  }
}
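To show how the `quota` object might be used, the sketch below derives the percentage consumed and the time left until `reset_date`. It assumes `reset_date` is always an ISO 8601 string like the one in the example; `summarizeQuota` and its return fields are hypothetical names.

```javascript
// Sketch: summarize the `quota` object from the response above.
// summarizeQuota is a hypothetical helper, not part of the API.
function summarizeQuota(quota, nowMs = Date.now()) {
  // Treat a zero limit as fully used to avoid dividing by zero.
  const percentUsed = quota.limit > 0 ? (quota.used / quota.limit) * 100 : 100;

  // reset_date is assumed to be an ISO 8601 timestamp.
  const msUntilReset = Math.max(0, Date.parse(quota.reset_date) - nowMs);
  const daysUntilReset = msUntilReset / (24 * 60 * 60 * 1000);

  return { percentUsed, daysUntilReset, remaining: quota.remaining };
}
```

An application could use `percentUsed` to raise an alert at some threshold of its choosing before the quota runs out.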
Quota Exceeded
{
  "error": "quota_exceeded",
  "message": "Monthly quota exceeded",
  "details": {
    "limit": 10000,
    "used": 10000,
    "reset_date": "2025-02-01T00:00:00.000Z"
  },
  "status": "error"
}
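Because quotas are tracked separately from rate limits, the two errors call for different handling: a `rate_limit_exceeded` error clears within the window, while a `quota_exceeded` error will not clear until `reset_date`, so short retries are unlikely to help. The sketch below branches on the `error` field only. The function name `classifyLimitError`, its return shape, and the 1-second fallback are assumptions for illustration.

```javascript
// Sketch: decide whether a limit-related error is worth retrying soon.
// classifyLimitError is a hypothetical helper, not part of the API.
function classifyLimitError(body, nowMs = Date.now()) {
  const details = (body && body.details) || {};

  switch (body && body.error) {
    case 'rate_limit_exceeded':
      // Short wait: retry_after is assumed to be seconds (fallback: 1 second).
      return { retry: true, waitMs: (details.retry_after ?? 1) * 1000 };
    case 'quota_exceeded':
      // No point retrying before the quota resets; report how long that is.
      return {
        retry: false,
        waitMs: Math.max(0, Date.parse(details.reset_date) - nowMs),
      };
    default:
      // Not a limit error; leave the decision to the caller.
      return { retry: false, waitMs: null };
  }
}
```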