
Rate Limiting

This page covers per-minute and per-day request limits, the response headers that report them, and monthly quotas.

Rate Limit Headers

Every response includes rate limit information:

X-RateLimit-Limit: 100
X-RateLimit-Remaining: 99
X-RateLimit-Reset: 1702837200
Header                   Description
X-RateLimit-Limit        Maximum requests per window
X-RateLimit-Remaining    Remaining requests in window
X-RateLimit-Reset        Unix timestamp when limit resets
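
As a minimal sketch, the three headers above can be parsed into numbers in one place. The helper name `parseRateLimit` is hypothetical and not part of the API, and the code assumes `X-RateLimit-Reset` is expressed in seconds, as the sample value 1702837200 suggests.

```javascript
// Hypothetical helper: parse the rate limit headers into numbers.
// `headers` is anything with a get(name) method, such as a fetch Headers object.
// Assumption: X-RateLimit-Reset is a Unix timestamp in seconds.
function parseRateLimit(headers, nowMs = Date.now()) {
  const toInt = (name) => {
    const raw = headers.get(name);
    if (raw === null || raw === undefined) return null; // header absent
    const n = Number.parseInt(raw, 10);
    return Number.isNaN(n) ? null : n;
  };

  const limit = toInt('X-RateLimit-Limit');
  const remaining = toInt('X-RateLimit-Remaining');
  const reset = toInt('X-RateLimit-Reset');

  // Seconds until the window resets, never negative.
  const secondsUntilReset =
    reset === null ? null : Math.max(0, reset - Math.floor(nowMs / 1000));

  return { limit, remaining, reset, secondsUntilReset };
}
```

With a fetch response, this could be called as `parseRateLimit(response.headers)`. Fields come back as `null` when a header is missing, so callers can tell "no data" apart from "zero remaining".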

Default Limits

Plan          Requests/min    Requests/day
Free          10              100
Starter       60              1,000
Pro           300             10,000
Enterprise    Custom          Custom

Rate Limit Exceeded

When you exceed the limit, the API returns the following status and error body:

Status: 429 Too Many Requests

{
  "error": "rate_limit_exceeded",
  "message": "Rate limit exceeded. Try again in 60 seconds.",
  "details": {
    "limit": 100,
    "remaining": 0,
    "reset": 1702837200,
    "retry_after": 60
  },
  "status": "error"
}
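
One way to turn this error body into a wait time is sketched below. The function name `retryDelayMs` and the one-second fallback are assumptions made for illustration. The sketch also assumes `retry_after` is in seconds and `reset` is a Unix timestamp in seconds, which matches the sample values above.

```javascript
// Hypothetical helper: derive a wait time in milliseconds from a 429 error body.
// Assumptions: details.retry_after is in seconds; details.reset is Unix seconds.
function retryDelayMs(errorBody, nowMs = Date.now(), fallbackMs = 1000) {
  const details = (errorBody && errorBody.details) || {};

  // Prefer the explicit retry_after value when the server provides one.
  if (Number.isFinite(details.retry_after)) {
    return Math.max(0, details.retry_after * 1000);
  }

  // Otherwise wait until the reset timestamp.
  if (Number.isFinite(details.reset)) {
    return Math.max(0, details.reset * 1000 - nowMs);
  }

  // No timing information at all: use a fixed fallback (an assumption).
  return fallbackMs;
}
```

For the sample body above, `retryDelayMs(body)` yields 60000 because `retry_after` is 60.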

Best Practices

Implement Exponential Backoff

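// Resolve after `ms` milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function callWithRetry(fn, maxRetries = 3) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return await fn();
    } catch (error) {
      // Only retry rate limit errors, and never after the final attempt.
      const isLastAttempt = attempt === maxRetries - 1;
      if (error.status !== 429 || isLastAttempt) throw error;

      // Prefer the server's retry_after (seconds); otherwise wait 1s, 2s, 4s, ...
      const retryAfter = error.details?.retry_after || Math.pow(2, attempt);
      await sleep(retryAfter * 1000);
    }
  }
}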
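
When many clients back off on the same schedule, their retries can arrive together. A common variation, which is not part of the example above, is to cap the delay and add random jitter. The sketch below shows that variation; the name `backoffDelayMs` and the default base and cap are illustrative assumptions, not values from the API.

```javascript
// Hypothetical helper: capped exponential backoff with "full jitter".
// attempt 0 waits up to baseMs, attempt 1 up to 2 * baseMs, and so on, up to capMs.
// `random` is injectable so the schedule can be made deterministic in tests.
function backoffDelayMs(
  attempt,
  { baseMs = 1000, capMs = 30000, random = Math.random } = {}
) {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  // Pick a uniformly random delay in [0, ceiling).
  return Math.floor(random() * ceiling);
}
```

A retry loop could use `backoffDelayMs(attempt)` as the fallback delay whenever the server does not supply `retry_after`.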

Monitor Rate Limit Headers

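const response = await fetch('/api/v3/llm/call', options);

const remaining = response.headers.get('X-RateLimit-Remaining');
const reset = response.headers.get('X-RateLimit-Reset');

// Header values are strings, or null when the header is absent; convert before comparing.
if (remaining !== null && Number(remaining) < 10) {
  console.warn(`Low rate limit: ${remaining} requests remaining (resets at Unix time ${reset})`);
}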

Use Request Queuing

For high-volume applications, implement a queue to smooth out request bursts:

class RequestQueue {
  constructor(maxPerSecond = 10) {
    this.queue = [];
    this.processing = false;
    // Minimum gap in milliseconds between the end of one request and the start of the next.
    this.interval = 1000 / maxPerSecond;
  }

  // Enqueue a function that returns a promise; resolves or rejects with its result.
  add(request) {
    return new Promise((resolve, reject) => {
      this.queue.push({ request, resolve, reject });
      this.process();
    });
  }

  async process() {
    // Only one request runs at a time.
    if (this.processing || this.queue.length === 0) return;

    this.processing = true;
    const { request, resolve, reject } = this.queue.shift();

    try {
      const result = await request();
      resolve(result);
    } catch (error) {
      reject(error);
    }

    // Wait one interval before starting the next queued request.
    setTimeout(() => {
      this.processing = false;
      this.process();
    }, this.interval);
  }
}

Quota Management

Monthly quotas are tracked separately from rate limits.

Check Quota

curl https://llm.zihin.ai/api/quota \
  -H "X-Api-Key: zhn_live_xxxxx"

Response:

{
  "success": true,
  "quota": {
    "limit": 10000,
    "used": 2500,
    "remaining": 7500,
    "reset_date": "2025-02-01T00:00:00.000Z"
  }
}
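
A client might condense this response into a usage percentage and the time left before the quota resets. The following is a sketch only: the helper name `summarizeQuota` is hypothetical, and it assumes the response always has the shape shown above.

```javascript
// Hypothetical helper: summarize a quota response body.
// Assumption: body.quota has limit, used, remaining, and an ISO 8601 reset_date.
function summarizeQuota(body, nowMs = Date.now()) {
  const { limit, used, remaining, reset_date } = body.quota;

  // Guard against division by zero for a zero-limit plan.
  const percentUsed = limit > 0 ? (used / limit) * 100 : 0;

  // Whole days until the reset date, rounded up, never negative.
  const msUntilReset = Math.max(0, Date.parse(reset_date) - nowMs);
  const daysUntilReset = Math.ceil(msUntilReset / 86400000);

  return { percentUsed, remaining, daysUntilReset };
}
```

For the sample response, `percentUsed` is 25 and `remaining` is 7500; `daysUntilReset` depends on the current time.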

Quota Exceeded

{
  "error": "quota_exceeded",
  "message": "Monthly quota exceeded",
  "details": {
    "limit": 10000,
    "used": 10000,
    "reset_date": "2025-02-01T00:00:00.000Z"
  },
  "status": "error"
}
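
Because quotas and rate limits are tracked separately, a client may want to handle the two errors differently. The sketch below branches on the `error` field and assumes that retrying a `quota_exceeded` error before `reset_date` would not succeed; that behavior is an assumption, not something this page states. The name `classifyLimitError` and the one-second default wait are hypothetical.

```javascript
// Hypothetical helper: decide how to react to a limit-related error body.
// Assumption: quota_exceeded cannot be resolved by retrying before reset_date.
function classifyLimitError(body) {
  switch (body && body.error) {
    case 'rate_limit_exceeded':
      // Short wait, then retry. retry_after is assumed to be in seconds.
      return { retry: true, waitMs: (body.details?.retry_after ?? 1) * 1000 };
    case 'quota_exceeded':
      // Stop sending requests until the quota resets.
      return { retry: false, resumeAt: body.details?.reset_date ?? null };
    default:
      // Not a limit error; leave handling to the caller.
      return { retry: false };
  }
}
```

A caller could pause its request queue when `retry` is `false` and `resumeAt` is set, and fall back to its normal backoff logic when `retry` is `true`.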