# API Overview

Conventions, base URLs, and common patterns for all Zihin API endpoints.
## Base URLs
| Service | Base URL | Description |
|---|---|---|
| LLM API | https://llm.zihin.ai | LLM calls, models, agents, telemetry |
| Tenant API | https://tenant-api.zihin.ai | Team management, roles, invites |
## Authentication

All requests must be authenticated with either an API key or a JWT. See Authentication for details.
```bash
# API Key
curl -H "X-Api-Key: YOUR_API_KEY" https://llm.zihin.ai/api/v3/llm/public/call

# JWT (multi-tenant)
curl -H "Authorization: Bearer <jwt>" -H "x-tenant-id: <uuid>" https://llm.zihin.ai/api/v3/llm/call
```
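The two schemes differ only in the headers they send. As a sketch, a small TypeScript helper (the `Auth` type and `authHeaders` function are illustrative, not part of an official Zihin SDK) can produce the correct header set for either scheme:

```typescript
// Either an API key, or a JWT plus the tenant it targets.
type Auth =
  | { kind: "apiKey"; key: string }
  | { kind: "jwt"; token: string; tenantId: string };

// Build the authentication headers for a request.
function authHeaders(auth: Auth): Record<string, string> {
  if (auth.kind === "apiKey") {
    return { "X-Api-Key": auth.key };
  }
  return {
    Authorization: `Bearer ${auth.token}`,
    "x-tenant-id": auth.tenantId,
  };
}
```

Spread the returned object into your HTTP client's headers alongside `Content-Type`.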
## Request Format

All POST and PUT requests send a JSON-encoded body:
```bash
curl -X POST https://llm.zihin.ai/api/v3/llm/public/call \
  -H "X-Api-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"query": "Hello", "model": "auto"}'
```
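The same request can be built programmatically. A minimal sketch in TypeScript, assuming a runtime with global `fetch` (Node 18+); `buildCallRequest` is an illustrative helper, not part of an official SDK:

```typescript
// Minimal shape of the call body used in the curl example above.
interface LLMCallBody {
  query: string;
  model?: string;
}

// Assemble the URL and fetch options for a public LLM call.
function buildCallRequest(apiKey: string, body: LLMCallBody) {
  return {
    url: "https://llm.zihin.ai/api/v3/llm/public/call",
    init: {
      method: "POST",
      headers: {
        "X-Api-Key": apiKey,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(body),
    },
  };
}
```

Usage: `const { url, init } = buildCallRequest(key, { query: "Hello", model: "auto" }); const res = await fetch(url, init);`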
## Response Format
All responses follow a standard structure:
```jsonc
// Success
{
  success: true,
  // ... response data
}

// Error
{
  error: string,
  message: string,
  status: "error",
  details?: object
}
```
## Pagination
List endpoints support pagination:
```http
GET /api/agents?page=1&limit=20
```

```jsonc
{
  data: [...],
  pagination: {
    page: number,
    limit: number,
    total: number,
    pages: number
  }
}
```
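The `pagination` object tells a client when to stop: there are more results while `page < pages`. A minimal sketch of a next-page helper (`nextPageUrl` is illustrative, not part of any SDK):

```typescript
// Pagination metadata, as documented above.
interface Pagination {
  page: number;
  limit: number;
  total: number;
  pages: number;
}

// Return the URL of the next page, or null on the last page.
function nextPageUrl(base: string, p: Pagination): string | null {
  if (p.page >= p.pages) return null;
  return `${base}?page=${p.page + 1}&limit=${p.limit}`;
}
```

Looping with this helper until it returns `null` walks every page of a list endpoint.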
## Common Schemas

### LLMRequest
```typescript
interface LLMRequest {
  query: string;          // Required: the prompt
  model?: string;         // Model ID or "auto" (default: "auto")
  provider?: string;      // Provider name (auto-detected from model)
  messages?: Message[];   // Conversation history
  temperature?: number;   // 0-2, default 0.7
  max_tokens?: number;    // Max response tokens
  stream?: boolean;       // Enable streaming
}
```
### LLMResponse
```typescript
interface LLMResponse {
  success: boolean;
  response: string;
  model: string;
  provider: string;
  usage: {
    input_tokens: number;
    output_tokens: number;
    total_tokens: number;
  };
  cost: number;
  latency_ms: number;
  routing?: {
    is_auto_routed: boolean;
    model_chosen: string;
    confidence: number;
    reason?: string;
  };
}
```
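As a sketch of consuming this schema, a small helper that condenses the usage and routing fields into a log line (`usageSummary` is a hypothetical name, not an SDK export; field names follow the interface above):

```typescript
// Response shape, as documented above (routing omitted for brevity).
interface LLMResponse {
  success: boolean;
  response: string;
  model: string;
  provider: string;
  usage: {
    input_tokens: number;
    output_tokens: number;
    total_tokens: number;
  };
  cost: number;
  latency_ms: number;
}

// One-line summary suitable for request logging.
function usageSummary(res: LLMResponse): string {
  return `${res.model} via ${res.provider}: ` +
    `${res.usage.total_tokens} tokens, $${res.cost.toFixed(4)}, ${res.latency_ms} ms`;
}
```

This keeps per-call cost and latency visible without logging the full response body.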
## Rate Limits
See Rate Limits for per-plan limits and headers.