
API Overview

Conventions, base URLs, and common patterns for all Zihin API endpoints.

Base URLs

Service      Base URL                       Description
LLM API      https://llm.zihin.ai           LLM calls, models, agents, telemetry
Tenant API   https://tenant-api.zihin.ai    Team management, roles, invites

Authentication

All requests require authentication via API Key or JWT. See Authentication for details.

# API Key
curl -H "X-Api-Key: YOUR_API_KEY" https://llm.zihin.ai/api/v3/llm/public/call

# JWT (multi-tenant)
curl -H "Authorization: Bearer <jwt>" -H "x-tenant-id: <uuid>" https://llm.zihin.ai/api/v3/llm/call
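The two header styles above can be wrapped in a small helper. This is an illustrative sketch, not part of any official SDK; only the header names (`X-Api-Key`, `Authorization`, `x-tenant-id`) come from the examples above.

```typescript
// Build request headers for either auth mode.
type ApiKeyAuth = { apiKey: string };
type JwtAuth = { jwt: string; tenantId: string };

function authHeaders(auth: ApiKeyAuth | JwtAuth): Record<string, string> {
  if ("apiKey" in auth) {
    // API-key mode: single custom header.
    return { "X-Api-Key": auth.apiKey };
  }
  // JWT (multi-tenant) mode: bearer token plus tenant id.
  return {
    Authorization: `Bearer ${auth.jwt}`,
    "x-tenant-id": auth.tenantId,
  };
}
```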

Request Format

All POST and PUT requests send a JSON body:

curl -X POST https://llm.zihin.ai/api/v3/llm/public/call \
  -H "X-Api-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"query": "Hello", "model": "auto"}'
</curl is continued with backslashes; continuation lines indented for readability>
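The same call, built as a `fetch` request. The endpoint URL, headers, and body fields come from the curl example; the builder function itself is a hypothetical helper, shown here so the request shape can be inspected before sending.

```typescript
// Assemble the URL and fetch options for the public call endpoint.
function buildPublicCall(
  apiKey: string,
  query: string,
  model = "auto",
): { url: string; init: { method: string; headers: Record<string, string>; body: string } } {
  return {
    url: "https://llm.zihin.ai/api/v3/llm/public/call",
    init: {
      method: "POST",
      headers: {
        "X-Api-Key": apiKey,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ query, model }),
    },
  };
}
```

Pass the result straight to `fetch(url, init)` when you are ready to send it.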

Response Format

All responses follow a standard structure:

// Success
{
  success: true,
  // ... response data
}

// Error
{
  error: string,
  message: string,
  status: "error",
  details?: object
}
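Since every error body carries `status: "error"`, a caller can distinguish the two shapes with a type guard. The field names mirror the error structure above; the guard itself is illustrative.

```typescript
// Error response shape, as documented above.
interface ApiError {
  error: string;
  message: string;
  status: "error";
  details?: object;
}

// Narrow an unknown parsed body to the error shape.
function isApiError(body: unknown): body is ApiError {
  return (
    typeof body === "object" &&
    body !== null &&
    (body as Record<string, unknown>).status === "error"
  );
}
```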

Pagination

List endpoints support pagination:

GET /api/agents?page=1&limit=20
{
  data: [...],
  pagination: {
    page: number,
    limit: number,
    total: number,
    pages: number
  }
}
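Walking every page of a list endpoint then looks like the sketch below. `fetchPage` stands in for whatever client call returns the `{ data, pagination }` shape above; the loop logic is an assumption about typical usage, not a documented client.

```typescript
// One page of a paginated list response.
interface Page<T> {
  data: T[];
  pagination: { page: number; limit: number; total: number; pages: number };
}

// Collect all items by requesting pages until pagination.pages is reached.
async function fetchAll<T>(
  fetchPage: (page: number, limit: number) => Promise<Page<T>>,
  limit = 20,
): Promise<T[]> {
  const all: T[] = [];
  let page = 1;
  let pages = 1;
  do {
    const res = await fetchPage(page, limit);
    all.push(...res.data);
    pages = res.pagination.pages;
    page++;
  } while (page <= pages);
  return all;
}
```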

Common Schemas

LLMRequest

interface LLMRequest {
  query: string;          // Required: the prompt
  model?: string;         // Model ID or "auto" (default: "auto")
  provider?: string;      // Provider name (auto-detected from model)
  messages?: Message[];   // Conversation history
  temperature?: number;   // 0-2, default 0.7
  max_tokens?: number;    // Max response tokens
  stream?: boolean;       // Enable streaming
}
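A caller can make the documented defaults (model `"auto"`, temperature 0.7) explicit before sending. The `withDefaults` helper is hypothetical, and the `Message` shape is assumed here (`role`/`content`) since the doc references `Message[]` without defining it.

```typescript
// Assumed message shape; not defined in this section of the docs.
interface Message {
  role: string;
  content: string;
}

interface LLMRequest {
  query: string;
  model?: string;
  provider?: string;
  messages?: Message[];
  temperature?: number;
  max_tokens?: number;
  stream?: boolean;
}

// Fill in the documented defaults without overriding explicit values.
function withDefaults(req: LLMRequest): LLMRequest {
  return { model: "auto", temperature: 0.7, ...req };
}
```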

LLMResponse

interface LLMResponse {
  success: boolean;
  response: string;
  model: string;
  provider: string;
  usage: {
    input_tokens: number;
    output_tokens: number;
    total_tokens: number;
  };
  cost: number;
  latency_ms: number;
  routing?: {
    is_auto_routed: boolean;
    model_chosen: string;
    confidence: number;
    reason?: string;
  };
}
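For logging, the usage, cost, and routing fields above can be condensed into a one-line summary. The field names follow `LLMResponse`; the formatting itself is purely illustrative.

```typescript
interface LLMResponse {
  success: boolean;
  response: string;
  model: string;
  provider: string;
  usage: { input_tokens: number; output_tokens: number; total_tokens: number };
  cost: number;
  latency_ms: number;
  routing?: {
    is_auto_routed: boolean;
    model_chosen: string;
    confidence: number;
    reason?: string;
  };
}

// One-line log summary: model, optional routing note, tokens, cost, latency.
function summarize(res: LLMResponse): string {
  const route = res.routing?.is_auto_routed
    ? ` (auto-routed to ${res.routing.model_chosen})`
    : "";
  return `${res.model}${route}: ${res.usage.total_tokens} tokens, $${res.cost.toFixed(4)}, ${res.latency_ms}ms`;
}
```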

Rate Limits

See Rate Limits for per-plan limits and headers.