# API Overview

Conventions, base URLs, and common patterns for all Zihin API endpoints.
## Base URLs
| Service | Base URL | Description |
|---|---|---|
| LLM API | https://llm.zihin.ai | LLM calls, models, agents, telemetry |
| Tenant API | https://tenant-api.zihin.ai | Team management, roles, invites |
## Authentication

All requests must be authenticated with either an API key or a JWT. See Authentication for details.
```bash
# API Key
curl -H "X-Api-Key: YOUR_API_KEY" https://llm.zihin.ai/api/v3/llm/public/call

# JWT (multi-tenant)
curl -H "Authorization: Bearer <jwt>" -H "x-tenant-id: <uuid>" https://llm.zihin.ai/api/v3/llm/call
```
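The two schemes differ only in the headers they send. As a sketch, a small TypeScript helper (the `Auth` type and `authHeaders` function are illustrative, not part of an official Zihin SDK) can produce the correct header set for either scheme:

```typescript
// Either an API key, or a JWT plus the tenant it targets.
type Auth =
  | { kind: "apiKey"; key: string }
  | { kind: "jwt"; token: string; tenantId: string };

// Build the authentication headers for a request.
function authHeaders(auth: Auth): Record<string, string> {
  if (auth.kind === "apiKey") {
    return { "X-Api-Key": auth.key };
  }
  return {
    Authorization: `Bearer ${auth.token}`,
    "x-tenant-id": auth.tenantId,
  };
}
```

Spread the returned object into your HTTP client's headers alongside `Content-Type`.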
## Request Format

All POST and PUT requests send a JSON-encoded body:
```bash
curl -X POST https://llm.zihin.ai/api/v3/llm/public/call \
  -H "X-Api-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"query": "Hello", "model": "auto"}'
```
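The same request can be built programmatically. A minimal sketch in TypeScript, assuming a runtime with global `fetch` (Node 18+); `buildCallRequest` is an illustrative helper, not part of an official SDK:

```typescript
// Minimal shape of the call body used in the curl example above.
interface LLMCallBody {
  query: string;
  model?: string;
}

// Assemble the URL and fetch options for a public LLM call.
function buildCallRequest(apiKey: string, body: LLMCallBody) {
  return {
    url: "https://llm.zihin.ai/api/v3/llm/public/call",
    init: {
      method: "POST",
      headers: {
        "X-Api-Key": apiKey,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(body),
    },
  };
}
```

Usage: `const { url, init } = buildCallRequest(key, { query: "Hello", model: "auto" }); const res = await fetch(url, init);`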
## Response Format
All responses follow a standard structure:
```jsonc
// Success
{
  success: true,
  // ... response data
}

// Error
{
  error: string,
  message: string,
  status: "error",
  details?: object
}
```
## Pagination
List endpoints support pagination:
```http
GET /api/agents?page=1&limit=20
```

```jsonc
{
  data: [...],
  pagination: {
    page: number,
    limit: number,
    total: number,
    pages: number
  }
}
```
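The `pagination` object tells a client when to stop: there are more results while `page < pages`. A minimal sketch of a next-page helper (`nextPageUrl` is illustrative, not part of any SDK):

```typescript
// Pagination metadata, as documented above.
interface Pagination {
  page: number;
  limit: number;
  total: number;
  pages: number;
}

// Return the URL of the next page, or null on the last page.
function nextPageUrl(base: string, p: Pagination): string | null {
  if (p.page >= p.pages) return null;
  return `${base}?page=${p.page + 1}&limit=${p.limit}`;
}
```

Looping with this helper until it returns `null` walks every page of a list endpoint.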
## Common Schemas

### LLMRequest
```typescript
interface LLMRequest {
  query: string;          // Required: the prompt
  model?: string;         // Model ID or "auto" (default: "auto")
  provider?: string;      // Provider name (auto-detected from model)
  messages?: Message[];   // Conversation history
  temperature?: number;   // 0-2, default 0.7
  max_tokens?: number;    // Max response tokens
  stream?: boolean;       // Enable streaming
}
```
### LLMResponse
```typescript
interface LLMResponse {
  success: boolean;
  response: string;
  model: string;
  provider: string;
  usage: {
    input_tokens: number;
    output_tokens: number;
    total_tokens: number;
  };
  cost: number;
  latency_ms: number;
  routing?: {
    is_auto_routed: boolean;
    model_chosen: string;
    confidence: number;
    reason?: string;
  };
}
```
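As a sketch of consuming this schema, a small helper that condenses the usage and routing fields into a log line (`usageSummary` is a hypothetical name, not an SDK export; field names follow the interface above):

```typescript
// Response shape, as documented above (routing omitted for brevity).
interface LLMResponse {
  success: boolean;
  response: string;
  model: string;
  provider: string;
  usage: {
    input_tokens: number;
    output_tokens: number;
    total_tokens: number;
  };
  cost: number;
  latency_ms: number;
}

// One-line summary suitable for request logging.
function usageSummary(res: LLMResponse): string {
  return `${res.model} via ${res.provider}: ` +
    `${res.usage.total_tokens} tokens, $${res.cost.toFixed(4)}, ${res.latency_ms} ms`;
}
```

This keeps per-call cost and latency visible without logging the full response body.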
## Rate Limits
See Rate Limits for per-plan limits and headers.