Model Catalog
The LLM API supports 35+ models from 4 providers, managed dynamically via the llm_models catalog.
Providers
| Provider | Main Models |
|---|---|
| OpenAI | GPT-4.1, GPT-4.1-mini, GPT-4.1-nano, o3, o3-mini, o4-mini |
| Anthropic | Claude Opus 4.6, Claude Sonnet 4.5, Claude Haiku 4.5 |
| Google | Gemini 2.5 Pro, Gemini 2.5 Flash, Gemini 2.5 Flash Lite |
| Grok | Grok 3, Grok 3 Mini, Grok 4 Fast, Grok Code Fast |
Model ID Format
Models use the provider.model format:
openai.gpt-4.1-nano
anthropic.claude-sonnet-4-5-20250929
google.gemini-2.5-flash
grok.grok-3-mini
Use auto to let the system select the best model for your task (see Auto-Routing).
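A minimal sketch of parsing these IDs client-side. The key detail is that only the first dot separates provider from model, because model names themselves may contain dots (e.g. `gpt-4.1-nano`); the helper name `parse_model_id` is illustrative, not part of the API.

```python
def parse_model_id(model_id: str) -> tuple[str, str]:
    """Split a 'provider.model' ID into (provider, model).

    Splits on the FIRST dot only, since model names such as
    'gpt-4.1-nano' contain dots of their own.
    """
    provider, _, model = model_id.partition(".")
    if not provider or not model:
        raise ValueError(f"invalid model ID: {model_id!r}")
    return provider, model

# parse_model_id("openai.gpt-4.1-nano") -> ("openai", "gpt-4.1-nano")
```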
Model Tier Access by Plan
Models are grouped into tiers. Your subscription plan determines which tiers you can access:
| Plan | Allowed Tiers | Example Models |
|---|---|---|
| free | economical | GPT-4.1-nano, Haiku 4.5, Gemini 2.5 Flash Lite |
| basic | economical, premium | + GPT-4.1, Sonnet 4.5, Gemini 2.5 Flash |
| pro | economical, premium, flagship | + Claude Opus 4.6, o3, Gemini 2.5 Pro |
| enterprise | all | + legacy models |
Requesting a model outside your plan's allowed tiers returns HTTP 403:
{
"error": "MODEL_TIER_RESTRICTED",
"message": "Model 'claude-opus-4-6' (tier: flagship) is not available for the 'basic' plan.",
"details": {
"model_tier": "flagship",
"plan_code": "basic",
"allowed_tiers": ["economical", "premium"]
}
}
Use model: "auto" to automatically select the best model within your plan's allowed tiers.
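A client can pre-check tier access before sending a request and avoid the 403 round-trip. This is a sketch that mirrors the plan table above; the `PLAN_TIERS` mapping and function name are assumptions for illustration, not an API surface.

```python
# Mirrors the "Model Tier Access by Plan" table above.
PLAN_TIERS = {
    "free": {"economical"},
    "basic": {"economical", "premium"},
    "pro": {"economical", "premium", "flagship"},
}

def is_model_allowed(model_tier: str, plan_code: str) -> bool:
    """Return True if a model's tier is available to the given plan."""
    if plan_code == "enterprise":
        # enterprise unlocks all tiers, including legacy models
        return True
    return model_tier in PLAN_TIERS.get(plan_code, set())
```

For example, `is_model_allowed("flagship", "basic")` is `False`, matching the `MODEL_TIER_RESTRICTED` error shown above.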
Feature Support by Model
Not all models support every feature. The tools_support field in the catalog tracks per-model capabilities:
| Feature | Description | Unsupported Models |
|---|---|---|
| function_calling | Tool/function calling | None (all active models) |
| structured_output | JSON Schema response | Anthropic models |
| stop_sequences | Custom stop sequences | Reasoning models (see below) |
Stop Sequences Compatibility
Reasoning models do not support the stop parameter. For these models the API silently drops the parameter rather than returning an error.
| Provider | Models WITHOUT stop support | Reason |
|---|---|---|
| OpenAI | o1, o3, o3-mini, o4-mini, gpt-5, gpt-5-mini, gpt-5-nano | Reasoning models reject stop |
| Grok | grok-3-mini, grok-4-0709, grok-4-fast-reasoning, grok-4-1-fast-reasoning | xAI returns 400 for reasoning models |
| Anthropic | (none) | All models support stop |
| Google | (none) | All models support stop |
Use the tools_support.stop_sequences field from the model catalog to check programmatically:
SELECT model_id, tools_support->>'stop_sequences' as stop_support
FROM llm_models WHERE is_active = true;
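On the client side, the same flag can be used to strip the stop parameter before sending a request, which makes the API's silent-skip behavior explicit. A sketch, assuming a catalog row shaped like the query result above (with `tools_support` as a JSON object of boolean feature flags); the helper names are hypothetical.

```python
def supports_stop(catalog_entry: dict) -> bool:
    """Read the per-model stop-sequence flag from a catalog row."""
    return bool(catalog_entry.get("tools_support", {}).get("stop_sequences"))

def prepare_payload(payload: dict, catalog_entry: dict) -> dict:
    """Drop 'stop' client-side for models that do not support it,
    instead of relying on the API to skip it silently."""
    if "stop" in payload and not supports_stop(catalog_entry):
        payload = {k: v for k, v in payload.items() if k != "stop"}
    return payload
```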
GET /api/llm/models
List all available models.
Authentication: Not required
Cache: 5 minutes
Response:
{
"success": true,
"count": 38,
"models": [
{
"id": "openai.o3-mini",
"name": "o3-mini",
"provider": "openai",
"description": "Fast and cost-effective reasoning model",
"tier": "premium",
"context": "200000",
"capabilities": ["reasoning", "code", "math"],
"performance": {
"latency": 7,
"quality": 8,
"cost": 7
},
"auto_routing_enabled": true
}
]
}
Model Fields:
| Field | Type | Description |
|---|---|---|
| id | string | Full model ID (provider.model) |
| name | string | Model name |
| provider | string | Provider name |
| description | string | Short description |
| tier | string | Model tier: economical, balanced, premium, flagship |
| context | string | Context window size (tokens) |
| capabilities | array | Model capabilities (reasoning, code, vision, etc.) |
| performance | object | Performance scores (1-10 scale) |
| auto_routing_enabled | boolean | Whether the model participates in auto-routing |
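The `capabilities` array makes it easy to filter the catalog client-side. A small sketch over the response shape documented above; the function name is illustrative.

```python
def models_with_capability(response: dict, capability: str) -> list[str]:
    """Return the IDs of catalog models advertising a given capability,
    given a GET /api/llm/models response body."""
    return [
        m["id"]
        for m in response.get("models", [])
        if capability in m.get("capabilities", [])
    ]
```

With the example response above, `models_with_capability(resp, "reasoning")` returns `["openai.o3-mini"]`.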
GET /api/v3/llm/models/stats
Get global usage statistics per model.
Authentication: Not required
Cache: 15 minutes
Query Parameters:
| Param | Type | Default | Description |
|---|---|---|---|
| days | integer | 30 | Number of days to analyze (1-365) |
Response:
{
"success": true,
"period_days": 30,
"generated_at": "2026-01-18T12:00:00.000Z",
"data": [
{
"provider": "openai",
"model": "gpt-4o",
"total_tokens": 12500000,
"total_requests": 5000,
"avg_latency_ms": 450.25,
"last_used": "2026-01-18T11:30:00.000Z"
}
],
"count": 15
}
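One common use of this endpoint is deriving per-model averages, such as tokens per request, from the aggregate counters. A sketch over the stats response shape shown above; the helper name is an assumption.

```python
def tokens_per_request(stats: dict) -> dict[str, float]:
    """Compute average tokens per request for each model in a
    GET /api/v3/llm/models/stats response body."""
    out = {}
    for row in stats.get("data", []):
        if row.get("total_requests"):  # skip models with zero requests
            key = f'{row["provider"]}.{row["model"]}'
            out[key] = row["total_tokens"] / row["total_requests"]
    return out
```

With the example response above this yields `{"openai.gpt-4o": 2500.0}` (12,500,000 tokens over 5,000 requests).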
GET /api/llm/provider/:provider
Get information about a specific provider.
Authentication: Not required
Cache: 15 minutes
Parameters:
| Param | Type | Description |
|---|---|---|
| provider | string | openai, anthropic, google, grok |
Response:
{
"success": true,
"provider": "openai",
"supportedModels": [
"gpt-4.1", "gpt-4.1-mini", "gpt-4.1-nano",
"gpt-4o", "gpt-4o-mini", "o3", "o3-mini", "o4-mini"
],
"modelCount": 8
}