Auto-Routing
When you set model: "auto", the system selects the best model for the task based on its complexity, the capabilities it requires, and cost.
How It Works
- Analyze the request context (context.task, message content, multimodal attachments)
- Select the optimal model tier (economical, balanced, premium, flagship)
- Choose the best available model within that tier
- Apply fallback chain if the primary model fails
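The four steps above can be sketched roughly as follows. This is a minimal illustration, not the service's actual implementation: the `CATALOG` model IDs, the `route` function, and the `call_model` callback are hypothetical, and the analysis step only looks at `context.task`. The task-to-tier mapping is copied from the Task Types table.

```python
from typing import Callable

# Recommended tier per task, copied from the Task Types table.
TASK_TIER = {
    "chat_general": "economical",
    "write": "balanced",
    "rewrite_edit": "economical",
    "extract_structure": "economical",
    "reasoning_analysis": "flagship",
    "code": "premium",
    "data_query": "premium",
    "research": "premium",
}

# Hypothetical catalog: candidate model IDs per tier, in preference order.
CATALOG = {
    "economical": ["example.small-a", "example.small-b"],
    "balanced": ["example.medium-a"],
    "premium": ["example.large-a", "example.large-b"],
    "flagship": ["example.flagship-a"],
}


def route(request: dict, call_model: Callable[[str, dict], str]) -> dict:
    # Step 1: analyze the request context (only context.task in this sketch).
    task = request["context"]["task"]
    # Step 2: select the tier recommended for that task.
    tier = TASK_TIER[task]
    # Steps 3 and 4: try the preferred model, then walk the fallback chain.
    failures = []
    for model_id in CATALOG[tier]:
        try:
            output = call_model(model_id, request)
        except RuntimeError as exc:  # stand-in for a provider failure
            failures.append(f"{model_id}: {exc}")
            continue
        return {"model_chosen": model_id, "tier": tier, "output": output}
    raise RuntimeError("all candidates failed: " + "; ".join(failures))
```

In this sketch the fallback chain stays inside the selected tier; whether the real router also falls back across tiers is not stated in this section.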
Task Types
Specify context.task in your request to optimize model selection:
| Value | Description | Recommended Tier |
|---|---|---|
| chat_general | General conversations, explanations | economical |
| write | Content creation (email, post, proposal) | balanced |
| rewrite_edit | Revision, tone adjustment | economical |
| extract_structure | Data extraction, JSON/table formatting | economical |
| reasoning_analysis | Complex analysis, multi-step reasoning | flagship |
| code | Code generation, debugging, refactoring | premium |
| data_query | SQL queries, reports, dashboards | premium |
| research | Research and synthesis | premium |
Validation: Invalid task types return HTTP 400 with the list of valid values.
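To avoid a round trip that ends in HTTP 400, a client can check the task value before sending. The sketch below mirrors the documented behavior locally; the `validate_task` helper and the shape of the error body (`error`, `valid_values`) are assumptions, since this section does not show the actual 400 payload.

```python
# Valid context.task values, copied from the Task Types table.
VALID_TASKS = (
    "chat_general",
    "write",
    "rewrite_edit",
    "extract_structure",
    "reasoning_analysis",
    "code",
    "data_query",
    "research",
)


def validate_task(task: str) -> tuple[int, dict]:
    """Return a (status, body) pair imitating the documented validation."""
    if task in VALID_TASKS:
        return 200, {"task": task}
    # Hypothetical error body: the real field names may differ.
    return 400, {
        "error": f"invalid task type: {task!r}",
        "valid_values": list(VALID_TASKS),
    }


status, body = validate_task("summarize")
print(status)  # 400
```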
Model Tiers
| Tier | Use Case | Typical Models |
|---|---|---|
| economical | Simple tasks, low latency | Haiku 4.5, GPT-4.1-nano, Gemini 2.5 Flash Lite |
| balanced | General purpose | Sonnet 4.5, GPT-4.1-mini |
| premium | High quality | GPT-4.1, Gemini 2.5 Flash, o3-mini |
| flagship | Maximum capability | Claude Opus 4.6, o3, Gemini 2.5 Pro |
Tier Eligibility by Plan
Auto-routing only selects models within the tiers allowed by your subscription plan:
| Plan | Allowed Tiers |
|---|---|
| free | economical |
| basic | economical, premium |
| pro | economical, premium, flagship |
| enterprise | all |
If your plan restricts certain tiers, auto-routing will only consider eligible models. For example, a basic plan using model: "auto" will never select a flagship model.
To access higher-tier models, upgrade your subscription plan.
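The eligibility rule can be illustrated with a small lookup. The plan table is copied from above; the `eligible_tier` helper is hypothetical, and so is its downgrade behavior. This section only states that ineligible tiers are never selected, so stepping down to the highest allowed tier below the recommended one is an assumption made for the sake of the example.

```python
# Tiers from lowest to highest, as listed in the Model Tiers table.
TIER_ORDER = ["economical", "balanced", "premium", "flagship"]

# Allowed tiers per plan, copied from the Tier Eligibility table.
PLAN_TIERS = {
    "free": ["economical"],
    "basic": ["economical", "premium"],
    "pro": ["economical", "premium", "flagship"],
    "enterprise": list(TIER_ORDER),
}


def eligible_tier(plan: str, recommended: str) -> str:
    """Return the recommended tier if the plan allows it.

    Otherwise fall back to the highest allowed tier below it
    (assumed behavior, not confirmed by the documentation).
    """
    allowed = PLAN_TIERS[plan]
    if recommended in allowed:
        return recommended
    rank = TIER_ORDER.index(recommended)
    lower = [tier for tier in allowed if TIER_ORDER.index(tier) < rank]
    # Every plan includes "economical", so `lower` is never empty here.
    return max(lower, key=TIER_ORDER.index)


print(eligible_tier("basic", "flagship"))  # premium
```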
Routing Response
When auto-routing is used, the response includes routing metadata:
{
  "routing": {
    "is_auto_routed": true,
    "model_chosen": "openai.gpt-4.1-nano",
    "confidence": 0.85
  }
}
| Field | Description |
|---|---|
| is_auto_routed | Whether auto-routing was used |
| model_chosen | Full model ID selected |
| confidence | Routing confidence score (0-1) |
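A client might read these fields as shown below. The `describe_routing` helper is hypothetical. Two assumptions are worth flagging: that the `routing` object is absent or has `is_auto_routed` set to false when a model was named explicitly, and that the model ID has the form `provider.model`, which is inferred from the example ID above.

```python
import json


def describe_routing(response_body: str) -> str:
    """Summarize the routing metadata of a JSON response body."""
    routing = json.loads(response_body).get("routing") or {}
    if not routing.get("is_auto_routed"):
        return "model was chosen explicitly"
    # Assumed ID format "provider.model": split on the first dot only.
    provider, _, model = routing["model_chosen"].partition(".")
    confidence = routing["confidence"]
    return f"auto-routed to {model} ({provider}), confidence {confidence:.0%}"


sample = json.dumps({
    "routing": {
        "is_auto_routed": True,
        "model_chosen": "openai.gpt-4.1-nano",
        "confidence": 0.85,
    }
})
print(describe_routing(sample))
# auto-routed to gpt-4.1-nano (openai), confidence 85%
```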
Multimodal Auto-Routing
When the request contains images or other multimodal content, the router automatically filters to vision-capable models:
{
  "model": "auto",
  "messages": [{
    "role": "user",
    "content": [
      {"type": "text", "text": "Describe this image."},
      {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}}
    ]
  }],
  "context": { "task": "extract_structure" }
}
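The filtering step might look something like the sketch below. It only detects `image_url` content parts, the one multimodal type shown in the request above. The `needs_vision` and `vision_candidates` helpers, the `vision` flag, and the model IDs are hypothetical; how the router actually records model capabilities is not described here.

```python
def needs_vision(messages: list[dict]) -> bool:
    """Return True if any message carries an image_url content part."""
    for message in messages:
        content = message.get("content")
        # Plain-text messages use a string; multimodal ones use a list of parts.
        if isinstance(content, list):
            if any(part.get("type") == "image_url" for part in content):
                return True
    return False


def vision_candidates(models: list[dict], messages: list[dict]) -> list[dict]:
    """Keep only vision-capable models when the request contains images."""
    if not needs_vision(messages):
        return models
    return [model for model in models if model.get("vision")]


# Hypothetical candidates with an assumed "vision" capability flag.
models = [
    {"id": "example.text-only", "vision": False},
    {"id": "example.multimodal", "vision": True},
]
messages = [{
    "role": "user",
    "content": [
        {"type": "text", "text": "Describe this image."},
        {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}},
    ],
}]
print([m["id"] for m in vision_candidates(models, messages)])
# ['example.multimodal']
```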
GET /api/llm/recommendations/:task
Get model recommendations for a specific task type.
Authentication: Not required
Cache: 10 minutes
Parameters:
| Param | Type | Description |
|---|---|---|
| task | string | Task type (see the Task Types table) |
Response:
{
  "success": true,
  "task": "code",
  "recommendations": [
    {
      "id": "grok.grok-4-1-fast-reasoning",
      "name": "Grok 4.1 Fast Reasoning",
      "provider": "grok",
      "tier": "flagship",
      "context": "2000000"
    }
  ],
  "count": 5
}
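A small client for this endpoint could look like the following sketch. The base URL `https://api.example.com` is a placeholder, and the helper names are hypothetical. The documented 10-minute cache is presumably applied server-side; reusing the same lifetime on the client is a design choice for this example, not documented behavior. The `fetch` and `now` parameters are injectable so the logic can be exercised without network access.

```python
import json
import time
import urllib.parse
import urllib.request

CACHE_TTL_SECONDS = 600  # same lifetime as the documented 10-minute cache
_cache: dict[str, tuple[float, list]] = {}


def recommendations_url(base_url: str, task: str) -> str:
    """Build the endpoint URL, escaping the task path segment."""
    segment = urllib.parse.quote(task, safe="")
    return f"{base_url.rstrip('/')}/api/llm/recommendations/{segment}"


def _http_get(url: str) -> dict:
    # No auth header: the endpoint is documented as not requiring authentication.
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.load(response)


def get_recommendations(base_url, task, fetch=_http_get, now=time.monotonic):
    """Return the recommendations list, reusing a fresh cached copy if any."""
    url = recommendations_url(base_url, task)
    cached = _cache.get(url)
    if cached and now() - cached[0] < CACHE_TTL_SECONDS:
        return cached[1]
    body = fetch(url)
    if not body.get("success"):
        raise RuntimeError(f"recommendations request failed for task {task!r}")
    _cache[url] = (now(), body["recommendations"])
    return body["recommendations"]
```

Typical usage would be `get_recommendations("https://api.example.com", "code")`, then reading the `id` and `tier` of each returned entry.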