# Auto-Routing

Automatic model selection based on task complexity.
## How It Works

When you set `"model": "auto"`, the system analyzes your prompt and selects the optimal model based on:

- **Task Type** - Code, analysis, creative, chat
- **Complexity** - Simple vs. complex reasoning
- **Cost Efficiency** - Best value for the task
- **Response Quality** - Match quality to requirements
## Usage

```json
{
  "query": "What is 2 + 2?",
  "model": "auto"
}
```
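For example, with Python's `requests`. The endpoint path and auth header below are assumptions for illustration only; substitute your deployment's actual query endpoint and credentials:

```python
import requests

# NOTE: endpoint path and auth header are assumed for illustration;
# replace them with your deployment's actual query endpoint and key.
API_URL = "https://llm.zihin.ai/api/llm/query"
HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}

resp = requests.post(
    API_URL,
    headers=HEADERS,
    json={"query": "What is 2 + 2?", "model": "auto"},
)
resp.raise_for_status()
print(resp.json())
```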
## Response

The response includes routing information:

```json
{
  "success": true,
  "response": "4",
  "model": "gpt-4o-mini",
  "provider": "openai",
  "routing": {
    "is_auto_routed": true,
    "model_chosen": "openai.gpt-4o-mini",
    "confidence": 0.92,
    "reason": "simple_query"
  }
}
```
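The `routing` object is handy for logging which model was picked and why. A minimal sketch, assuming `resp` is the response from the Usage example above:

```python
data = resp.json()
routing = data.get("routing") or {}

# Log the routing decision for observability.
if routing.get("is_auto_routed"):
    print(
        f"auto-routed to {routing['model_chosen']} "
        f"(confidence={routing['confidence']}, reason={routing['reason']})"
    )
```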
## Routing Logic
| Task Type | Simple | Complex |
|---|---|---|
| Code | gpt-4.1-mini | grok-4-1-fast-reasoning |
| Analysis | claude-3-haiku | claude-opus-4-5 |
| Creative | gpt-4.1 | claude-sonnet-4-5 |
| Chat | gpt-4o-mini | gpt-4.1 |
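The table above maps directly onto a simple lookup. The sketch below is an illustration of that mapping, not the service's actual routing code; in practice the classification of task type and complexity happens server-side:

```python
# Illustrative only: a table-driven picture of the routing rules above.
# The real service classifies task type and complexity server-side.
ROUTING_TABLE = {
    ("code", "simple"): "gpt-4.1-mini",
    ("code", "complex"): "grok-4-1-fast-reasoning",
    ("analysis", "simple"): "claude-3-haiku",
    ("analysis", "complex"): "claude-opus-4-5",
    ("creative", "simple"): "gpt-4.1",
    ("creative", "complex"): "claude-sonnet-4-5",
    ("chat", "simple"): "gpt-4o-mini",
    ("chat", "complex"): "gpt-4.1",
}

def pick_model(task_type: str, complexity: str) -> str:
    # Fall back to a general-purpose chat model for unknown combinations.
    return ROUTING_TABLE.get((task_type, complexity), "gpt-4o-mini")
```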
## Task Recommendations

Get model recommendations for a specific task:

```bash
curl https://llm.zihin.ai/api/llm/recommendations/code_generation
```
Response:

```json
{
  "success": true,
  "task": "code_generation",
  "recommendations": [
    {
      "id": "grok.grok-4-1-fast-reasoning",
      "name": "Grok 4.1 Fast Reasoning",
      "provider": "grok",
      "tier": "flagship"
    },
    {
      "id": "openai.gpt-4.1",
      "name": "GPT-4.1",
      "provider": "openai",
      "tier": "standard"
    }
  ]
}
```
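Assuming the list is ordered best-first (the flagship-tier model appears first in the example above), a client can take the first entry as its default. A minimal sketch against the documented endpoint:

```python
import requests

task = "code_generation"
resp = requests.get(f"https://llm.zihin.ai/api/llm/recommendations/{task}")
resp.raise_for_status()

recs = resp.json()["recommendations"]
if recs:
    top = recs[0]
    print(f"Top pick for {task}: {top['name']} ({top['id']}, tier={top['tier']})")
```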
## Task Types

| Task | Description |
|---|---|
| code_generation | Writing code |
| summarization | Condensing text |
| translation | Language translation |
| analysis | Data/text analysis |
| creative | Creative writing |
| chat | Conversational |
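Assuming each task value plugs into the same endpoint pattern shown above, you could fetch the suggested models for every task in one pass:

```python
import requests

TASKS = ["code_generation", "summarization", "translation",
         "analysis", "creative", "chat"]

for task in TASKS:
    resp = requests.get(f"https://llm.zihin.ai/api/llm/recommendations/{task}")
    resp.raise_for_status()
    recs = resp.json()["recommendations"]
    ids = ", ".join(r["id"] for r in recs)
    print(f"{task}: {ids}")
```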
## Confidence Scores

The routing confidence indicates how certain the system is about its model choice:

| Score | Meaning |
|---|---|
| 0.9+ | High confidence |
| 0.7-0.9 | Good match |
| 0.5-0.7 | Acceptable |
| < 0.5 | Consider specifying a model |
## Override Auto-Routing

To bypass auto-routing, specify a model directly:

```json
{
  "query": "Complex task here",
  "model": "claude-opus-4-5-20250514",
  "provider": "anthropic"
}
```
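Combining this with the confidence scores above, a client can let auto-routing run first and only retry with an explicit model when confidence is low. A minimal sketch, reusing the assumed endpoint and placeholder credentials from the Usage example:

```python
import requests

API_URL = "https://llm.zihin.ai/api/llm/query"  # assumed endpoint, as above
HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}  # placeholder credentials

payload = {"query": "Complex task here", "model": "auto"}
resp = requests.post(API_URL, headers=HEADERS, json=payload)
resp.raise_for_status()
data = resp.json()

confidence = data.get("routing", {}).get("confidence", 1.0)
if confidence < 0.5:
    # Low routing confidence: bypass auto-routing with an explicit model.
    payload = {
        "query": "Complex task here",
        "model": "claude-opus-4-5-20250514",
        "provider": "anthropic",
    }
    resp = requests.post(API_URL, headers=HEADERS, json=payload)
    resp.raise_for_status()
    data = resp.json()

print(data["response"])
```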