Auto-Routing

When you set model: "auto", the system selects a model for each request based on task complexity, required capabilities, and cost.

How It Works

  1. Analyze the request context (context.task, message content, multimodal attachments)
  2. Select the optimal model tier (economical, balanced, premium, flagship)
  3. Choose the best available model within that tier
  4. Apply fallback chain if the primary model fails
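
The four steps can be pictured with a small Python sketch. It is an illustration only, not the service's routing logic, which this page does not describe. The task-to-tier mapping is copied from the Task Types table; `TIER_MODELS`, `route`, `flaky_backend`, and the `example.*` model IDs are hypothetical, and step 1 is reduced to reading the task value.

```python
from typing import Callable, Dict, List, Tuple

# Recommended tier per task, copied from the Task Types table.
TASK_TIER: Dict[str, str] = {
    "chat_general": "economical",
    "write": "balanced",
    "rewrite_edit": "economical",
    "extract_structure": "economical",
    "reasoning_analysis": "flagship",
    "code": "premium",
    "data_query": "premium",
    "research": "premium",
}

# Hypothetical candidate lists, best model first. These IDs are placeholders,
# not real model IDs.
TIER_MODELS: Dict[str, List[str]] = {
    tier: [f"example.{tier}-primary", f"example.{tier}-backup"]
    for tier in ("economical", "balanced", "premium", "flagship")
}


def route(task: str, call_model: Callable[[str], str]) -> Tuple[str, str]:
    """Return (model_id, output) from the first candidate that succeeds."""
    tier = TASK_TIER[task]                 # steps 1-2: request context -> tier
    failures = []
    for model_id in TIER_MODELS[tier]:     # step 3: try the preferred model first
        try:
            return model_id, call_model(model_id)
        except RuntimeError as exc:        # step 4: fall back to the next candidate
            failures.append(f"{model_id}: {exc}")
    raise RuntimeError("all candidates failed: " + "; ".join(failures))


def flaky_backend(model_id: str) -> str:
    """Stand-in for a real model call: every primary model is 'down'."""
    if model_id.endswith("-primary"):
        raise RuntimeError("unavailable")
    return f"answer from {model_id}"


chosen, answer = route("code", flaky_backend)
print(chosen)  # example.premium-backup
```

Because the primary candidate fails, the call falls through to the backup in the same tier, which is the behavior step 4 describes.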

Task Types

Specify context.task in your request to optimize model selection:

| Value | Description | Recommended Tier |
| --- | --- | --- |
| chat_general | General conversations, explanations | economical |
| write | Content creation (email, post, proposal) | balanced |
| rewrite_edit | Revision, tone adjustment | economical |
| extract_structure | Data extraction, JSON/table formatting | economical |
| reasoning_analysis | Complex analysis, multi-step reasoning | flagship |
| code | Code generation, debugging, refactoring | premium |
| data_query | SQL queries, reports, dashboards | premium |
| research | Research and synthesis | premium |

Validation: Invalid task types return HTTP 400 with the list of valid values.
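
A client can catch an invalid task before the server answers with HTTP 400. The following is a minimal sketch: the task names come from the table above, while `build_auto_request` is a hypothetical helper and the plain-string `content` field is an assumption about the message format.

```python
import json

# Valid context.task values, copied from the Task Types table.
VALID_TASKS = frozenset({
    "chat_general", "write", "rewrite_edit", "extract_structure",
    "reasoning_analysis", "code", "data_query", "research",
})


def build_auto_request(prompt: str, task: str) -> dict:
    """Build an auto-routed request body, rejecting unknown task types early."""
    if task not in VALID_TASKS:
        raise ValueError(
            f"invalid task {task!r}; expected one of {sorted(VALID_TASKS)}"
        )
    return {
        "model": "auto",
        # Assumption: a plain string is accepted as message content.
        "messages": [{"role": "user", "content": prompt}],
        "context": {"task": task},
    }


payload = build_auto_request("Tighten the tone of this email.", "rewrite_edit")
print(json.dumps(payload, indent=2))
```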

Model Tiers

| Tier | Use Case | Typical Models |
| --- | --- | --- |
| economical | Simple tasks, low latency | Haiku 4.5, GPT-4.1-nano, Gemini 2.5 Flash Lite |
| balanced | General purpose | Sonnet 4.5, GPT-4.1-mini |
| premium | High quality | GPT-4.1, Gemini 2.5 Flash, o3-mini |
| flagship | Maximum capability | Claude Opus 4.6, o3, Gemini 2.5 Pro |

Tier Eligibility by Plan

Auto-routing only selects models within the tiers allowed by your subscription plan:

| Plan | Allowed Tiers |
| --- | --- |
| free | economical |
| basic | economical, premium |
| pro | economical, premium, flagship |
| enterprise | all |

If your plan restricts certain tiers, auto-routing will only consider eligible models. For example, a basic plan using model: "auto" will never select a flagship model.
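
The plan restriction can be modeled as a set lookup. In the sketch below, the plan-to-tier sets are copied from the table above as printed. This page does not say what the router does when a task's recommended tier is not allowed, so the downgrade rule in the hypothetical `resolve_tier` (pick the highest allowed tier below the recommended one) is an assumption.

```python
# Tiers from lowest to highest, in the order the page lists them.
TIER_ORDER = ["economical", "balanced", "premium", "flagship"]

# Allowed tiers per plan, copied from the Tier Eligibility table.
PLAN_TIERS = {
    "free": {"economical"},
    "basic": {"economical", "premium"},
    "pro": {"economical", "premium", "flagship"},
    "enterprise": set(TIER_ORDER),  # "all"
}


def resolve_tier(plan: str, recommended: str) -> str:
    """Return the recommended tier if the plan allows it, else a lower one."""
    allowed = PLAN_TIERS[plan]
    if recommended in allowed:
        return recommended
    # Assumed rule: walk down from the recommended tier to the nearest
    # tier the plan allows.
    rank = TIER_ORDER.index(recommended)
    for tier in reversed(TIER_ORDER[:rank]):
        if tier in allowed:
            return tier
    raise ValueError(f"plan {plan!r} has no tier at or below {recommended!r}")


print(resolve_tier("basic", "flagship"))  # premium
print(resolve_tier("free", "premium"))    # economical
```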

Tip: To access higher-tier models, upgrade your subscription plan.

Routing Response

When auto-routing is used, the response includes routing metadata:

{
  "routing": {
    "is_auto_routed": true,
    "model_chosen": "openai.gpt-4.1-nano",
    "confidence": 0.85
  }
}

| Field | Description |
| --- | --- |
| is_auto_routed | Whether auto-routing was used |
| model_chosen | Full model ID selected |
| confidence | Routing confidence score (0-1) |
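
On the client side, the routing metadata can be read into a small typed structure. This sketch uses the sample response above; `RoutingInfo` and `parse_routing` are hypothetical names, and splitting `model_chosen` at the first dot into provider and model is an inference from the example IDs on this page, not a documented format.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class RoutingInfo:
    is_auto_routed: bool
    model_chosen: str
    confidence: float

    @property
    def provider(self) -> str:
        # Assumption: IDs look like "<provider>.<model>", as in the samples.
        return self.model_chosen.partition(".")[0]


def parse_routing(body: dict) -> Optional[RoutingInfo]:
    """Return routing metadata, or None if the response was not auto-routed."""
    routing = body.get("routing")
    if not routing or not routing.get("is_auto_routed"):
        return None
    return RoutingInfo(
        is_auto_routed=True,
        model_chosen=routing["model_chosen"],
        confidence=float(routing["confidence"]),
    )


sample = json.loads("""
{
  "routing": {
    "is_auto_routed": true,
    "model_chosen": "openai.gpt-4.1-nano",
    "confidence": 0.85
  }
}
""")
info = parse_routing(sample)
print(info.provider, info.model_chosen, info.confidence)
```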

Multimodal Auto-Routing

When the request contains images or other multimodal content, the router automatically filters to vision-capable models:

{
  "model": "auto",
  "messages": [{
    "role": "user",
    "content": [
      {"type": "text", "text": "Describe this image."},
      {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}}
    ]
  }],
  "context": { "task": "extract_structure" }
}
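
The request above can be assembled programmatically. In this sketch, `image_message` builds the same content parts as the example, and `needs_vision` is a guess at the kind of check that would trigger vision filtering. Both helper names are hypothetical, and the actual detection logic is not documented here.

```python
from typing import List


def image_message(text: str, image_url: str) -> dict:
    """Build a user message with one text part and one image part."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": text},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }


def needs_vision(messages: List[dict]) -> bool:
    """Assumed check: True if any message carries an image_url content part."""
    for message in messages:
        content = message.get("content")
        if isinstance(content, list) and any(
            part.get("type") == "image_url" for part in content
        ):
            return True
    return False


request = {
    "model": "auto",
    "messages": [
        image_message("Describe this image.", "https://example.com/photo.jpg")
    ],
    "context": {"task": "extract_structure"},
}
print(needs_vision(request["messages"]))  # True
```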

GET /api/llm/recommendations/:task

Get model recommendations for a specific task type.

Authentication: Not required

Cache: 10 minutes

Parameters:

| Param | Type | Description |
| --- | --- | --- |
| task | string | Task type (see table above) |

Response:

{
  "success": true,
  "task": "code_generation",
  "recommendations": [
    {
      "id": "grok.grok-4-1-fast-reasoning",
      "name": "Grok 4.1 Fast Reasoning",
      "provider": "grok",
      "tier": "flagship",
      "context": "2000000"
    }
  ],
  "count": 5
}
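
A minimal client for this endpoint might look like the sketch below, using only the Python standard library. `BASE_URL` is a placeholder host, and the helper names are hypothetical. No authentication header is sent because the endpoint does not require one. The sample returns `context` as a string; converting it to an integer is an assumption about its meaning.

```python
import json
import urllib.parse
import urllib.request
from typing import List

BASE_URL = "https://api.example.com"  # placeholder, replace with the real host


def recommendations_url(task: str, base_url: str = BASE_URL) -> str:
    """Build the GET URL, percent-encoding the task path segment."""
    return f"{base_url}/api/llm/recommendations/{urllib.parse.quote(task, safe='')}"


def extract_recommendations(body: dict) -> List[dict]:
    """Validate the response body and normalize each recommendation."""
    if not body.get("success"):
        raise RuntimeError("recommendations request was not successful")
    # Assumption: "context" is a numeric value sent as a string.
    return [{**rec, "context": int(rec["context"])} for rec in body["recommendations"]]


def fetch_recommendations(task: str, base_url: str = BASE_URL,
                          timeout: float = 10.0) -> List[dict]:
    """Call the endpoint and return the parsed recommendations."""
    with urllib.request.urlopen(recommendations_url(task, base_url),
                                timeout=timeout) as resp:
        return extract_recommendations(json.load(resp))


# Offline check against the sample response shown above.
sample_body = {
    "success": True,
    "task": "code_generation",
    "recommendations": [{
        "id": "grok.grok-4-1-fast-reasoning",
        "name": "Grok 4.1 Fast Reasoning",
        "provider": "grok",
        "tier": "flagship",
        "context": "2000000",
    }],
    "count": 5,
}
models = extract_recommendations(sample_body)
print(models[0]["id"], models[0]["context"])
```

Since responses are cached for 10 minutes, polling more often than that is unlikely to return different results.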