Auto-Routing

When you set model: "auto", the router selects a model for each request based on task complexity, required capabilities, and cost.

How It Works

1. Analyze the request context (context.task, message content, multimodal attachments).
2. Select the optimal model tier (economical, balanced, premium, flagship).
3. Choose the best available model within that tier.
4. Apply the fallback chain if the primary model fails.
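
The fallback step can be pictured with a minimal sketch. This is not the router's actual implementation: the function name, the exception class, and the model IDs `primary-model` and `backup-model` are all hypothetical, and the sketch assumes a fallback chain is simply an ordered list of models tried one after another.

```python
from typing import Callable, Sequence, Tuple


class AllModelsFailed(Exception):
    """Raised when every model in the chain has failed."""


def call_with_fallback(
    models: Sequence[str], call: Callable[[str], str]
) -> Tuple[str, str]:
    """Try each model in order and return (model_id, result) for the first success."""
    errors = []
    for model_id in models:
        try:
            return model_id, call(model_id)
        except Exception as exc:  # illustrative; a real client would catch narrower errors
            errors.append(f"{model_id}: {exc}")
    raise AllModelsFailed("; ".join(errors))


def flaky(model_id: str) -> str:
    """Stand-in for a model call: the hypothetical primary model is unavailable."""
    if model_id == "primary-model":
        raise RuntimeError("unavailable")
    return f"answered by {model_id}"


chosen, result = call_with_fallback(["primary-model", "backup-model"], flaky)
```

Here the first model raises, so the call falls through to the second entry in the chain.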

Task Types

Specify context.task in your request to optimize model selection:

Value	Description	Recommended Tier
`chat_general`	General conversations, explanations	economical
`write`	Content creation (email, post, proposal)	balanced
`rewrite_edit`	Revision, tone adjustment	economical
`extract_structure`	Data extraction, JSON/table formatting	economical
`reasoning_analysis`	Complex analysis, multi-step reasoning	flagship
`code`	Code generation, debugging, refactoring	premium
`data_query`	SQL queries, reports, dashboards	premium
`research`	Research and synthesis	premium

Validation: Invalid task types return HTTP 400 with the list of valid values.
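
Because an invalid task type is rejected with HTTP 400, it can be convenient to check the value before sending. The sketch below builds a request body with the task values from the table above. The helper name `build_auto_request` is hypothetical, and the message shape follows the multimodal example on this page.

```python
# Task values copied from the Task Types table.
VALID_TASKS = frozenset({
    "chat_general",
    "write",
    "rewrite_edit",
    "extract_structure",
    "reasoning_analysis",
    "code",
    "data_query",
    "research",
})


def build_auto_request(prompt: str, task: str) -> dict:
    """Return a request body for model "auto", rejecting unknown task types early."""
    if task not in VALID_TASKS:
        valid = ", ".join(sorted(VALID_TASKS))
        raise ValueError(f"invalid task {task!r}; valid values: {valid}")
    return {
        "model": "auto",
        "messages": [
            {"role": "user", "content": [{"type": "text", "text": prompt}]}
        ],
        "context": {"task": task},
    }


request_body = build_auto_request("Draft a follow-up email.", "write")
```

The client-side check only mirrors the server's validation; the server remains the source of truth for the list of valid values.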

Model Tiers

Tier	Use Case	Typical Models
`economical`	Simple tasks, low latency	Haiku 4.5, GPT-4.1-nano, Gemini 2.5 Flash Lite
`balanced`	General purpose	Sonnet 4.5, GPT-4.1-mini
`premium`	High quality	GPT-4.1, Gemini 2.5 Flash, o3-mini
`flagship`	Maximum capability	Claude Opus 4.6, o3, Gemini 2.5 Pro

Tier Eligibility by Plan

Auto-routing only selects models within the tiers allowed by your subscription plan:

Plan	Allowed Tiers
`free`	economical
`basic`	economical, premium
`pro`	economical, premium, flagship
`enterprise`	all

If your plan restricts certain tiers, auto-routing will only consider eligible models. For example, a basic plan using model: "auto" will never select a flagship model.

Tip: To access higher-tier models, upgrade your subscription plan.
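
The eligibility rule can be sketched as a filter over candidate models. The plan-to-tier mapping is copied from the table above, with "all" read as the four tiers listed under Model Tiers. The function name and the model IDs `example.small` and `example.large` are hypothetical, and the sketch does not model what the router does when a task's recommended tier is outside the plan.

```python
ALL_TIERS = ("economical", "balanced", "premium", "flagship")

# Copied from the Tier Eligibility by Plan table; "all" is expanded to every tier.
PLAN_TIERS = {
    "free": {"economical"},
    "basic": {"economical", "premium"},
    "pro": {"economical", "premium", "flagship"},
    "enterprise": set(ALL_TIERS),
}


def eligible_models(plan: str, models: list) -> list:
    """Keep only the (model_id, tier) pairs whose tier the plan allows."""
    allowed = PLAN_TIERS[plan]
    return [model_id for model_id, tier in models if tier in allowed]


# Hypothetical candidates, used only to show the filter.
candidates = [("example.small", "economical"), ("example.large", "flagship")]
basic_choice = eligible_models("basic", candidates)
pro_choice = eligible_models("pro", candidates)
```

With these inputs the basic plan keeps only the economical model, matching the statement that a basic plan never selects a flagship model.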

Routing Response

When auto-routing is used, the response includes routing metadata:

{
  "routing": {
    "is_auto_routed": true,
    "model_chosen": "openai.gpt-4.1-nano",
    "confidence": 0.85
  }
}

Field	Description
`is_auto_routed`	Whether auto-routing was used
`model_chosen`	Full model ID selected
`confidence`	Routing confidence score (0-1)
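
A client might read this metadata as in the sketch below. The `RoutingInfo` class and `parse_routing` function are hypothetical names, and the sketch assumes the `routing` object sits at the top level of the response body, as in the sample above.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class RoutingInfo:
    is_auto_routed: bool
    model_chosen: str
    confidence: float


def parse_routing(response_body: str) -> Optional[RoutingInfo]:
    """Extract routing metadata, or return None when the response has none."""
    routing = json.loads(response_body).get("routing")
    if routing is None:
        return None
    return RoutingInfo(
        is_auto_routed=bool(routing["is_auto_routed"]),
        model_chosen=routing["model_chosen"],
        confidence=float(routing["confidence"]),
    )


sample = """
{
  "routing": {
    "is_auto_routed": true,
    "model_chosen": "openai.gpt-4.1-nano",
    "confidence": 0.85
  }
}
"""
info = parse_routing(sample)
```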

Multimodal Auto-Routing

When the request contains images or other multimodal content, the router automatically filters to vision-capable models:

{
  "model": "auto",
  "messages": [{
    "role": "user",
    "content": [
      {"type": "text", "text": "Describe this image."},
      {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}}
    ]
  }],
  "context": { "task": "extract_structure" }
}
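
To anticipate when the vision filter applies, a client can check whether any message carries an image part. The helper name `has_image_content` is hypothetical, and the sketch recognizes only the `image_url` part type shown in the request above; the router may treat other content types as multimodal too.

```python
def has_image_content(messages: list) -> bool:
    """Return True if any message contains a content part of type "image_url"."""
    for message in messages:
        content = message.get("content")
        if not isinstance(content, list):
            continue  # plain string content carries no attachments
        for part in content:
            if isinstance(part, dict) and part.get("type") == "image_url":
                return True
    return False


image_request = [{
    "role": "user",
    "content": [
        {"type": "text", "text": "Describe this image."},
        {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}},
    ],
}]
text_request = [{"role": "user", "content": [{"type": "text", "text": "Hello"}]}]

needs_vision = has_image_content(image_request)
text_only = has_image_content(text_request)
```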

GET /api/llm/recommendations/:task

Get model recommendations for a specific task type.

Authentication: Not required

Cache: 10 minutes

Parameters:

Param	Type	Description
`task`	string	Task type (see table above)

Response:

{
  "success": true,
  "task": "code",
  "recommendations": [
    {
      "id": "grok.grok-4-1-fast-reasoning",
      "name": "Grok 4.1 Fast Reasoning",
      "provider": "grok",
      "tier": "flagship",
      "context": "2000000"
    }
  ],
  "count": 5
}
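
A minimal sketch of a client for this endpoint follows. The base URL `https://api.example.com` and both function names are hypothetical. The sketch builds the request URL and parses a response body, and leaves the HTTP call itself to whichever client library you already use.

```python
import json
from urllib.parse import quote


def recommendations_url(base_url: str, task: str) -> str:
    """Build the endpoint URL for a task type, percent-encoding the path segment."""
    return f"{base_url.rstrip('/')}/api/llm/recommendations/{quote(task, safe='')}"


def parse_recommendations(body: str) -> list:
    """Return (model_id, tier) pairs from a recommendations response body."""
    data = json.loads(body)
    if not data.get("success"):
        raise ValueError("recommendations request was not successful")
    return [(item["id"], item["tier"]) for item in data.get("recommendations", [])]


# Sample body shaped like the response above, trimmed to a single entry.
sample_body = """
{
  "success": true,
  "task": "code",
  "recommendations": [
    {
      "id": "grok.grok-4-1-fast-reasoning",
      "name": "Grok 4.1 Fast Reasoning",
      "provider": "grok",
      "tier": "flagship",
      "context": "2000000"
    }
  ],
  "count": 1
}
"""
url = recommendations_url("https://api.example.com/", "code")
pairs = parse_recommendations(sample_body)
```

Since responses are cached for 10 minutes, repeated calls within that window may return the same list.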
