Model Routing
Request tiers, not models. The router maps tier intent to concrete model IDs and resolves overrides in a fixed priority order.
Mental model
The model router maps abstract tier names to concrete model IDs. Code and skills request a tier, not a model. The router resolves the tier at the moment of the call, so you can swap models across the whole runtime by editing one config section — no callsite changes.
Four tiers cover the space of tasks: balanced for most conversation, smart for multi-step reasoning, coding for code generation and refactoring, deep for long-horizon analysis. Every request starts at one tier and can be overridden by the user, a skill, or the runtime itself.
Why a tier is an abstraction
A tier is an intent, not a capability floor. It describes what kind of work the caller thinks it is doing, not how powerful the model must be. This matters because it lets you change providers and model generations without touching any code that asks for coding — the intent is stable, only the binding moves.
This indirection is the reason skills declare tier: coding instead of a model ID. A skill authored in 2024 still works in 2027 without edits; the operator just updates the router.
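As an illustration, a skill's SKILL.md frontmatter might declare its tier like this (the name field and surrounding layout are hypothetical; only the tier: coding key comes from this page — see Skills for the real contract):

```yaml
---
name: refactor-helper   # hypothetical skill name
tier: coding            # request an intent tier, not a concrete model ID
---
```

Because the frontmatter names a tier rather than a model, the skill keeps working when the operator rebinds codingModel in the router config.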
The four tiers
- balanced (Default) — General chat and simple tasks. The default tier for turns that do not explicitly ask for something else.
- smart (Reasoning) — Multi-step analysis, planning, cross-file reasoning. Used when balanced would be underpowered for the task.
- coding (Code) — Code generation, debugging, refactoring. Bound to a code-specialized model when one is available.
- deep (Analysis) — Long-horizon analysis, research, problem solving that benefits from extended reasoning time.
Tier resolution priority
When more than one source wants to set the tier, the router resolves in this fixed order (highest wins):
- User override — a /model openai/gpt-5.1 or /tier coding command from the operator, per turn.
- Skill override — tier: coding in the skill frontmatter, for the duration of the skill.
- Dynamic escalation — the runtime upgrades a turn from balanced to a higher tier when it detects the request needs it. Only active when dynamicTierEnabled is true.
- Default — balancedModel.
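The priority order above can be sketched as a single resolution function. This is an illustrative sketch, not the runtime's actual API — the type and field names are assumptions:

```typescript
type Tier = "balanced" | "smart" | "coding" | "deep";

interface TierSources {
  userOverride?: Tier;      // from a /model or /tier command, per turn
  skillOverride?: Tier;     // from tier: ... in the skill frontmatter
  dynamicSuggestion?: Tier; // from the routing model, if escalation ran
}

// Fixed priority, highest first: user > skill > dynamic escalation > default.
function resolveTier(s: TierSources, dynamicTierEnabled: boolean): Tier {
  if (s.userOverride) return s.userOverride;
  if (s.skillOverride) return s.skillOverride;
  if (dynamicTierEnabled && s.dynamicSuggestion) return s.dynamicSuggestion;
  return "balanced";
}
```

Note that a dynamic suggestion is consulted only when dynamicTierEnabled is on, and any user or skill override short-circuits it entirely.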
Inspecting what the router picked
Look for Tier assignment: and Dynamic tier upgrade: in the runtime logs. Use /status and /tier in chat to see the current active tier. The Sessions page records tier for every turn.
Configuration
Configure in Settings → Model Router or in preferences/model-router.json. Model IDs use the provider/model format; the provider prefix must match a key in the LLM providers section.
{
  "modelRouter": {
    "routingModel": "openai/gpt-5.2-codex",
    "balancedModel": "openai/gpt-5.1",
    "smartModel": "anthropic/claude-sonnet-4-20250514",
    "codingModel": "openai/gpt-5.2",
    "deepModel": "anthropic/claude-opus-4-20250514",
    "dynamicTierEnabled": true,
    "temperature": 0.7
  }
}

Recommended defaults
If you are setting this up for the first time and do not yet know what load looks like, use the following and adjust after a week of real traffic.
- balanced — a mid-size general model from your primary provider (for example gpt-5.1). Most turns land here.
- coding — a code-specialized model from the same provider if available, otherwise the same as balanced.
- smart — leave unset initially; while smart is unset, dynamically escalated turns fall back to balanced. Set a real value only after you see turns that should have been escalated.
- deep — only set when you have a concrete use case (research skill, long-horizon auto mode). An unset deep tier falls back to smart, then balanced.
- dynamicTierEnabled — true. Turn it off only if you need strict cost control and want every turn to stay at its declared tier.
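The fallback chain described above (an unset deep falls back to smart, then balanced) can be sketched as a lookup over the config. The RouterConfig shape mirrors the JSON section above; the coding-to-balanced fallback is an assumption extrapolated from the recommended defaults, not a documented guarantee:

```typescript
type Tier = "balanced" | "smart" | "coding" | "deep";

// Mirrors the modelRouter config section; only balancedModel is required.
interface RouterConfig {
  balancedModel: string;
  smartModel?: string;
  codingModel?: string;
  deepModel?: string;
}

function modelFor(tier: Tier, cfg: RouterConfig): string {
  switch (tier) {
    case "deep":
      // deep falls back to smart, then balanced
      return cfg.deepModel ?? cfg.smartModel ?? cfg.balancedModel;
    case "smart":
      return cfg.smartModel ?? cfg.balancedModel;
    case "coding":
      // assumption: an unset coding tier falls back to balanced
      return cfg.codingModel ?? cfg.balancedModel;
    default:
      return cfg.balancedModel;
  }
}
```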
When to deviate
If you are running in a regulated environment with one approved provider, point all four tiers at models from that provider. If you are optimizing for cost, set dynamicTierEnabled: false and size tiers to the floor that works. If you are running a research workload, set deep to a reasoning-optimized model and use it explicitly from skills.
Dynamic escalation and skills
Dynamic escalation uses the routingModel (a small, fast model) to classify the incoming turn and decide whether to promote it. Escalation never downgrades — a user or skill override always wins.
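The never-downgrade rule can be expressed as keeping whichever tier ranks higher. A minimal sketch, assuming a total ordering of tiers (the relative rank of smart vs. coding here is an illustrative assumption, and overrides bypass this step entirely):

```typescript
const TIER_ORDER = ["balanced", "smart", "coding", "deep"] as const;
type Tier = (typeof TIER_ORDER)[number];

// Escalation only ever moves a turn up; a lower suggestion is ignored.
function applyEscalation(current: Tier, suggested: Tier): Tier {
  return TIER_ORDER.indexOf(suggested) > TIER_ORDER.indexOf(current)
    ? suggested
    : current;
}
```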
Skills that need a specific tier should declare it in frontmatter. See Skills for the full SKILL.md contract.
Related pages
- Configuration (User Guide) — How configuration is layered across workspace storage, runtime sections, and user preferences. The dashboard edits most of it.
- Skills (User Guide) — What a skill is, how sticky activation works, and the SKILL.md contract. For concrete recipes, see the Cookbook.
- Glossary (Reference) — One-line definitions for the terms that appear across the documentation. Use this when a word seems to mean something specific and you want the canonical meaning.