How task-aware routing works
Each request is classified by task type (code, planning, writing, etc.) and routed to a model optimized for that task. No manual model selection required.
The problem with hardcoding models
- `model: "gpt-4"` everywhere
- No fallback on errors
- Manual retry logic
- 429 rate limit errors
- Provider outages break loops
- High latency for simple tasks
- Overpaying for formatting
- Wrong model for the task
- No visibility into costs
ModelPilot: task classification + model routing in one API.
Why routing matters
Different tasks need different models. Code → DeepSeek. Writing → Claude. Simple tasks → GPT-4o-mini.
How it works
Four-step pipeline: classify, match, execute, fallback.
1. Task classification
The prompt is analyzed and classified into a task type: code generation, planning, summarization, extraction, creative writing, Q&A, and so on.
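ModelPilot's classifier isn't shown here, but the contract is easy to sketch: prompt in, task type out. The keyword rules below are purely illustrative assumptions, not the real classifier (which would typically be a small model):

```typescript
type TaskType = "code" | "summarization" | "creative" | "qa";

// Toy heuristic standing in for the real classifier.
// Same contract: takes a prompt, returns a task type.
function classifyTask(prompt: string): TaskType {
  const p = prompt.toLowerCase();
  if (/\b(function|bug|refactor|compile|regex)\b/.test(p)) return "code";
  if (/\b(summarize|tl;dr|recap)\b/.test(p)) return "summarization";
  if (/\b(story|poem|blog post)\b/.test(p)) return "creative";
  return "qa"; // default bucket for everything else
}
```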
2. Model matching
The task type is mapped to models with strong benchmark scores in that category. Code → DeepSeek/Codestral. Writing → Claude. Reasoning → o1/GPT-4.
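In code, the mapping is just a lookup from task type to a ranked candidate list. The model IDs below are illustrative placeholders, not ModelPilot's actual routing table:

```typescript
type TaskType = "code" | "writing" | "reasoning" | "simple";

// Illustrative table; a real one is rebuilt as benchmark scores change.
const candidates: Record<TaskType, string[]> = {
  code: ["deepseek-chat", "codestral-latest"],
  writing: ["claude-sonnet"], // placeholder model IDs
  reasoning: ["o1", "gpt-4"],
  simple: ["gpt-4o-mini"],
};

function matchModels(task: TaskType): string[] {
  return candidates[task];
}
```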
3. Weighted selection
The final model is selected using your router weights: cost, latency, quality, and optionally carbon impact. Weights are configurable per router.
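A weighted selection like this can be sketched as a simple linear score over normalized per-model stats. The scoring formula is an assumption for illustration, not ModelPilot's published algorithm:

```typescript
// All stats normalized to [0, 1], higher is better
// (so a cheap model has a high cost score).
interface ModelStats { id: string; cost: number; latency: number; quality: number }
interface Weights { cost: number; latency: number; quality: number }

// Pick the candidate with the highest weighted score.
function selectModel(models: ModelStats[], w: Weights): string {
  let best = models[0];
  let bestScore = -Infinity;
  for (const m of models) {
    const score = w.cost * m.cost + w.latency * m.latency + w.quality * m.quality;
    if (score > bestScore) { bestScore = score; best = m; }
  }
  return best.id;
}
```

A quality-heavy router would set `quality` near 1; a batch-processing router might invert that and weight `cost` instead.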
4. Automatic fallbacks
If the model returns an error (429, 500, timeout), the request is retried with a fallback model. Configurable retry count and escalation chain.
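The fallback step amounts to walking an escalation chain until a call succeeds. A minimal sketch, with `call` standing in for the actual provider request (names here are assumptions, not the SDK's API):

```typescript
// Try each model in order; a thrown error (429, 500, timeout)
// escalates to the next model in the chain.
async function withFallbacks(
  models: string[],
  call: (model: string) => Promise<string>,
): Promise<string> {
  let lastError: unknown;
  for (const model of models) {
    try {
      return await call(model);
    } catch (err) {
      lastError = err; // escalate to next fallback model
    }
  }
  throw lastError; // whole chain exhausted
}
```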
What you get
Task-aware routing removes the failure modes of hardcoded model selection: automatic fallbacks instead of hand-rolled retry logic, the right model for each task instead of one model everywhere, lower latency and cost on simple tasks, and visibility into what you're spending.
Start routing in 2 minutes
Create a router, change your baseURL, done. Free tier includes $5 in credits.
npm install modelpilot • OpenAI SDK compatible
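Because the endpoint is OpenAI SDK compatible, switching an existing app over is essentially a `baseURL` change. A minimal sketch; the URL and router ID below are placeholders, not documented values — use the ones from your dashboard:

```typescript
import OpenAI from "openai";

// Point the standard OpenAI SDK at ModelPilot instead of api.openai.com.
const client = new OpenAI({
  baseURL: "https://api.modelpilot.example/v1", // placeholder endpoint
  apiKey: process.env.MODELPILOT_API_KEY,
});

const res = await client.chat.completions.create({
  model: "my-router", // a router ID instead of a hardcoded model name
  messages: [{ role: "user", content: "Refactor this function to be pure." }],
});
console.log(res.choices[0].message.content);
```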