OpenHuman Model Routing Deep Dive — How Tasks Get the Right AI Model Automatically
One of OpenHuman's standout features is automatic model routing. You don't need to pick a model manually: the system assigns one based on what your current task requires.
🎯 Why model routing matters
Different AI tasks have different model requirements:
- Code review / deep analysis: Needs a strong reasoning model (DeepSeek-V3, Claude Sonnet)
- Casual chat / simple queries: A fast cheap model is sufficient (GPT-4o-mini, Groq Llama)
- Image analysis: Needs a multimodal vision model (GPT-4o, Gemini 2.5 Pro)
Without routing, you either use expensive models for everything (wasting money) or rely on cheap models that can't handle complex tasks. OpenHuman's solution is to identify the task type automatically and assign the optimal model.
⚙️ Three routing categories
🧠 Reasoning
Strong reasoning model. For: deep analysis, code review, complex reasoning, long document summaries.
⚡ Fast
Cheap/fast model. For: conversation, simple queries, quick generation, classifications.
👁️ Vision
Vision-capable model. For: image analysis, screenshot understanding, document OCR.
You can also force a specific route using hints in conversation:
- hint:reasoning forces the reasoning model
- hint:fast forces the fast model
- hint:vision forces the vision model
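To make the hint mechanism concrete, here is a minimal Python sketch of how a hint could be pulled out of a message. The three route names come from the list above. Everything else is an assumption for illustration, not OpenHuman's actual parser: the function name `extract_hint`, the case-insensitive matching, and the choice to strip the hint from the text.

```python
import re
from typing import Optional, Tuple

# Route names taken from the article; the parsing rules below are assumptions.
ROUTES = ("reasoning", "fast", "vision")
HINT_PATTERN = re.compile(r"\bhint:(" + "|".join(ROUTES) + r")\b", re.IGNORECASE)


def extract_hint(message: str) -> Tuple[Optional[str], str]:
    """Return (forced_route, message_without_hint).

    forced_route is None when the message carries no recognized hint.
    """
    match = HINT_PATTERN.search(message)
    if match is None:
        return None, message
    route = match.group(1).lower()
    # Remove the first hint token, then collapse the whitespace it leaves behind.
    cleaned = HINT_PATTERN.sub("", message, count=1)
    cleaned = " ".join(cleaned.split())
    return route, cleaned
```

With this sketch, a message such as "hint:reasoning review this diff" yields the route "reasoning" and the remaining text "review this diff", while a message without a hint falls through to automatic routing.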
🔄 The decision pipeline
When you send a request to OpenHuman, the routing decision works like this:
- Analyze request: The system determines the request type — complex reasoning or casual chat? Image attached?
- Match route: Based on analysis, select the corresponding model from the route table
- Execute: The selected model processes the request
- Return: Results compressed via TokenJuice before returning to the UI
The entire process is transparent — you just send a message and the optimal model is selected automatically behind the scenes.
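The four steps can be sketched in Python as follows. This is an illustration under stated assumptions, not OpenHuman's implementation. The keyword heuristic, the length threshold, the `Request` type, and the route table contents are all hypothetical; the model names are simply examples mentioned earlier in this article. The `execute` and `compress` callables stand in for the model call and for TokenJuice, whose real interfaces are not described here.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical route table; model names are examples from the article, not real config.
ROUTE_TABLE: Dict[str, str] = {
    "reasoning": "Claude Sonnet",
    "fast": "GPT-4o-mini",
    "vision": "GPT-4o",
}

# Hypothetical signals that a request needs deeper reasoning.
REASONING_KEYWORDS = ("review", "analyze", "debug", "prove")
LONG_TEXT_THRESHOLD = 2000  # characters; an arbitrary illustrative cutoff


@dataclass
class Request:
    text: str
    image_paths: List[str] = field(default_factory=list)


def analyze(request: Request) -> str:
    """Step 1: classify the request as vision, reasoning, or fast."""
    if request.image_paths:
        return "vision"
    lowered = request.text.lower()
    if len(lowered) > LONG_TEXT_THRESHOLD or any(k in lowered for k in REASONING_KEYWORDS):
        return "reasoning"
    return "fast"


def match_route(route: str) -> str:
    """Step 2: look up the model for the chosen route."""
    return ROUTE_TABLE[route]


def handle(
    request: Request,
    execute: Callable[[str, Request], str],
    compress: Callable[[str], str],
) -> str:
    """Steps 3 and 4: run the selected model, then compress the result."""
    model = match_route(analyze(request))
    raw_result = execute(model, request)
    return compress(raw_result)
```

A real classifier would likely be more sophisticated than keyword matching. The point of the sketch is the shape of the pipeline: classification and model lookup are separate steps, so the route table can change without touching the analysis logic.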
🏠 Local AI fallback
By default, model routing uses OpenHuman's managed backend (one subscription covers all models). But you can also configure local AI:
- Run local models via Ollama or LM Studio
- Local models handle low-sensitivity tasks (summarization, classification) first
- The routing system automatically directs lightweight workloads to the local model, while complex tasks still go to the cloud
- Hybrid mode: local models handle most daily tasks, frontier models handle high-reasoning tasks
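The local-versus-cloud decision described above could look roughly like this in Python. The task labels "summarization" and "classification" come from the list above. The `Backend` type, the `choose_backend` function, and the `local_available` flag are hypothetical names introduced only for this sketch.

```python
from dataclasses import dataclass

# Task types the article names as suitable for local models.
LOCAL_TASKS = frozenset({"summarization", "classification"})


@dataclass(frozen=True)
class Backend:
    name: str
    is_local: bool


# Hypothetical backend descriptors for illustration.
LOCAL = Backend(name="local model (Ollama or LM Studio)", is_local=True)
CLOUD = Backend(name="managed cloud backend", is_local=False)


def choose_backend(task_type: str, local_available: bool) -> Backend:
    """Prefer the local model for lightweight tasks; otherwise use the cloud."""
    if local_available and task_type in LOCAL_TASKS:
        return LOCAL
    # Complex tasks, or any task when no local model is configured, go to the cloud.
    return CLOUD
```

In this sketch the cloud is the default, so an unconfigured or unavailable local model never blocks a request.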
📊 Auto-routing vs Manual selection
| Dimension | Auto routing | Manual selection |
|---|---|---|
| Convenience | ✅ No thinking required | Must switch each time |
| Cost control | ✅ Uses cheap models automatically | Easy to overuse expensive models |
| Flexibility | Limited to route hints | ✅ Full control |
| Recommendation | ✅ Most users | Dev/tuning scenarios |