Configuration Guide
OpenHuman Model Routing Explained — Smart Model Selection for Every Task
OpenHuman's model routing system automatically assigns each task to a suitable model. Instead of sending everything to one expensive model, it chooses between a fast model, a reasoning model, and a vision model, which lowers cost while keeping quality on the tasks that need it.
How Model Routing Works
OpenHuman has three model tiers:
- Fast model — for quick responses, simple queries, and routine tasks
- Reasoning model — for complex analysis, coding, and deep thinking
- Vision model — for image understanding and OCR
Configuration
[model_routing]
fast_model = "gpt-4o-mini"
reasoning_model = "deepseek"
vision_model = "gpt-4o"
[models.gpt-4o-mini]
provider = "openai"
api_key = "sk-..."
model = "gpt-4o-mini"
[models.deepseek]
provider = "openai"
api_key = "sk-..."
base_url = "https://api.deepseek.com/v1"
model = "deepseek-chat"
Routing Logic
OpenHuman analyzes each request and routes it:
- Simple Q&A → fast model (cheapest)
- Code generation → reasoning model
- Email summarization → fast model
- Complex analysis → reasoning model
- Image questions → vision model
- Memory retrieval → fast model
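The mapping above can be pictured as a simple lookup table. The sketch below is illustrative only: the task labels, `TIER_FOR_TASK`, and `pick_model` are hypothetical names, and it assumes the request has already been classified. How OpenHuman actually classifies requests is not described here.

```python
# Hypothetical sketch of the routing table above; not OpenHuman's actual code.
# Tier names mirror the [model_routing] keys from the Configuration section.
ROUTING_CONFIG = {
    "fast_model": "gpt-4o-mini",
    "reasoning_model": "deepseek",
    "vision_model": "gpt-4o",
}

# Assumed task labels, one per row of the list above.
TIER_FOR_TASK = {
    "simple_qa": "fast_model",
    "code_generation": "reasoning_model",
    "email_summarization": "fast_model",
    "complex_analysis": "reasoning_model",
    "image_question": "vision_model",
    "memory_retrieval": "fast_model",
}


def pick_model(task_type: str, config: dict[str, str] = ROUTING_CONFIG) -> str:
    """Return the configured model name for an already-classified task.

    Unknown task types fall back to the fast tier. This fallback is an
    assumption made for the sketch, chosen because it is the cheapest tier.
    """
    tier = TIER_FOR_TASK.get(task_type, "fast_model")
    return config[tier]


if __name__ == "__main__":
    for task in ("simple_qa", "code_generation", "image_question"):
        print(f"{task} -> {pick_model(task)}")
```

Keeping the task-to-tier table separate from the tier-to-model config means that swapping a model only requires a config change, which matches the way the [model_routing] section is laid out.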
Cost Savings Example
With model routing, suppose 70% of your queries go to the fast model (gpt-4o-mini at $0.15/1M input tokens), 25% to the reasoning model (DeepSeek at $0.27/1M), and 5% to the vision model (gpt-4o at $2.50/1M). The weighted average is 0.70 × $0.15 + 0.25 × $0.27 + 0.05 × $2.50 ≈ $0.30/1M, compared to $2.50/1M if everything went to gpt-4o. That's roughly an 88% reduction in input costs.
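To rerun the estimate with your own traffic mix, the arithmetic can be written out as a small script. This is a sketch: the shares and per-tier prices are the figures quoted above, the baseline uses the $2.50/1M gpt-4o rate quoted above, and the function names are made up for illustration. Provider prices change, so check current rates before relying on the result.

```python
# Worked version of the cost estimate above. Traffic shares and prices are
# the figures quoted in the text (USD per 1M input tokens).
TRAFFIC_SHARE = {"fast": 0.70, "reasoning": 0.25, "vision": 0.05}
PRICE_PER_1M_INPUT = {"fast": 0.15, "reasoning": 0.27, "vision": 2.50}


def blended_rate(share: dict[str, float], price: dict[str, float]) -> float:
    """Weighted average input price in USD per 1M tokens."""
    if abs(sum(share.values()) - 1.0) > 1e-9:
        raise ValueError("traffic shares must sum to 1")
    return sum(share[tier] * price[tier] for tier in share)


def savings_vs(baseline: float, blended: float) -> float:
    """Fractional reduction relative to sending everything to one model."""
    return 1.0 - blended / baseline


if __name__ == "__main__":
    blended = blended_rate(TRAFFIC_SHARE, PRICE_PER_1M_INPUT)
    # Baseline: every request goes to gpt-4o at the rate quoted above.
    saved = savings_vs(PRICE_PER_1M_INPUT["vision"], blended)
    print(f"blended rate: ${blended:.4f}/1M")  # blended rate: $0.2975/1M
    print(f"savings: {saved:.1%}")             # savings: 88.1%
```

The estimate covers input tokens only. Output tokens are priced separately and would need their own shares and rates.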
Custom Routing Rules
You can override the automatic routing with explicit prefix rules:
[model_routing.rules]
conversation_prefix_reason = ["analyze", "compare", "debug", "explain"]
conversation_prefix_fast = ["hello", "thanks", "what", "when", "who"]
Using a Single Model
If you prefer simplicity, set all tiers to the same model:
[model_routing]
fast_model = "ollama"
reasoning_model = "ollama"
vision_model = "ollama"