OpenHuman Model Routing Explained — Smart Model Selection for Every Task

2026-05-25 · ~7 min read

OpenHuman's model routing system assigns each task to the model tier best suited to it. Instead of sending everything to one expensive model, it chooses between fast, reasoning, and vision models, which lowers cost without sacrificing quality.

How Model Routing Works

OpenHuman has three model tiers:

  • Fast model — for quick responses, simple queries, and routine tasks
  • Reasoning model — for complex analysis, coding, and deep thinking
  • Vision model — for image understanding and OCR

Configuration

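# Assumed convention: each tier value names a [models.<name>] table below.
[model_routing]
fast_model = "gpt-4o-mini"
reasoning_model = "deepseek"
vision_model = "gpt-4o"

[models.gpt-4o-mini]
provider = "openai"
api_key = "sk-..."
model = "gpt-4o-mini"

[models.deepseek]
provider = "openai"  # DeepSeek exposes an OpenAI-compatible API
api_key = "sk-..."
base_url = "https://api.deepseek.com/v1"
model = "deepseek-chat"

[models.gpt-4o]
provider = "openai"
api_key = "sk-..."
model = "gpt-4o"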

Routing Logic

OpenHuman analyzes each request and routes it:

  • Simple Q&A → fast model (cheapest)
  • Code generation → reasoning model
  • Email summarization → fast model
  • Complex analysis → reasoning model
  • Image questions → vision model
  • Memory retrieval → fast model
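
The mapping above can be pictured as a simple lookup table. This is only an illustrative sketch: the guide does not describe how OpenHuman classifies a request, so the task-type labels, the `tier_for` function, and the fallback to the fast tier are all assumptions made for the example.

```python
# Hypothetical task-type labels mirroring the list above.
ROUTES = {
    "simple_qa": "fast",
    "code_generation": "reasoning",
    "email_summarization": "fast",
    "complex_analysis": "reasoning",
    "image_question": "vision",
    "memory_retrieval": "fast",
}


def tier_for(task_type: str, default: str = "fast") -> str:
    """Return the tier for a task type, falling back to `default` (assumed)."""
    return ROUTES.get(task_type, default)


print(tier_for("code_generation"))  # reasoning
print(tier_for("image_question"))   # vision
```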

Cost Savings Example

As an illustration, suppose 70% of your queries go to the fast model (gpt-4o-mini at $0.15/1M input tokens), 25% to reasoning (DeepSeek at $0.27/1M), and 5% to vision (gpt-4o at $2.50/1M), and that queries are roughly the same size. The weighted average is 0.70 × $0.15 + 0.25 × $0.27 + 0.05 × $2.50 ≈ $0.30/1M, compared to $2.50/1M if everything went to gpt-4o. That's roughly an 88% reduction in input costs.
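
The blended price can be recomputed for your own traffic mix with a few lines of Python. This sketch uses the per-tier prices and shares quoted above, takes the quoted gpt-4o input price as the single-model baseline, and assumes every query carries about the same number of input tokens. The function names are illustrative.

```python
# Input prices in USD per 1M tokens and traffic shares, as quoted in the text.
PRICES = {"fast": 0.15, "reasoning": 0.27, "vision": 2.50}
SHARES = {"fast": 0.70, "reasoning": 0.25, "vision": 0.05}


def blended_price(prices: dict[str, float], shares: dict[str, float]) -> float:
    """Weighted average price per 1M input tokens across tiers."""
    return sum(prices[tier] * shares[tier] for tier in shares)


def savings(blended: float, baseline: float) -> float:
    """Fractional reduction relative to a single-model baseline price."""
    return 1 - blended / baseline


blended = blended_price(PRICES, SHARES)
print(f"blended: ${blended:.4f}/1M")                              # blended: $0.2975/1M
print(f"savings: {savings(blended, PRICES['vision']):.1%}")       # savings: 88.1%
```

Changing the shares shows how sensitive the result is to the vision tier: even at 5% of traffic it contributes the largest term in the sum.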

Custom Routing Rules

You can override the automatic routing with explicit prefix rules, which list the leading words that send a conversation to the reasoning or fast tier:

[model_routing.rules]
conversation_prefix_reason = ["analyze", "compare", "debug", "explain"]
conversation_prefix_fast = ["hello", "thanks", "what", "when", "who"]
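
To make the intent of these rules concrete, here is a minimal sketch of prefix matching. It is not OpenHuman's implementation: it assumes a rule fires when the first word of the message equals a listed word (case-insensitive, trailing punctuation ignored), that reasoning rules are checked before fast rules, and that no match means the automatic routing applies. The function name `tier_from_prefix` is hypothetical.

```python
from typing import Optional

# Same values as the [model_routing.rules] table above.
RULES = {
    "conversation_prefix_reason": ["analyze", "compare", "debug", "explain"],
    "conversation_prefix_fast": ["hello", "thanks", "what", "when", "who"],
}


def tier_from_prefix(message: str, rules: dict = RULES) -> Optional[str]:
    """Return "reasoning" or "fast" if the first word matches a rule, else None."""
    words = message.strip().lower().split()
    if not words:
        return None
    first = words[0].strip(".,!?:;")
    # Assumed precedence: reasoning rules win over fast rules.
    if first in rules["conversation_prefix_reason"]:
        return "reasoning"
    if first in rules["conversation_prefix_fast"]:
        return "fast"
    return None  # no override; fall through to automatic routing


print(tier_from_prefix("Debug this stack trace"))  # reasoning
print(tier_from_prefix("Thanks!"))                 # fast
print(tier_from_prefix("Summarize my inbox"))      # None
```

Matching on the whole first word rather than a raw string prefix keeps "whatever" from triggering the "what" rule; a real implementation may behave differently.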

Using a Single Model

If you prefer simplicity, point all three tiers at the same model. The example assumes a model entry named "ollama" is defined elsewhere in your config:

[model_routing]
fast_model = "ollama"
reasoning_model = "ollama"
vision_model = "ollama"