OpenHuman Guide

OpenHuman Model Routing Deep Dive — How Tasks Get the Right AI Model Automatically

One of OpenHuman's standout features is automatic model routing. Instead of picking a model by hand, you let the system assign one based on what your current task requires.

🎯 Why model routing matters

Different AI tasks have different model requirements:

  • Code review / deep analysis: Needs a strong reasoning model (DeepSeek-V3, Claude Sonnet)
  • Casual chat / simple queries: A fast cheap model is sufficient (GPT-4o-mini, Groq Llama)
  • Image analysis: Needs a multimodal vision model (GPT-4o, Gemini 2.5 Pro)

Without routing, you either use expensive models for everything (wasting money) or rely on cheap models that can't handle complex tasks. OpenHuman's solution is to identify the task type automatically and assign the best-suited model.

⚙️ Three routing categories

🧠 Reasoning

Strong reasoning model. For: deep analysis, code review, complex reasoning, long document summaries.

⚡ Fast

Cheap/fast model. For: conversation, simple queries, quick generation, classifications.

👁️ Vision

Vision-capable model. For: image analysis, screenshot understanding, document OCR.

You can also force a specific route using hints in conversation:

  • hint:reasoning — force reasoning model
  • hint:fast — force fast model
  • hint:vision — force vision model
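
As a rough illustration of how such hints could be detected, the sketch below pulls a `hint:<route>` token out of a message and returns the forced route together with the cleaned text. The function name `extract_hint` and the exact matching rules (case-insensitive, first hint wins) are assumptions made for this example, not OpenHuman's documented behavior.

```python
import re
from typing import Optional, Tuple

# Route names come from the guide; everything else here is a hypothetical sketch.
ROUTES = ("reasoning", "fast", "vision")
HINT_PATTERN = re.compile(r"\bhint:(" + "|".join(ROUTES) + r")\b", re.IGNORECASE)


def extract_hint(message: str) -> Tuple[Optional[str], str]:
    """Return (forced_route, message_without_hint).

    forced_route is None when the message carries no recognized hint.
    """
    match = HINT_PATTERN.search(message)
    if match is None:
        return None, message
    route = match.group(1).lower()
    # Remove only the first hint, then collapse leftover whitespace.
    cleaned = HINT_PATTERN.sub("", message, count=1)
    cleaned = " ".join(cleaned.split())
    return route, cleaned
```

For example, `extract_hint("hint:reasoning review this diff")` yields the route `"reasoning"` and the text `"review this diff"`, while a message without a hint is passed through unchanged so the automatic analysis can decide.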

🔄 The decision pipeline

When you send a request to OpenHuman, the routing decision works like this:

  1. Analyze request: The system determines the request type — complex reasoning or casual chat? Image attached?
  2. Match route: Based on analysis, select the corresponding model from the route table
  3. Execute: The selected model processes the request
  4. Return: Results are compressed via TokenJuice before being returned to the UI

The entire process is transparent — you just send a message and the optimal model is selected automatically behind the scenes.
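
The four steps above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not OpenHuman's actual implementation: the keyword heuristic in `analyze`, the model identifier strings in `ROUTE_TABLE`, and the `call_model` and `compress` callables (the latter standing in for TokenJuice) are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Request:
    text: str
    images: List[str] = field(default_factory=list)


# Hypothetical route table; the identifiers are placeholders based on
# model names mentioned in this guide.
ROUTE_TABLE: Dict[str, str] = {
    "reasoning": "claude-sonnet",
    "fast": "gpt-4o-mini",
    "vision": "gpt-4o",
}

# Assumed heuristic: these words suggest a task that needs deeper reasoning.
REASONING_KEYWORDS = ("review", "analyze", "prove", "summarize", "debug")


def analyze(request: Request) -> str:
    """Step 1: classify the request as 'vision', 'reasoning', or 'fast'."""
    if request.images:
        return "vision"
    lowered = request.text.lower()
    if len(lowered) > 2000 or any(k in lowered for k in REASONING_KEYWORDS):
        return "reasoning"
    return "fast"


def handle(
    request: Request,
    call_model: Callable[[str, str], str],
    compress: Callable[[str], str],
) -> str:
    route = analyze(request)                # 1. analyze request
    model = ROUTE_TABLE[route]              # 2. match route
    raw = call_model(model, request.text)   # 3. execute
    return compress(raw)                    # 4. compress, then return to the UI
```

Passing `call_model` and `compress` in as arguments keeps the sketch runnable without any network access; a real router would presumably use a learned or rule-based classifier rather than a keyword list.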

🏠 Local AI fallback

By default, model routing uses OpenHuman's managed backend (one subscription covers all models). But you can also configure local AI:

  • Run local models via Ollama or LM Studio
  • Local models handle low-sensitivity tasks (summarization, classification) first
  • The routing system automatically directs lightweight workloads to the local model, while complex tasks still go to the cloud
  • Hybrid mode: local models handle most daily tasks, frontier models handle high-reasoning tasks
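
One way to picture hybrid mode is a small dispatcher that prefers the local model for lightweight task types and otherwise uses the cloud. The sketch below is an assumption-laden illustration: the task labels in `LOCAL_TASKS`, the function names, and the behavior of falling back to the cloud when the local server is unreachable are hypothetical, and the two model arguments are plain callables rather than real Ollama or LM Studio clients.

```python
from typing import Callable

# Assumed set of task types considered light enough for a local model.
LOCAL_TASKS = {"summarization", "classification"}


def choose_backend(task_type: str, local_available: bool) -> str:
    """Pick 'local' for lightweight tasks when a local model is configured."""
    if local_available and task_type in LOCAL_TASKS:
        return "local"
    return "cloud"


def run(
    task_type: str,
    prompt: str,
    local_model: Callable[[str], str],
    cloud_model: Callable[[str], str],
    local_available: bool = True,
) -> str:
    if choose_backend(task_type, local_available) == "local":
        try:
            return local_model(prompt)
        except ConnectionError:
            # Assumption: if the local server cannot be reached,
            # fall through to the cloud model instead of failing.
            pass
    return cloud_model(prompt)
```

With this shape, a `"classification"` request stays on the local machine when one is available, while a `"code_review"` request always goes to the cloud.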

📊 Auto-routing vs Manual selection

Dimension | Auto routing | Manual selection
Convenience | ✅ No thinking required | Must switch each time
Cost control | ✅ Uses cheap models automatically | Easy to overuse expensive models
Flexibility | Automatic | ✅ Full control
Recommendation | ✅ Most users | Dev/tuning scenarios