OpenHuman Smart Token Compression (TokenJuice) — Reduce Costs by 80%

LLM tokens are expensive, and verbose tool output is where most of them go to die. A `git status` in a busy repo, a 600-message email thread, a `docker ps -a` against a real cluster: each can balloon a context window for almost no information gain.

OpenHuman ships with TokenJuice, a port of vincentkoc/tokenjuice integrated directly into the tool-execution path. Before any tool result reaches the model, TokenJuice strips the noise and keeps the signal.

💰 The result

Ingesting six months of email through a frontier model costs single-digit dollars instead of hundreds.

🔧 Three-layer rule overlay

Rules are JSON. They merge in this order, with later layers overriding earlier ones:

Layer	Path	Purpose
Builtin	shipped with binary	Sensible defaults for git, npm, cargo, docker, kubectl, ls, etc.
User	`~/.config/tokenjuice/rules/`	Your personal overrides, applied across every project
Project	`.tokenjuice/rules/`	Repo-specific overrides; check them in to share with your team
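
The override order in the table can be sketched as a simple map merge. This is only an illustration: the real rule schema is not shown here, so `RuleSet`, `merge_layers`, and `rules` are hypothetical names, and the sketch assumes a later layer replaces a whole rule that shares the same command pattern (TokenJuice may merge at a finer granularity).

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified rule set: command pattern -> strategy name.
/// The actual TokenJuice JSON schema is not described in this document.
type RuleSet = BTreeMap<String, String>;

/// Merge layers in the order given (builtin, user, project).
/// A later layer replaces any earlier rule with the same pattern.
fn merge_layers(layers: &[RuleSet]) -> RuleSet {
    let mut merged = RuleSet::new();
    for layer in layers {
        for (pattern, strategy) in layer {
            // `insert` overwrites an existing key, which gives us
            // "later layers override earlier ones".
            merged.insert(pattern.clone(), strategy.clone());
        }
    }
    merged
}

/// Convenience constructor used only for examples.
fn rules(pairs: &[(&str, &str)]) -> RuleSet {
    pairs
        .iter()
        .map(|(pattern, strategy)| (pattern.to_string(), strategy.to_string()))
        .collect()
}
```

Calling `merge_layers(&[builtin, user, project])` with a `git status` rule present in all three layers would keep the project layer's version, while rules defined only in the builtin layer survive untouched.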

Reduction strategies

Each rule names a tool/command pattern and a strategy:

Truncate — cut output after N lines
Dedup lines — remove repeated identical lines
Fold whitespace — collapse multiple blank lines
Drop matching regexes — remove lines matching patterns
Summarize sections — replace verbose sections with concise summaries

New rules are just JSON files. No recompile required.
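
To make the strategies listed above concrete, here is a minimal sketch of four of them over plain text. The `Strategy` enum and `reduce` function are hypothetical and not taken from `reduce.rs`. Two simplifications are assumed: dropping lines uses substring matching as a stand-in for regexes (to stay within the standard library), and dedup keeps the first occurrence of each line anywhere in the output. Section summarization is left out because it depends on section-aware logic this document does not describe.

```rust
use std::collections::HashSet;

/// Hypothetical strategy type mirroring the list above.
enum Strategy {
    /// Keep only the first N lines.
    Truncate(usize),
    /// Keep the first occurrence of each distinct line.
    DedupLines,
    /// Collapse runs of blank lines into a single blank line.
    FoldWhitespace,
    /// Drop lines containing any of these substrings
    /// (a stand-in for regex matching).
    DropContaining(Vec<String>),
}

fn reduce(output: &str, strategy: &Strategy) -> String {
    let lines: Vec<&str> = output.lines().collect();
    let kept: Vec<&str> = match strategy {
        Strategy::Truncate(n) => lines.into_iter().take(*n).collect(),
        Strategy::DedupLines => {
            let mut seen = HashSet::new();
            // `insert` returns false when the line was already seen.
            lines.into_iter().filter(|line| seen.insert(*line)).collect()
        }
        Strategy::FoldWhitespace => {
            let mut out = Vec::new();
            let mut prev_blank = false;
            for line in lines {
                let blank = line.trim().is_empty();
                if !(blank && prev_blank) {
                    out.push(line);
                }
                prev_blank = blank;
            }
            out
        }
        Strategy::DropContaining(needles) => lines
            .into_iter()
            .filter(|line| !needles.iter().any(|n| line.contains(n.as_str())))
            .collect(),
    };
    kept.join("\n")
}
```

Each variant is a pure function of the input text, which is what would let a rule be expressed as data rather than code.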

🔬 Where it lives in the pipeline

tool call result
      │
      ▼
TokenJuice (classify → match rule → reduce)
      │
      ▼
LLM context

Implementation lives in `src/openhuman/tokenjuice/` (`classify.rs`, `reduce.rs`, `rules/compiler.rs`, `tool_integration.rs`).
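
The classify → match rule → reduce flow in the diagram might look roughly like the sketch below. None of these names come from the actual source files: `ToolClass`, `Rule`, `classify`, `match_rule`, and `compress` are hypothetical, the classifier only inspects the first word of the command, and the sketch assumes output with no matching rule passes through unchanged.

```rust
/// Hypothetical tool classes; the real classifier is not shown here.
#[derive(Debug, PartialEq, Clone, Copy)]
enum ToolClass {
    Git,
    Docker,
    Unknown,
}

/// Step 1: classify a tool call by the first word of its command line.
fn classify(command: &str) -> ToolClass {
    match command.split_whitespace().next() {
        Some("git") => ToolClass::Git,
        Some("docker") => ToolClass::Docker,
        _ => ToolClass::Unknown,
    }
}

/// Hypothetical rule: a tool class plus a line budget.
struct Rule {
    class: ToolClass,
    max_lines: usize,
}

/// Step 2: find the first rule that applies to this class.
fn match_rule(class: ToolClass, rules: &[Rule]) -> Option<&Rule> {
    rules.iter().find(|rule| rule.class == class)
}

/// Step 3: reduce the output before it reaches the LLM context.
fn compress(command: &str, output: &str, rules: &[Rule]) -> String {
    match match_rule(classify(command), rules) {
        Some(rule) => output
            .lines()
            .take(rule.max_lines)
            .collect::<Vec<_>>()
            .join("\n"),
        // Assumption: unmatched output is passed through untouched.
        None => output.to_string(),
    }
}
```

Keeping the three steps as separate functions mirrors the file split named above, where classification, rule handling, and reduction live in different modules.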

📧 Why this matters for memory

TokenJuice is what makes auto-fetch economically viable. When the Gmail provider syncs 200 messages:

Each canonicalized email goes through TokenJuice compression
The compressed output is what gets passed to the model that builds summaries
The same applies to GitHub diffs, Slack dumps, and any other firehose source

Without it, auto-fetching multiple services would burn through API budgets quickly. With it, you can keep every source connected and synced for pennies.

🔍 Inspecting and overriding

Drop a JSON file in `~/.config/tokenjuice/rules/` for global overrides
Drop one in `.tokenjuice/rules/` inside a repo for project-specific rules
Run with `RUST_LOG=openhuman_core::openhuman::tokenjuice=debug` to see which rules match

🌐 Multi-byte text preservation

CJK characters, emoji, and other multi-byte text are preserved grapheme-by-grapheme — never stripped. You get the same information but at a fraction of the token cost.
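
As a rough illustration of why boundary-aware cutting matters, the sketch below truncates a string without ever splitting a multi-byte UTF-8 sequence. It is a deliberately simpler stand-in: it works on Unicode scalar values using only the standard library, not on grapheme clusters, so a multi-codepoint emoji sequence could still be cut in the middle. True grapheme-level handling needs Unicode segmentation rules (in Rust, typically a crate such as `unicode-segmentation`). The function name `truncate_chars` is hypothetical.

```rust
/// Keep at most `max_chars` Unicode scalar values.
/// The cut always falls on a character boundary, so CJK text and
/// single-codepoint emoji are never split into invalid bytes.
fn truncate_chars(text: &str, max_chars: usize) -> &str {
    // `char_indices` yields the byte offset of each character;
    // the offset of character number `max_chars` is where we cut.
    match text.char_indices().nth(max_chars) {
        Some((byte_idx, _)) => &text[..byte_idx],
        // Fewer than `max_chars` characters: nothing to cut.
        None => text,
    }
}
```

Slicing by raw byte count instead (for example `&text[..4]` on CJK text) would panic in Rust when the index lands inside a character, which is the failure mode a boundary-aware reducer avoids.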