English | 简体中文
Route prompts by difficulty, not habit.
UncommonRoute is a local LLM router that sits between your client and your model provider. It sends easy requests to cheaper models, hard requests to stronger models, and keeps a fallback chain ready when the first choice fails.
Built for real tools like Codex, Claude Code, Cursor, the OpenAI SDK, and OpenClaw.
Held-out routing benchmark: 92.3% accuracy · Average routing latency: ~0.5ms · Simulated coding-session savings vs always-Opus: 67%
Quick Start · Connect Your Client · Agent Quick Reference · How Routing Works
Most AI tools send every request to the same model.
That is simple, but it is usually wasteful:
- "What is 2+2?" does not need the same model as "Design a fault-tolerant distributed database".
- Tool-heavy agent loops often spend most of their time on boring middle steps.
- Switching your whole workflow to the most expensive model is simple, but costly.
UncommonRoute fixes that by making one local decision per request:
- Classify how difficult the request is.
- Pick a model for that difficulty and routing profile.
- Keep fallbacks ready if the upstream rejects or fails.
You keep one local endpoint. The router handles the model choice.
```
Your client
(Codex / Claude Code / Cursor / OpenAI SDK)
        |
        v
UncommonRoute
(runs on your machine)
        |
        v
Your upstream API
(Parallax / Commonstack / OpenAI / Ollama / vLLM / ...)
```
Important terms:
| Term | Plain-English meaning |
|---|---|
| Client | The thing you already use, like Codex or Claude Code |
| Upstream | The real model API that generates responses |
| Profile | A routing strategy like auto, eco, or premium |
| Tier | The difficulty bucket: SIMPLE, MEDIUM, COMPLEX, REASONING |
| Virtual model | A special model name like uncommon-route/auto that means "pick for me" |
The most important beginner fact: UncommonRoute does not host models. It routes requests to an upstream provider that you choose.
If you are brand new, follow these steps in order.
- Python 3.11 or newer
- A terminal
- For real chat responses: one upstream API
Good upstream choices:
- Commonstack if you want one key that can reach multiple providers
- OpenAI if you already use OpenAI directly
- Parallax / Ollama / vLLM if you want to route to a local OpenAI-compatible server
```bash
pip install uncommon-route
```

Or use the installer:

```bash
curl -fsSL https://anjieyang.github.io/uncommon-route/install | bash
```

This step does not need an API key.
```bash
uncommon-route route "write a Python function that validates email addresses"
uncommon-route debug "prove that sqrt(2) is irrational"
```

What this proves:
- the package is installed
- the local classifier works
- the router can choose a tier and model
What this does not prove:
- your upstream is configured
- your client can talk through the proxy
Pick one example and export the environment variables.
```bash
# Commonstack: one key, many providers
export UNCOMMON_ROUTE_UPSTREAM="https://api.commonstack.ai/v1"
export UNCOMMON_ROUTE_API_KEY="csk-..."

# OpenAI direct
export UNCOMMON_ROUTE_UPSTREAM="https://api.openai.com/v1"
export UNCOMMON_ROUTE_API_KEY="sk-..."

# Parallax scheduler endpoint (experimental local OpenAI-style upstream)
export UNCOMMON_ROUTE_UPSTREAM="http://127.0.0.1:3001/v1"

# Other local OpenAI-compatible servers (Ollama, vLLM, etc.)
export UNCOMMON_ROUTE_UPSTREAM="http://127.0.0.1:11434/v1"
```

If your upstream does not need a key, you can skip `UNCOMMON_ROUTE_API_KEY`.
Parallax is listed as experimental here: its public docs and source clearly expose POST /v1/chat/completions, but I could not find a public /v1/models route, so UncommonRoute model discovery may be limited.
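Before starting the proxy, you can sanity-check that the exports above are actually visible to new processes. This is a minimal sketch, not part of UncommonRoute; the helper name `missing_upstream_config` is ours.

```python
import os

def missing_upstream_config(env) -> list:
    """Return the names of required variables that are unset or empty.

    Only UNCOMMON_ROUTE_UPSTREAM is strictly required; the API key is
    optional for keyless local upstreams.
    """
    required = ["UNCOMMON_ROUTE_UPSTREAM"]
    return [name for name in required if not env.get(name)]

missing = missing_upstream_config(os.environ)
if missing:
    print("Set these before `uncommon-route serve`:", ", ".join(missing))
else:
    print("Upstream configured:", os.environ["UNCOMMON_ROUTE_UPSTREAM"])
```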
```bash
uncommon-route serve
```

If your upstream is configured, you should see a banner with:
- the upstream host
- the local proxy URL
- the dashboard URL
- a quick health-check command
If your upstream is not configured yet, the banner tells you exactly which export commands to run next.
```bash
uncommon-route doctor
curl http://127.0.0.1:8403/health
```

`doctor` is the first command to run when anything feels off.
If you are using a local upstream like Ollama or vLLM, make sure that local server is already running before you expect doctor to pass the reachability check.
Pick the client you already use:
| If you use | Do this |
|---|---|
| Codex | `uncommon-route setup codex` |
| Claude Code | `uncommon-route setup claude-code` |
| OpenAI SDK / Cursor | `uncommon-route setup openai` |
| OpenClaw | `openclaw plugins install @anjieyang/uncommon-route` |
Each setup command prints the exact next step for your shell or client.
You only need one of these sections.
Codex · OpenAI-compatible local routing for Codex
```bash
uncommon-route setup codex
```

That command prints the exact shell config to add. Manually, the important part is:

```bash
export OPENAI_BASE_URL="http://localhost:8403/v1"
export OPENAI_API_KEY="not-needed"
```

Then:

```bash
uncommon-route serve
codex
```

For smart routing, use:

```
model = "uncommon-route/auto"
```
Claude Code · Anthropic-style local routing for Claude Code
```bash
uncommon-route setup claude-code
```

Manually, the important part is:

```bash
export ANTHROPIC_BASE_URL="http://localhost:8403"
export ANTHROPIC_API_KEY="not-needed"
```

Then:

```bash
uncommon-route serve
claude
```

Claude Code talks to the Anthropic-style `/v1/messages` endpoint. UncommonRoute converts formats and handles smart routing automatically.
OpenAI SDK / Cursor · One local OpenAI-compatible base URL
```bash
uncommon-route setup openai
```

Python example:

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://127.0.0.1:8403/v1",
    api_key="not-needed",
)

response = client.chat.completions.create(
    model="uncommon-route/auto",
    messages=[{"role": "user", "content": "hello"}],
)
```

Cursor users can point "OpenAI Base URL" to `http://localhost:8403/v1`.
OpenClaw · Plugin-based integration
```bash
openclaw plugins install @anjieyang/uncommon-route
```

The plugin handles dependency installation, proxy startup, and registration.
If you are wiring UncommonRoute into another tool, script, or agent loop, this is the minimum contract to know.
| Client type | Base URL |
|---|---|
| OpenAI-compatible clients | http://127.0.0.1:8403/v1 |
| Anthropic-style clients | http://127.0.0.1:8403 |
| Model ID | What it means |
|---|---|
| `uncommon-route/auto` | Balanced default |
| `uncommon-route/eco` | Cheapest capable model first |
| `uncommon-route/premium` | Quality-first routing |
| `uncommon-route/free` | Free-first, then cheapest capable fallback |
| `uncommon-route/agentic` | Tool-heavy workflow routing |
```bash
uncommon-route route --json --no-feedback "summarize this log file"
uncommon-route doctor
uncommon-route stats
uncommon-route logs --follow
```

Routing headers to inspect on proxied responses:

- `x-uncommon-route-model`
- `x-uncommon-route-tier`
- `x-uncommon-route-profile`
- `x-uncommon-route-step`
- `x-uncommon-route-reasoning`
| Endpoint | Why you would use it |
|---|---|
| `GET /health` | Basic liveness and config status |
| `GET /v1/models` | Virtual models exposed by the router |
| `GET /v1/models/mapping` | Internal model names mapped to upstream names |
| `GET /v1/stats` | Routing analytics summary |
| `POST /v1/stats` | Reset routing analytics |
| `GET /v1/stats/recent` | Recent routed requests and feedback state |
| `GET /v1/selector` | Inspect selector state and live routing preferences |
| `POST /v1/selector` | Preview routing for a prompt or request body |
| `GET /dashboard/` | Human-friendly monitoring UI |
Your integration is "live" when all of these are true:
- `uncommon-route doctor` shows the upstream and key are configured
- `GET /health` returns `{"status": "ok", ...}`
- routed requests include `x-uncommon-route-model` and `x-uncommon-route-tier`
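The last two conditions can be folded into one small predicate (a sketch; `integration_live` is our name, and it assumes you pass lower-cased header names — the `doctor` check stays on the CLI):

```python
def integration_live(health: dict, headers: dict) -> bool:
    """True when the health body and routing headers look correct.

    health:  parsed JSON body of GET /health
    headers: response headers from one routed request, lower-cased keys
    """
    return (
        health.get("status") == "ok"
        and "x-uncommon-route-model" in headers
        and "x-uncommon-route-tier" in headers
    )

# Hypothetical values for illustration:
print(integration_live(
    {"status": "ok"},
    {"x-uncommon-route-model": "moonshot/kimi-k2.5",
     "x-uncommon-route-tier": "SIMPLE"},
))  # True
```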
CLI · Inspect routing locally without sending a real upstream request
Use the CLI when you want to inspect routing locally without sending a real request upstream.
```bash
uncommon-route route "what is 2+2"
uncommon-route route --json --no-feedback "design a distributed database"
uncommon-route debug "explain quicksort"
```

What each command is for:

- `route`: get the chosen tier, model, savings estimate, and fallback chain
- `route --json`: same information in machine-readable form
- `debug`: see the feature breakdown behind the classification
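If you consume `route --json` from a script, a tiny parser like this pulls out the decision. A sketch only: it assumes the JSON carries `tier` and `model` keys, mirroring the SDK's decision attributes (`decision.tier`, `decision.model`); adjust for your installed version.

```python
import json

def summarize_decision(raw: str) -> str:
    """One-line summary of `uncommon-route route --json` output.

    Assumes top-level "tier" and "model" keys in the JSON payload.
    """
    decision = json.loads(raw)
    return "{tier} -> {model}".format(
        tier=decision["tier"], model=decision["model"]
    )

# Hypothetical payload for illustration:
raw = '{"tier": "COMPLEX", "model": "google/gemini-3.1-pro"}'
print(summarize_decision(raw))  # COMPLEX -> google/gemini-3.1-pro
```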
Python SDK · Call the router directly inside Python
Use the SDK when you want routing decisions directly inside Python.
```python
from uncommon_route import classify, route

decision = route("explain the Byzantine Generals Problem")
print(decision.model)
print(decision.tier)
print(decision.confidence)

result = classify("hello")
print(result.tier)
print(result.signals)
```

HTTP Proxy · Put UncommonRoute in front of real clients and apps
Use the proxy when you want real applications to send requests through UncommonRoute.
```bash
uncommon-route serve --port 8403
```

OpenAI-compatible example:

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://127.0.0.1:8403/v1",
    api_key="not-needed",
)

response = client.chat.completions.create(
    model="uncommon-route/auto",
    messages=[{"role": "user", "content": "hello"}],
)
```

Non-virtual model names are passed through unchanged, so you can still target a specific model when you want to.
After starting the proxy, open:
http://127.0.0.1:8403/dashboard/
The dashboard shows:
- request counts, latency, cost, and savings
- tier and model distribution
- upstream transport and cache behavior
- live routing configuration
- active sessions
- spend limits and recent usage
Useful local commands:
```bash
uncommon-route doctor
uncommon-route serve --daemon
uncommon-route stop
uncommon-route logs
uncommon-route logs --follow
uncommon-route sessions
uncommon-route stats
```

Background mode writes to:

- PID: `~/.uncommon-route/serve.pid`
- Logs: `~/.uncommon-route/serve.log`
| Variable | Default | Meaning |
|---|---|---|
| `UNCOMMON_ROUTE_UPSTREAM` | — | Upstream OpenAI-compatible API URL |
| `UNCOMMON_ROUTE_API_KEY` | — | API key for the upstream provider |
| `UNCOMMON_ROUTE_PORT` | `8403` | Local proxy port |
| `UNCOMMON_ROUTE_DISABLED` | `false` | Disable routing and act as a passthrough |
| `UNCOMMON_ROUTE_COMPOSITION_CONFIG` | — | Path to a composition-policy JSON file |
| `UNCOMMON_ROUTE_COMPOSITION_CONFIG_JSON` | — | Inline composition-policy JSON |
If you have direct API keys for providers and want the router to prefer those models, register them:
```bash
uncommon-route provider add openai sk-your-openai-key
uncommon-route provider add anthropic sk-ant-your-key
uncommon-route provider list
```

BYOK keys are verified on add when possible. Provider config is stored at `~/.uncommon-route/providers.json`.
You can override the default model table per profile and tier:
```bash
uncommon-route config show
uncommon-route config set-tier auto SIMPLE moonshot/kimi-k2.5 --fallback google/gemini-2.5-flash-lite,deepseek/deepseek-chat
uncommon-route config set-tier premium COMPLEX anthropic/claude-opus-4.6 --fallback anthropic/claude-sonnet-4.6 --mode hard-pin
uncommon-route config reset-tier auto SIMPLE
```

Use `--mode hard-pin` when you want a tier to stay on the configured primary model unless that model actually fails upstream.
Set safety limits to stop runaway cost:
```bash
uncommon-route spend set per_request 0.10
uncommon-route spend set hourly 5.00
uncommon-route spend set daily 20.00
uncommon-route spend set session 3.00
uncommon-route spend status
uncommon-route spend history
```

When a limit is hit, the proxy returns HTTP 429 with `reset_in_seconds`.

Spending data is stored at `~/.uncommon-route/spending.json`.
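A client can honor the `reset_in_seconds` hint before retrying. This is a minimal sketch; it assumes the field sits at the top level of the 429 body, which may differ across versions.

```python
def retry_delay(status_code: int, body: dict, default: float = 60.0) -> float:
    """Seconds to wait before retrying after a spend-limit rejection.

    Assumes the 429 body carries `reset_in_seconds` at the top level.
    Returns 0.0 for any non-429 response; falls back to `default` if
    the field is missing.
    """
    if status_code != 429:
        return 0.0
    return float(body.get("reset_in_seconds", default))

print(retry_delay(429, {"reset_in_seconds": 12}))  # 12.0
print(retry_delay(200, {}))                        # 0.0
```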
You do not need to understand every internal detail to use the tool, but this mental model helps.
| Tier | Typical requests | Default primary |
|---|---|---|
| `SIMPLE` | greetings, short lookups, basic translation | `moonshot/kimi-k2.5` |
| `MEDIUM` | code tasks, explanations, summaries | `moonshot/kimi-k2.5` |
| `COMPLEX` | multi-constraint design and implementation work | `google/gemini-3.1-pro` |
| `REASONING` | proofs, derivations, hard mathematical reasoning | `xai/grok-4-1-fast-reasoning` |
| Profile | Best for |
|---|---|
| `auto` | balanced default |
| `eco` | lowest expected cost |
| `premium` | quality-first |
| `free` | free-first, then cheapest capable fallback |
| `agentic` | tool-heavy workflows |
The selector considers:
- profile preferences
- estimated token cost
- observed latency and reliability
- cache affinity
- explicit user feedback
- BYOK and free/local biases
By default, sessions:
- hold on to an already-adequate model within a task
- upgrade when a task becomes harder
- avoid needless downgrade churn
- expire after 30 minutes of inactivity
Tool-heavy workflows often contain cheap middle steps.
UncommonRoute detects cases like:
- tool selection
- tool-result follow-up
- general chat turns
That allows it to use cheaper tool-capable models for boring steps and save stronger reasoning models for the turns that actually need them.
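As a toy illustration only (this is not the router's actual detector), the three cases above roughly correspond to message shapes like these:

```python
def agent_step_kind(message: dict) -> str:
    """Classify an agent-loop turn into the three cases listed above.

    Illustrative heuristic only; the router's real detector is internal
    and certainly more nuanced.
    """
    if message.get("role") == "tool":
        return "tool-result follow-up"
    if message.get("tool_calls"):
        return "tool selection"
    return "general chat turn"

print(agent_step_kind({"role": "tool", "content": "{...}"}))
print(agent_step_kind({"role": "assistant", "tool_calls": [{"id": "call_1"}]}))
print(agent_step_kind({"role": "user", "content": "hello"}))
```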
If you are new, these are the mistakes people hit most often.
`uncommon-route route ...` is a local routing decision. It does not call your upstream.
If real chat requests fail:
- check `UNCOMMON_ROUTE_UPSTREAM`
- check `UNCOMMON_ROUTE_API_KEY` if your provider needs one
- run `uncommon-route doctor`
For OpenAI-style tools, `OPENAI_BASE_URL` must end with `/v1`:

```bash
export OPENAI_BASE_URL="http://localhost:8403/v1"
```

For Anthropic-style tools, `ANTHROPIC_BASE_URL` should point at the router root, not `/v1`:

```bash
export ANTHROPIC_BASE_URL="http://localhost:8403"
```

Start here:

```bash
uncommon-route doctor
```

That one command usually tells you what is missing.
Once the basics are working, these are the features that make the router more powerful.
Different upstreams use different model IDs. UncommonRoute fetches /v1/models, maps internal names to upstream names, and retries through the fallback chain if the first model is unavailable.
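The retry behavior can be pictured as a walk down the fallback chain. A sketch under stated assumptions: `first_available` is our name, and a real check would consult the ID list from the upstream's `/v1/models` endpoint.

```python
def first_available(chain, is_available):
    """Return the first model in the fallback chain the upstream accepts.

    `is_available` is any predicate, e.g. membership in the model-ID set
    reported by the upstream.
    """
    for model in chain:
        if is_available(model):
            return model
    raise RuntimeError("no model in the fallback chain is available upstream")

# Hypothetical upstream that lacks the primary model:
upstream_models = {"google/gemini-2.5-flash-lite", "deepseek/deepseek-chat"}
print(first_available(
    ["moonshot/kimi-k2.5", "google/gemini-2.5-flash-lite", "deepseek/deepseek-chat"],
    upstream_models.__contains__,
))  # google/gemini-2.5-flash-lite
```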
Useful commands:
```bash
uncommon-route doctor
curl http://127.0.0.1:8403/v1/models/mapping
```

Very large tool outputs are not always forwarded verbatim.
The proxy can:
- compact oversized text and JSON
- offload large tool results into local artifacts
- create semantic side-channel summaries
- checkpoint long histories
- rehydrate `artifact://...` references on demand

Artifacts are stored under `~/.uncommon-route/artifacts/`.

Useful response headers:

- `x-uncommon-route-input-before`
- `x-uncommon-route-input-after`
- `x-uncommon-route-artifacts`
- `x-uncommon-route-semantic-calls`
- `x-uncommon-route-semantic-fallbacks`
- `x-uncommon-route-checkpoints`
- `x-uncommon-route-rehydrated`
When routing lands on an Anthropic-family model and the upstream supports it, UncommonRoute can preserve Anthropic-native transport and caching semantics while still serving OpenAI-style clients normally.
The classifier is local, not a SaaS black box. You can retrain it on your own benchmark data:
```bash
python - <<'PY'
from uncommon_route.router.classifier import train_and_save_model
train_and_save_model("bench/data/train.jsonl")
PY
```

Two questions matter:
- Does the router classify difficulty correctly?
- Does that save real money in a realistic coding session?
Evaluated on 763 hand-written prompts across 15 languages and 35 categories.
| Metric | UncommonRoute | ClawRouter | NotDiamond (cost) |
|---|---|---|---|
| Accuracy | 92.3% | 52.6% | 46.1% |
| Weighted F1 | 92.3% | 47.0% | 38.0% |
| Latency / request | 0.5ms | 0.6ms | 37.6ms |
| MEDIUM F1 | 88.7% | 43.6% | 6.2% |
| REASONING F1 | 97.8% | 61.7% | 0.0% |
Simulated on a 131-request agent coding session and compared against always sending every request to anthropic/claude-opus-4.6.
| Metric | Always Opus | UncommonRoute |
|---|---|---|
| Total cost | $1.7529 | $0.5801 |
| Cost saved | — | 67% |
| Quality retained | 100% | 93.5% |
| Routing accuracy | — | 90.8% |
```bash
cd ../router-bench && python -m router_bench.run
```

```
├── uncommon_route/          # Core package
│   ├── router/              # Classifier + selector + model table
│   ├── proxy.py             # ASGI proxy (OpenAI + Anthropic endpoints)
│   ├── session.py           # Session persistence + escalation
│   ├── spend_control.py     # Spending limits
│   ├── providers.py         # BYOK provider management
│   ├── feedback.py          # Online feedback loop
│   ├── composition.py       # Tool-result compaction / checkpointing
│   ├── artifacts.py         # Local artifact storage
│   ├── stats.py             # Routing analytics
│   └── static/              # Built dashboard assets
├── frontend/dashboard/      # Dashboard source
├── openclaw-plugin/         # OpenClaw integration
├── tests/                   # Unit + integration + end-to-end tests
├── bench/                   # Benchmark data and training scripts
├── scripts/install.sh       # Installer
└── pyproject.toml           # Packaging and dependencies
```
```bash
git clone https://github.com/anjieyang/UncommonRoute.git
cd UncommonRoute
pip install -e ".[dev]"
python -m pytest tests/ -v
```

MIT — see LICENSE.