Provider Selection
Choose how BitRouter ranks providers when a model is served by more than one — by price, latency, or throughput.
Most models on BitRouter are served by more than one provider. When you request openai/gpt-4o, BitRouter has to pick which registered endpoint to send the request to. By default it uses a balanced score; with the provider.sort field, you choose the policy explicitly.
There are three policies. Pick whichever matters most for the request.
The three policies
| Policy | Optimizes for | Tie-break |
|---|---|---|
price | Lowest cost per request, computed against your prompt and expected completion tokens at current upstream pricing. | Higher uptime → lower error rate → provider ID. |
latency | Lowest observed p50 TTFT (time to first token) over the rolling 1-hour window. | Higher throughput → higher uptime → provider ID. |
throughput | Highest observed output tokens per second over the rolling 1-hour window. | Lower TTFT → higher uptime → provider ID. |
Telemetry is refreshed every minute. The same data is visible on each model's page in the registry.
Quick example
curl http://localhost:8787/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "openai/gpt-4o",
"provider": { "sort": "latency" },
"messages": [{"role": "user", "content": "Translate to French: Hello."}]
}'The same provider.sort field works on /v1/messages (Anthropic) and /v1beta/models/{model}:generateContent (Google).
BYOK providers come first
If you've added an external key for a provider, BitRouter prefers that provider for any model it can serve — ahead of every non-BYOK provider, regardless of provider.sort. Your BYOK key bills against your own account at upstream list price with no rev share, and you opted into that provider explicitly; honoring that opt-in by default is the only choice that doesn't surprise you later.
Within the BYOK-eligible set, the provider.sort policy still applies. So provider.sort: "latency" plus BYOK keys for OpenAI and Anthropic ranks those two by TTFT first, and falls back to non-BYOK providers (also ranked by latency) only if both BYOK paths fail.
In local mode this section is a no-op — every provider is BYOK by definition.
Default behavior
When provider is not set, BitRouter ranks by a balanced score — a weighted combination of price, latency, throughput, and uptime, with low-uptime providers filtered out. This is the right default for most agents; specify a policy only when one axis dominates.
The default is not stable across versions. The weights in the balanced score are tuned over time as we learn from real traffic. If you need a fixed, reproducible policy — for cost reporting, SLO tracking, or A/B tests — set provider.sort explicitly.
How selection composes with fallback
Model fallback and provider selection are independent layers:
- For each model in your
modelslist (or the singlemodelif no fallback), BitRouter applies yourprovider.sortpolicy to pick the best provider. - If the chosen provider fails in a way that doesn't surface to the caller (rate limit, 5xx), BitRouter retries on the next-ranked provider of the same model before falling through to the next model in the list.
- The same
provider.sortpolicy applies to every model in the fallback list — you cannot specify a different policy per model.
Concretely: models: ["openai/gpt-4o", "anthropic/claude-sonnet-4-6"] with provider.sort: "price" evaluates the cheapest provider of GPT-4o first, then the cheapest provider of Sonnet, then surfaces the error.
When metrics are tied
If two providers price the same prompt identically, the higher-uptime one wins. If uptime is also tied, the lower-error-rate one wins. If everything is tied, BitRouter sorts by provider ID lexicographically — deterministic and audit-friendly, but it does not "load balance." If even spend distribution across tied providers matters for your workload, post a use case to Discord; we'll add a provider.balance knob if there's demand.
What's not here
OpenRouter exposes a much larger surface — provider.order, provider.allow_fallbacks, provider.require_parameters, provider.data_collection, provider.ignore, provider.quantizations, and more. We are deliberately keeping this to one knob with three values until usage tells us otherwise. Two equivalent expressions if you're migrating:
- Pin to a specific provider — use the provider-prefixed model ID, e.g.
model: "anthropic-direct/anthropic/claude-sonnet-4-6". - Exclude a provider — omit it from your workspace's registry allowlist, not the request body.
If a missing knob is blocking a real workload, file an issue on bitrouter.
How is this guide?
Last updated on