Unsloth
Serve a model with Unsloth Studio and register its OpenAI-compatible endpoint as a BitRouter provider.
Unsloth is best known for fast fine-tuning, and Unsloth Studio also serves models locally behind an OpenAI-compatible API — handy for routing a model you just fine-tuned, or any Unsloth GGUF. The server exposes /v1/chat/completions (and an Anthropic /v1/messages surface), which BitRouter fronts as one provider block.
Prerequisites
-
BitRouter installed, with a
bitrouter.yaml(scaffold one withbitrouter init). -
Unsloth Studio serving a model:
unsloth run --model unsloth/gemma-3-27b-it-GGUF:UD-Q4_K_XLThis starts the server, opens the Studio UI, and prints your endpoint URL and
sk-unsloth-…API key — note both. The port is typically8000or8888.
Add Unsloth to BitRouter
# bitrouter.yaml
providers:
unsloth:
api_base: http://localhost:8000/v1 # use the port Studio printed
api_protocol:
- "*": chat_completions
api_key: ${UNSLOTH_API_KEY}
models:
- id: unsloth/gemma-3-27b-it-GGUFUnsloth requires a key. Unlike Ollama / vLLM / LM Studio, Unsloth Studio authenticates every request with an Authorization: Bearer sk-unsloth-… header. Export the key it printed and reference it in the block:
export UNSLOTH_API_KEY=sk-unsloth-xxxxxxxxxxxxapi_key resolves from the environment at load time.
Confirm the exact model id the server reports — it's what goes under models:
curl http://localhost:8000/v1/models -H "Authorization: Bearer $UNSLOTH_API_KEY"Port clash with vLLM. Unsloth Studio and vLLM both default to :8000. If you run both, point api_base at whichever port Studio actually printed.
Route to it
bitrouter route unsloth:unsloth/gemma-3-27b-it-GGUFThen start BitRouter and send a request.
Learn more
How is this guide?