Ollama is the easiest way to run open models locally. It serves an OpenAI-compatible API at http://localhost:11434/v1, so it drops into bitrouter.yaml as one provider block.

Prerequisites

BitRouter installed, with a bitrouter.yaml (scaffold one with bitrouter init).

Ollama running, with a model pulled:

ollama serve &          # default port 11434
ollama pull llama3.1

Add Ollama to BitRouter

# bitrouter.yaml
providers:
  ollama:
    api_base: http://localhost:11434/v1
    api_protocol:
      - "*": chat_completions
    models:
      - id: llama3.1

Each models entry is a name you've pulled with ollama pull. List what's available with ollama list.

No API key. Ollama accepts anonymous loopback requests, so the block has no api_key. (Its own SDK examples pass a dummy "ollama" key only because the OpenAI client library demands a non-empty string — BitRouter doesn't.)

Route to it

bitrouter route ollama:llama3.1

Then start BitRouter and send a request. Use the provider-qualified id ollama:llama3.1 to pin the request, or the bare llama3.1 to let BitRouter cascade.

Learn more

Ollama — OpenAI compatibility
Model fallback — fail over from local Ollama to a hosted model.

Ollama

Prerequisites

Add Ollama to BitRouter

Route to it

Learn more

On this page