Sub-agent

Delegate a self-contained task to a focused worker model mid-generation — the worker sees only what you give it and returns its final result.

Sub-agent is a server tool: BitRouter runs it mid-generation instead of handing the call back to your client. The calling model hands off a self-contained task — task_name and task_description — to a focused worker model (typically a cheaper or faster one), which works in isolation and returns only its final result. Use it to fan out grunt work without spending the main model's context on it.

Quick start

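Enable the worker by declaring subagent in the request's tools array. The declaration pins the worker model and its instructions; at call time, the calling model fills in the task. The same request is shown first for the OpenAI Responses API (/v1/responses), then for the Anthropic Messages API (/v1/messages):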

curl https://api.bitrouter.ai/v1/responses \
  -H "Authorization: Bearer $BITROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-opus-4.8",
    "input": "Summarize each of these 20 support tickets, then group them by theme.",
    "tools": [
      {
        "type": "subagent",
        "model": "anthropic/claude-haiku-4.5",
        "instructions": "You are a concise summarizer. Return one sentence per ticket."
      }
    ]
  }'

curl https://api.bitrouter.ai/v1/messages \
  -H "Authorization: Bearer $BITROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-opus-4.8",
    "max_tokens": 4096,
    "messages": [
      {"role": "user", "content": "Summarize each of these 20 support tickets, then group them by theme."}
    ],
    "tools": [
      {
        "type": "subagent",
        "name": "subagent",
        "model": "anthropic/claude-haiku-4.5",
        "instructions": "You are a concise summarizer. Return one sentence per ticket."
      }
    ]
  }'

Protocol support. The subagent declaration works on the OpenAI Responses API (/v1/responses) and the Anthropic Messages API (/v1/messages), which both carry server tools on the wire. The Chat Completions tools array only accepts {type:"function"} entries and cannot carry the declaration.
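
The protocol split described above can be captured in a small request builder. This is a minimal Python sketch, not an official client: the function name build_request and the protocol labels "responses" and "messages" are hypothetical, and the payload fields simply mirror the two curl examples in the quick start.

```python
import json

# Endpoint paths named in this guide. Chat Completions is deliberately absent
# because its tools array cannot carry the subagent declaration.
ENDPOINTS = {
    "responses": "/v1/responses",
    "messages": "/v1/messages",
}


def build_request(protocol, model, prompt, worker_model, instructions):
    """Return (path, payload) for a request that declares the subagent tool.

    Hypothetical helper: the field layout copies the quick-start examples.
    """
    if protocol not in ENDPOINTS:
        raise ValueError(
            f"{protocol!r} cannot carry a subagent declaration; "
            f"use one of {sorted(ENDPOINTS)}"
        )
    tool = {"type": "subagent", "model": worker_model, "instructions": instructions}
    if protocol == "responses":
        payload = {"model": model, "input": prompt, "tools": [tool]}
    else:
        # The Messages example also names the tool and sets max_tokens.
        tool["name"] = "subagent"
        payload = {
            "model": model,
            "max_tokens": 4096,
            "messages": [{"role": "user", "content": prompt}],
            "tools": [tool],
        }
    return ENDPOINTS[protocol], payload


path, payload = build_request(
    "messages",
    model="anthropic/claude-opus-4.8",
    prompt="Summarize each of these 20 support tickets, then group them by theme.",
    worker_model="anthropic/claude-haiku-4.5",
    instructions="You are a concise summarizer. Return one sentence per ticket.",
)
print(path)
print(json.dumps(payload, indent=2))
```

The resulting payload can be sent with any HTTP client, using the same Authorization and Content-Type headers as the curl examples.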

How it works

When the calling model invokes the tool, it supplies:

Argument          Description
task_name         A short identifier for the task.
task_description  The full, self-contained task: context, inputs, and expected output.

The worker model sees only the task_description, along with any instructions set in the declaration, and no other conversation context. It runs the task, and only its final result is returned to the calling model. Because the worker is isolated, the calling model must put everything the worker needs into the description.
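
To make "self-contained" concrete, here is a rough Python sketch of the arguments a calling model might emit. Only the argument names task_name and task_description come from this guide; the helper make_task, the section labels inside the description, and the sample tickets are illustrative assumptions.

```python
import json


def make_task(task_name, context, inputs, expected_output):
    """Pack everything the worker needs into one self-contained description.

    Hypothetical helper: the "Context / Inputs / Expected output" layout is an
    assumption, chosen to match the three parts listed in the argument table.
    """
    lines = [
        f"Context: {context}",
        "Inputs:",
        *[f"{i}. {item}" for i, item in enumerate(inputs, start=1)],
        f"Expected output: {expected_output}",
    ]
    return {"task_name": task_name, "task_description": "\n".join(lines)}


# Made-up tickets, included only to show the shape of the arguments.
args = make_task(
    task_name="summarize_tickets",
    context="Support tickets for a billing product.",
    inputs=["Invoice PDF fails to download.", "Charged twice for one month."],
    expected_output="One sentence per ticket, numbered to match the inputs.",
)
print(json.dumps(args, indent=2))
```

Because the worker never sees the original conversation, the ticket text itself is copied into the description rather than referred to as "the tickets above".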

Configuration

Declared on the tools-array entry:

Field         Type    Description
model         string  The worker model. Defaults to the parent request model.
instructions  string  System instructions for the worker.
tools         array   Provider server tools the worker may use (e.g. web search), in provider-namespaced declaration form.
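
The fields above could be assembled with a helper along these lines. This is a sketch under assumptions: subagent_tool is a hypothetical name, the guide only states a default for model, so treating instructions and tools as optional is a guess, and the name field appears only in the Messages API quick-start example.

```python
def subagent_tool(model=None, instructions=None, tools=None, name=None):
    """Build a subagent tools-array entry, omitting fields that are not set.

    Hypothetical helper. Leaving out model relies on the documented default
    (the parent request model); optionality of the other fields is assumed.
    """
    entry = {"type": "subagent"}
    for key, value in (("name", name), ("model", model), ("instructions", instructions)):
        if value is None:
            continue
        if not isinstance(value, str):
            raise TypeError(f"{key} must be a string, got {type(value).__name__}")
        entry[key] = value
    if tools is not None:
        if not isinstance(tools, list):
            raise TypeError("tools must be a list of tool declarations")
        # Passed through untouched: the provider-namespaced form is not
        # specified in this guide, so no example value is given here.
        entry["tools"] = list(tools)
    return entry


minimal = subagent_tool()
pinned = subagent_tool(
    model="anthropic/claude-haiku-4.5",
    instructions="You are a concise summarizer. Return one sentence per ticket.",
)
print(minimal)
print(pinned)
```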

Bounded by the server-tool loop. A sub-agent turn runs inside a bounded loop. The defaults are 10 tool rounds, 30 s per tool call, and a 120 s total budget per turn.
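
How the three limits interact can be illustrated with a toy loop. This is a simplified Python model, not BitRouter's implementation: LoopLimits, run_bounded, and the status strings are hypothetical, and what the server actually returns when a limit is reached is not covered in this guide. Only the default numbers come from the text above.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class LoopLimits:
    max_rounds: int = 10            # tool rounds per turn
    per_tool_seconds: float = 30.0  # ceiling for a single tool call
    total_seconds: float = 120.0    # budget for the whole turn


def run_bounded(step, limits=LoopLimits(), clock=time.monotonic):
    """Call step(timeout) until it returns a result or a limit is hit.

    step returns None to request another round.
    Returns a (status, rounds_used) tuple.
    """
    start = clock()
    for round_no in range(1, limits.max_rounds + 1):
        remaining = limits.total_seconds - (clock() - start)
        if remaining <= 0:
            return ("budget_exhausted", round_no - 1)
        # A tool call gets at most 30 s, and never more than the budget left.
        result = step(min(limits.per_tool_seconds, remaining))
        if result is not None:
            return ("done", round_no)
    return ("round_limit", limits.max_rounds)


# Simulated clock so the demo runs instantly.
now = [0.0]


def slow_tool(timeout):
    now[0] += timeout  # every call uses its full timeout and never finishes
    return None


print(run_bounded(slow_tool, clock=lambda: now[0]))  # ('budget_exhausted', 4)
```

In this model, a worker whose tool calls each take the full 30 s exhausts the 120 s budget after four rounds, well before the 10-round cap, so slow tools are bounded by time and fast tools by the round count.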

On Cloud

Sub-agent is enabled and managed on BitRouter Cloud, with per-run cost for the worker calls visible in the request log. Self-hosters enable it with server_tools.subagent: true; it is then advertised per-request only when the caller declares it.

See also

  • Advisor — consult a stronger model for guidance instead of delegating a task
  • Fusion — deliberate across a panel of models on one prompt
  • Provider selection — control which provider serves the worker model
