Models and BYOK

Mate does not operate an LLM proxy or mark up token costs. You connect your own LLM provider key, and jobs consume tokens directly against your account. Mate’s billing covers compute capacity (concurrent agent slots) only.

Billing v1 is BYOK-only. You bring an OpenRouter or Anthropic key; Mate passes it through to the agent without any surcharge. The console shows a “Managed LLM” option that will be available after general availability — it is not active today.

You own the provider relationship. Per-job spend caps (described below) are the mechanism for controlling cost on your side.

The model: field is set on each agent individually. This means different triggers can use different models:

agents:
  triage:
    backend: claude-code
    model: anthropic/claude-haiku-4-5   # fast, cheap; for labelling and triage
    system_prompt: |
      Triage this issue: add a priority label and a one-sentence summary.
    permissions: [read, comment]
  reviewer:
    backend: claude-code
    model: anthropic/claude-opus-4-7    # strong; for code review
    system_prompt: |
      You are a senior code reviewer. Be thorough.
    permissions: [read, comment]
triggers:
  - on: issue_assigned
    agent: triage
  - on: mr_opened
    agent: reviewer

Model IDs depend on which provider you connect.

OpenRouter — use the full provider-prefixed model ID as listed in the OpenRouter model catalog:

anthropic/claude-sonnet-4-6
anthropic/claude-opus-4-7
anthropic/claude-haiku-4-5

Anthropic direct — use the bare Anthropic model ID:

claude-sonnet-4-6
claude-opus-4-7
claude-haiku-4-5
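
The two conventions are easy to mix up when switching providers. The following is a minimal lint sketch, assuming only the rule stated above (OpenRouter IDs carry a provider prefix, Anthropic-direct IDs do not). The function name and the provider labels `openrouter` and `anthropic` are hypothetical and not part of Mate.

```python
from typing import Optional


def model_id_problem(provider: str, model_id: str) -> Optional[str]:
    """Return a description of a prefix problem, or None if the ID shape looks plausible.

    Checks only the prefix convention; it does not confirm that the model exists.
    The provider labels are illustrative, not Mate configuration values.
    """
    if provider not in ("openrouter", "anthropic"):
        raise ValueError(f"unknown provider label: {provider!r}")

    has_prefix = "/" in model_id
    if provider == "openrouter" and not has_prefix:
        return f"{model_id!r} has no provider prefix; OpenRouter expects a prefixed ID"
    if provider == "anthropic" and has_prefix:
        # Keep everything after the first slash as the candidate bare ID.
        bare = model_id.split("/", 1)[1]
        return f"{model_id!r} is provider-prefixed; Anthropic direct expects {bare!r}"
    return None
```

A check like this only catches the wrong shape for the connected provider. It cannot tell whether the ID names a real model.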

Model IDs are not validated at config-load time. When an event arrives, Mate performs a best-effort check against OpenRouter’s public model catalog (this check is silently skipped if the catalog cannot be fetched); Anthropic-direct IDs are not validated against any Anthropic list at all. In practice, copy the model ID exactly as it appears in your provider’s documentation — a wrong ID typically surfaces as job errors at run time, not at config save time.
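
Because validation happens late, it can help to check OpenRouter IDs yourself before saving the config. The sketch below is one way to do that, not Mate's own check. It assumes the public OpenRouter model listing is served at the URL shown and returns JSON shaped like `{"data": [{"id": ...}]}`; all function names are hypothetical.

```python
import json
import urllib.request

# Assumed location of OpenRouter's public model catalog.
CATALOG_URL = "https://openrouter.ai/api/v1/models"


def parse_catalog_ids(payload: dict) -> set[str]:
    """Extract model IDs from a catalog response shaped like {"data": [{"id": ...}]}."""
    return {entry["id"] for entry in payload.get("data", []) if "id" in entry}


def fetch_catalog_ids(url: str = CATALOG_URL, timeout: float = 10.0) -> set[str]:
    """Download the catalog and return the set of model IDs it lists."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_catalog_ids(json.load(resp))


def unknown_model_ids(configured: list[str], catalog_ids: set[str]) -> list[str]:
    """Return the configured IDs that the catalog does not list, preserving order."""
    return [model for model in configured if model not in catalog_ids]
```

Calling `unknown_model_ids(my_models, fetch_catalog_ids())` before committing a change flags typos earlier than a failed job would. It does not help with Anthropic-direct IDs, which have no equivalent check here.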

Add your LLM API key in the console under Settings → LLM credentials. The key is stored encrypted (AES-256-GCM) and is write-only — once saved, the console never displays it again. To rotate the key, overwrite it by saving a new value.

OpenRouter sub-keys and per-job spend caps

When you connect an OpenRouter key with provisioning rights (an OpenRouter Management API key), Mate mints a short-lived sub-key for each job at dispatch time. The sub-key is scoped to a configurable per-job USD spend limit:

# In tenant config (set via console):
openrouter_default_limit_usd: 2.00 # per-job cap; null = no cap

When the cap is reached, OpenRouter enforces the limit server-side by refusing further requests — the job fails with provider errors at that point. Mate does not maintain a separate spend monitor or issue a cap_hit cancel. The sub-key is revoked by background cleanup after the job’s record expires. Your OpenRouter provisioning (management) key never enters the agent’s container.
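
The cap bounds spend per job, not per day, so total exposure scales with the number of jobs. The sketch below shows that arithmetic, assuming every job runs under a sub-key with the same limit and that OpenRouter enforces the limit exactly. The function and its parameters are illustrative; only `openrouter_default_limit_usd` comes from the config above.

```python
from decimal import Decimal
from typing import Optional


def spend_upper_bound(limit_usd: Optional[str], job_count: int) -> Optional[Decimal]:
    """Upper bound in USD on LLM spend for job_count jobs under a per-job cap.

    limit_usd mirrors openrouter_default_limit_usd, passed as a string to avoid
    float rounding. None stands for null (no cap): no bound exists, so None is returned.
    """
    if job_count < 0:
        raise ValueError("job_count must be non-negative")
    if limit_usd is None:
        return None
    return Decimal(limit_usd) * job_count
```

With a cap of 2.00 and 5 concurrent agent slots, `spend_upper_bound("2.00", 5)` gives 10.00 USD as the most that in-flight jobs can spend at once. Passing a day's job count instead gives a daily bound.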

If you do not configure a provisioning key, your LLM credential is passed through to the agent directly (passthrough mode). In this mode, by design, your provider API key is the credential the agent process uses inside the container. Two secrets never enter the container: the OpenRouter provisioning key (used only by Mate’s control plane to mint sub-keys) and your GitLab bot PAT. In passthrough mode Mate does not enforce per-job spend caps; you control cost through your provider’s own rate limits and quotas.

| Mode | How to activate | Per-job sub-key | Spend cap enforcement |
| --- | --- | --- | --- |
| OpenRouter with provisioning key | Set Management API key in console | Yes, minted per job and revoked by background cleanup | Yes, via OpenRouter sub-key limit |
| OpenRouter passthrough | Set regular OpenRouter key in agent env | No | No, provider-side only |
| Anthropic direct | Set Anthropic key in agent env | No | No, provider-side only |

Update the model: field in your agent definition (via the console config editor or by committing a change to .mate.yml). The change takes effect on the next job dispatch — in-flight jobs are not affected.