Limits and quotas

This page documents the limits Mate enforces on jobs and tenants. Most limits are configurable in your tenant config or .mate.yml; a few are fixed by your plan.

Concurrency

Scope	Config key	Default
Per-tenant concurrent jobs	Plan capacity (number of “mates” on your plan)	Depends on plan
Per-tenant job cap via policy	`policies.concurrency`	`2`

Your plan determines how many jobs can run simultaneously across your entire tenant. policies.concurrency enforces an additional tenant-wide ceiling, which is useful for limiting resource consumption or preventing agents from conflicting with each other.

When the policies.concurrency limit is reached, new matching events are queued and wait as pending jobs. Once the per-tenant pending queue is full, excess events are rejected (see Queue backpressure below).

policies:
  concurrency: 1    # only one job at a time across the tenant

Job timeout

Config key	Default	Scope
`agents.<name>.timeout`	Inherits `policies.default_timeout`	Per-agent
`policies.default_timeout`	1 hour (when neither `timeout` nor `default_timeout` is set)	Per-tenant/repo

A job that runs longer than its timeout is canceled with reason timeout. Set timeout: on an individual agent to override the default for that agent:

agents:
  slow-refactor:
    backend: claude-code
    model: anthropic/claude-opus-4-7
    timeout: 60m
    # ...
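
As a minimal sketch of how the two timeout keys interact, the fragment below sets a default and overrides it for one agent. The quick-review agent is hypothetical, the 30m and 90m values are illustrative only, and the fragment assumes policies.default_timeout accepts the same duration format as timeout:

```yaml
policies:
  default_timeout: 30m    # assumed value; used by any agent without its own timeout

agents:
  quick-review:           # hypothetical agent; inherits the 30m default
    backend: claude-code
    # ...
  slow-refactor:
    backend: claude-code
    timeout: 90m          # assumed value; overrides the default for this agent only
    # ...
```

Under these assumptions, a quick-review job would be canceled with reason timeout after 30 minutes, while a slow-refactor job would be allowed to run for up to 90 minutes.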

No-progress watchdog

A job with no activity for 5 minutes is automatically canceled with reason timeout. “Activity” means any event received from the agent — a tool call, a log line, or a status transition.

This watchdog catches stuck agents that are still technically running but have made no progress. It cannot be disabled.

Cooldown between repeat triggers

policies.cooldown sets the minimum time that must elapse between consecutive job dispatches on the same repository. This prevents a fast-looping comment thread from spawning a cascade of jobs:

policies:
  cooldown: 5m    # no more than one job every 5 minutes per repo

By default there is no cooldown: each matching event dispatches immediately.

Per-job LLM spend cap (OpenRouter)

When you connect an OpenRouter provisioning key, you can set a per-job USD spend limit:

openrouter_default_limit_usd: 2.00

The cap is enforced server-side by OpenRouter: once the per-job key’s limit is reached, OpenRouter stops authorizing requests, and the job fails with LLM errors because the agent can no longer call the model. Mate does not auto-cancel the job with a cap_hit reason, and it does not revoke the per-job key immediately; keys are cleaned up after the job’s record expires. The limit applies to every job in your tenant; v1 has no per-agent spend cap setting.

In passthrough mode (no provisioning key), Mate does not enforce spend caps. Use your provider’s own rate limiting.

Queue backpressure

Mate queues incoming events up to a per-tenant limit. When the queue is full, new webhook deliveries are rejected with a 503 response. GitLab interprets repeated 503 responses as a degraded webhook and may temporarily disable it.

In practice, this limit is reached only under sustained high event volume combined with slow or stuck jobs. The backlog clears on its own once the underlying cause is resolved: running jobs complete, a cooldown is configured, or concurrency is tuned.
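
The settings mentioned above can be combined in a single policies block. This is a sketch only: the values are illustrative rather than recommended, and it assumes all three keys sit under policies as they do elsewhere on this page:

```yaml
policies:
  concurrency: 2          # the documented default ceiling on simultaneous jobs
  cooldown: 5m            # at most one job dispatch per repository every 5 minutes
  default_timeout: 30m    # assumed value; bounds how long a slow job can hold a slot
```

Which setting to adjust depends on what is causing the backlog. A shorter timeout limits how long a slow job holds a slot, while a cooldown reduces how often a busy repository dispatches new jobs.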

Container resources

Container resources are determined by your plan tier and are not configurable per-agent.

Plan	vCPU	Memory
Starter	1	2 GiB
Plus	1	2 GiB
Enterprise	2	4 GiB

The same resource allocation applies to both the agent and the container image build step (when building a devcontainer). If your devcontainer builds are memory-heavy, consider the Enterprise tier.

On-prem deployments use the customer’s own hardware; the resource allocation above does not apply.