Limits and quotas
This page documents the limits Mate enforces on jobs and tenants. Most limits are configurable in your tenant config or .mate.yml; a few are fixed by your plan.
Concurrency
| Scope | Config key | Default |
|---|---|---|
| Per-tenant concurrent jobs | Plan capacity (number of “mates” on your plan) | Depends on plan |
| Per-tenant job cap via policy | policies.concurrency | 2 |
Your plan determines how many jobs can run simultaneously across your entire tenant. policies.concurrency enforces an additional tenant-wide ceiling — useful to limit resource consumption or prevent agents from conflicting with each other.
When the policies.concurrency limit is reached, new matching events are queued (they wait as pending jobs). When the per-tenant pending-queue caps fill up, excess events are rejected (see Queue backpressure below).
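The interaction between the two ceilings can be illustrated with a minimal sketch. It is a model of the behavior described above, not Mate's implementation. The names `Admission`, `effective_limit`, and `admit` are hypothetical, and the pending-queue cap is left out here.

```python
from enum import Enum


class Admission(Enum):
    RUN = "run"      # a slot is free, so the job starts now
    QUEUE = "queue"  # the ceiling is reached, so the event waits as a pending job


def effective_limit(plan_capacity: int, policy_concurrency: int) -> int:
    """Return the stricter of the plan capacity and policies.concurrency."""
    return min(plan_capacity, policy_concurrency)


def admit(running_jobs: int, plan_capacity: int, policy_concurrency: int) -> Admission:
    """Decide whether a new matching event runs immediately or waits (hypothetical helper)."""
    if running_jobs < effective_limit(plan_capacity, policy_concurrency):
        return Admission.RUN
    return Admission.QUEUE
```

For example, on a plan with 5 mates and the default `policies.concurrency` of 2, `admit(2, 5, 2)` yields `Admission.QUEUE`: the third event waits even though the plan has spare capacity.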
```yaml
policies:
  concurrency: 1  # only one job at a time across the tenant
```

Job timeout
| Config key | Default | Scope |
|---|---|---|
| agents.<name>.timeout | Inherits policies.default_timeout | Per-agent |
| policies.default_timeout | 1 hour (when neither timeout nor default_timeout is set) | Per-tenant/repo |
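The fallback order in the table can be sketched as a small function. This is an illustration only: `effective_timeout` and `FALLBACK_TIMEOUT` are hypothetical names, and the sketch assumes that configured durations have already been parsed into `timedelta` values.

```python
from datetime import timedelta
from typing import Optional

# Used when neither agents.<name>.timeout nor policies.default_timeout is set.
FALLBACK_TIMEOUT = timedelta(hours=1)


def effective_timeout(
    agent_timeout: Optional[timedelta],
    default_timeout: Optional[timedelta],
) -> timedelta:
    """Resolve a job's timeout: agent setting first, then the policy default, then 1 hour."""
    if agent_timeout is not None:
        return agent_timeout
    if default_timeout is not None:
        return default_timeout
    return FALLBACK_TIMEOUT
```

An agent with its own timeout ignores the policy default. An agent without one inherits it. With neither set, the job gets one hour.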
A job that runs longer than its timeout is canceled with reason timeout. Set timeout: on an individual agent to override the default for that agent:
```yaml
agents:
  slow-refactor:
    backend: claude-code
    model: anthropic/claude-opus-4-7
    timeout: 60m
    # ...
```

No-progress watchdog
A job with no activity for 5 minutes is automatically canceled with reason timeout. “Activity” means any event received from the agent: a tool call, a log line, or a status transition.
This watchdog catches stuck agents that are still technically running but have made no progress. It cannot be disabled.
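Both the job timeout and the watchdog end a job with reason timeout, so a long configured timeout does not protect a silent job. The sketch below models that combination. `should_cancel` is a hypothetical helper, and the exact boundary comparisons (`>` versus `>=`) are assumptions, not documented behavior.

```python
from datetime import datetime, timedelta

# Fixed by the platform according to the text above; not configurable.
NO_PROGRESS_LIMIT = timedelta(minutes=5)


def should_cancel(
    started_at: datetime,
    last_event_at: datetime,
    now: datetime,
    timeout: timedelta,
) -> bool:
    """Return True if the job would be canceled with reason timeout."""
    ran_too_long = now - started_at > timeout                # job timeout exceeded
    went_silent = now - last_event_at >= NO_PROGRESS_LIMIT   # no agent events for 5 minutes
    return ran_too_long or went_silent
```

A job that started at 12:00 with a 1 hour timeout and last emitted an event at 12:04 would be canceled by 12:10, well before its timeout.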
Cooldown between repeat triggers
policies.cooldown sets the minimum time that must elapse between consecutive job dispatches on the same repository. This prevents a fast-looping comment thread from spawning a cascade of jobs:
```yaml
policies:
  cooldown: 5m  # no more than one job every 5 minutes per repo
```

Default is no cooldown (each matching event dispatches immediately).
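A per-repository cooldown can be modeled as a gate that remembers the last dispatch time for each repository. This is a sketch of the documented rule, not Mate's code. `CooldownGate` and `try_dispatch` are hypothetical names, and the sketch does not model what happens to an event that arrives during the cooldown.

```python
from datetime import datetime, timedelta
from typing import Dict, Optional


class CooldownGate:
    """Tracks the last dispatch per repository (illustrative model only)."""

    def __init__(self, cooldown: Optional[timedelta]) -> None:
        self.cooldown = cooldown  # None models the default: no cooldown
        self._last_dispatch: Dict[str, datetime] = {}

    def try_dispatch(self, repo: str, now: datetime) -> bool:
        """Return True and record the dispatch if the repo is outside its cooldown."""
        last = self._last_dispatch.get(repo)
        if self.cooldown is not None and last is not None and now - last < self.cooldown:
            return False
        self._last_dispatch[repo] = now
        return True
```

With a 5 minute cooldown, a second event on the same repository 2 minutes after a dispatch is held back, while an event on a different repository at the same moment goes through.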
Per-job LLM spend cap (OpenRouter)
When you connect an OpenRouter provisioning key, you can set a per-job USD spend limit:
```yaml
openrouter_default_limit_usd: 2.00
```

The cap is enforced server-side by OpenRouter: once the per-job key’s limit is reached, OpenRouter stops authorizing requests. The job then fails with LLM errors because the agent can no longer call the model. Mate does not auto-cancel the job with a cap_hit reason, and the per-job key is not revoked immediately; keys are cleaned up after the job’s record expires. The limit applies to all jobs across your tenant; there is no per-agent spend cap configuration in v1.
In passthrough mode (no provisioning key), Mate does not enforce spend caps. Use your provider’s own rate limiting.
Queue backpressure
Mate queues incoming events up to a per-tenant limit. When the queue is full, new webhook deliveries are rejected with a 503 response. GitLab interprets repeated 503 responses as a degraded webhook and may temporarily disable it.
In practice, this limit is only reached under sustained high event volume with slow or stuck jobs. Resolving the underlying cause (jobs completing, cooldown configured, concurrency tuned) clears the backlog automatically.
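The backpressure behavior resembles a bounded queue, as in the sketch below. It is a simplified model: `TenantQueue`, `receive_webhook`, and `start_next` are hypothetical names, the queue size is an arbitrary illustrative value because the real per-tenant limit is not stated here, and the 200 status for accepted deliveries is an assumption.

```python
from collections import deque
from typing import Deque, Optional


class TenantQueue:
    """Bounded per-tenant pending queue (illustrative model only)."""

    def __init__(self, max_pending: int) -> None:
        self.max_pending = max_pending  # hypothetical cap; the real limit is not documented here
        self._pending: Deque[str] = deque()

    def receive_webhook(self, event_id: str) -> int:
        """Return the HTTP status the webhook sender would see."""
        if len(self._pending) >= self.max_pending:
            return 503  # queue full: the delivery is rejected
        self._pending.append(event_id)
        return 200  # assumed success status for an accepted delivery

    def start_next(self) -> Optional[str]:
        """Pop the oldest pending event when a job slot frees up."""
        return self._pending.popleft() if self._pending else None
```

Once a running job finishes and `start_next` drains an entry, the next delivery is accepted again, which is why resolving slow or stuck jobs clears the backlog.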
Container resources
Container resources are determined by your plan tier and are not configurable per-agent.
| Plan | vCPU | Memory |
|---|---|---|
| Starter | 1 | 2 GiB |
| Plus | 1 | 2 GiB |
| Enterprise | 2 | 4 GiB |
The same resource allocation applies to both the agent and the container image build step (when building a devcontainer). If your devcontainer builds are memory-heavy, consider the Enterprise tier.
On-prem deployments use the customer’s own hardware; the resource allocation above does not apply.