Spend Controls

Spend controls let agents launch GPU work without giving them unlimited spend.

They do not kill running jobs. They only decide whether queued jobs are allowed to dispatch. If a cap is hit, new jobs stay Queued; running jobs continue.

Guardrail	Use it to	Measures	Blocks at	Clears
Throttle	Limit concurrent GPU jobs	estimated live `$/hr`	running + next VM >= cap	when VMs finish
Budget	Limit total spend	settled + estimated spend	period spend >= cap	at UTC reset

Use both for autonomous agents: throttle controls how many GPU jobs can run at the same time, and budget controls total exposure over a day, week, or month.

Quick Setup

For Claude Code, Codex, Cursor, Aider, and similar agent sessions, start with per-session caps:

anycloud throttle set 300 --agent-session
anycloud budget set 4000 --per day --agent-session
anycloud spend show

--agent-session gives each detected agent run its own cap. One session hitting its cap does not block another session.

Add account-wide caps when you also want a global ceiling across humans and agents:

anycloud throttle set 500
anycloud budget set 10000 --per month

Throttle: Limit Concurrent Jobs

Throttle is the live burn-rate guardrail. It keeps too many GPU jobs from running at the same time by comparing the estimated current $/hr against a cap.

Before a queued job dispatches, the scheduler checks:

running VM $/hr + next VM $/hr >= throttle cap

If that check is true, the job stays queued. When running VMs finish, live $/hr drops and queued jobs can dispatch on the next scheduler tick.
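The check above can be sketched in a few lines of Python. This is an illustration of the comparison only, not anycloud's scheduler code; the function name `throttle_blocks` and its parameters are hypothetical.

```python
from typing import Sequence


def throttle_blocks(running_rates: Sequence[float], next_rate: float, cap: float) -> bool:
    """Return True when the queued job must stay Queued.

    Hypothetical helper: mirrors `running VM $/hr + next VM $/hr >= throttle cap`.
    """
    live_rate = sum(running_rates)  # estimated $/hr of VMs already running
    return live_rate + next_rate >= cap


# Two running VMs at $6/hr each, against a $20/hr cap.
running = [6.0, 6.0]
print(throttle_blocks(running, 6.0, 20.0))  # False: 18 < 20, the job may dispatch
print(throttle_blocks(running, 8.0, 20.0))  # True: 20 >= 20, the job stays Queued
```

Because the comparison is `>=`, a candidate that would bring the rate exactly to the cap is blocked.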

anycloud throttle set 20                  # account-wide: $20/hr
anycloud throttle set 5 --agent-session   # each agent session: $5/hr
anycloud throttle show
anycloud throttle unset
anycloud throttle unset --agent-session

Two throttle caps can coexist: account-wide and per agent-session.

Budget: Limit Total Spend

Budget is the calendar-window guardrail. It caps total spend over a day, week, or month.

Budget compares window-to-date spend against the cap:

settled spend + estimated spend >= budget cap

Settled spend comes from cloud billing APIs. Estimated spend uses catalog $/hr for active or not-yet-settled VMs.
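A minimal Python sketch of the budget comparison follows. It assumes estimated spend is catalog $/hr multiplied by hours run, which is one plausible reading of the sentence above; the names `WindowSpend`, `estimate_unsettled`, and `budget_blocks` are hypothetical and not part of anycloud.

```python
from dataclasses import dataclass


@dataclass
class WindowSpend:
    settled: float    # dollars reported by cloud billing for this window
    estimated: float  # dollars estimated for VMs that have not settled yet

    @property
    def total(self) -> float:
        return self.settled + self.estimated


def estimate_unsettled(rate_per_hour: float, hours_run: float) -> float:
    # Assumption: the estimate is catalog $/hr times elapsed hours.
    return rate_per_hour * hours_run


def budget_blocks(spend: WindowSpend, cap: float) -> bool:
    """Mirror `settled spend + estimated spend >= budget cap`."""
    return spend.total >= cap


# $70 settled today, plus one $4/hr VM that has not settled, against a $100 daily cap.
after_5h = WindowSpend(settled=70.0, estimated=estimate_unsettled(4.0, 5.0))
after_7h30 = WindowSpend(settled=70.0, estimated=estimate_unsettled(4.0, 7.5))
print(budget_blocks(after_5h, 100.0))    # False: 70 + 20 = 90
print(budget_blocks(after_7h30, 100.0))  # True: 70 + 30 = 100
```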

anycloud budget set 100 --per day
anycloud budget set 500 --per week
anycloud budget set 2000 --per month
anycloud budget set 50 --per day --agent-session
anycloud budget show
anycloud budget unset --per day

Windows reset at UTC boundaries:

day - 00:00 UTC
week - Monday 00:00 UTC
month - 1st of month, 00:00 UTC
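To make the boundaries concrete, here is a small Python sketch that computes the next reset for each window. The function `next_reset` is a hypothetical helper written for this page, not an anycloud API, and it expects a timezone-aware datetime.

```python
from datetime import datetime, timedelta, timezone


def next_reset(now: datetime, window: str) -> datetime:
    """Return the next UTC reset for 'day', 'week', or 'month'."""
    now = now.astimezone(timezone.utc)
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    if window == "day":
        return midnight + timedelta(days=1)
    if window == "week":
        # weekday() is 0 on Monday, so this always lands on the following Monday.
        return midnight + timedelta(days=7 - now.weekday())
    if window == "month":
        if now.month == 12:
            return midnight.replace(year=now.year + 1, month=1, day=1)
        return midnight.replace(month=now.month + 1, day=1)
    raise ValueError(f"unknown window: {window}")


now = datetime(2024, 1, 31, 15, 30, tzinfo=timezone.utc)  # a Wednesday
for window in ("day", "week", "month"):
    print(window, next_reset(now, window).strftime("%Y-%m-%d %H:%M UTC"))
# day 2024-02-01 00:00 UTC
# week 2024-02-05 00:00 UTC
# month 2024-02-01 00:00 UTC
```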

Up to six budgets can coexist: three windows times two scopes. When several budgets are active, a job must pass all of them, so the most restrictive one binds.

What Happens When A Cap Is Hit

anycloud submit still accepts the job and returns a deployment id. Spend controls are enforced later, when the scheduler tries to move the job from Queued to provisioning.

A blocked deployment stays Queued with a reason visible in anycloud status and anycloud list:

Throttle: blocked by throttle: rate $X/hr >= cap $Y/hr
Budget: blocked by daily|weekly|monthly budget: spent $X / $Y - resumes YYYY-MM-DD HH:MM UTC

Budget messages name the next UTC reset. Throttle has no fixed reset time because it clears when running VMs finish.

If Slack notifications are enabled, anycloud posts a "deployments waiting on spend cap" alert the first time a cap starts blocking, then at most once every 6 hours while it stays blocked. The alert is one message per cap, not one message per queued job.
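The alert cadence can be pictured as a per-cap rate limiter. The Python sketch below is illustrative only: the class `CapAlerter`, its methods, and the cap key strings are hypothetical, and resetting the timer when a cap clears is an assumption rather than documented behavior.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict

REPEAT_INTERVAL = timedelta(hours=6)


class CapAlerter:
    """Tracks the last alert time per cap, not per queued job."""

    def __init__(self) -> None:
        self._last_sent: Dict[str, datetime] = {}

    def should_alert(self, cap_key: str, now: datetime) -> bool:
        last = self._last_sent.get(cap_key)
        if last is not None and now - last < REPEAT_INTERVAL:
            return False  # already alerted for this cap within the last 6 hours
        self._last_sent[cap_key] = now
        return True

    def cleared(self, cap_key: str) -> None:
        # Assumption: once a cap stops blocking, the next block alerts immediately.
        self._last_sent.pop(cap_key, None)


alerter = CapAlerter()
t0 = datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc)
print(alerter.should_alert("account/throttle", t0))                       # True
print(alerter.should_alert("account/throttle", t0 + timedelta(hours=1)))  # False
print(alerter.should_alert("account/throttle", t0 + timedelta(hours=6)))  # True
```

Keying on the cap means ten jobs queued behind the same cap still produce one message.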

Scopes

account - counts every deployment, including interactive anycloud submit from a human shell.
agent-session - counts only deployments from one detected agent run. Each session has its own counter.

Interactive submits have no session, so they are not bound by agent-session caps. Pair per-session caps with account-wide caps when you want to bound humans too.
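As a rough illustration of how the two scopes count deployments, the Python sketch below totals live $/hr per scope. The tuple layout, the scope key format, and the function `rate_by_scope` are all hypothetical and chosen only for this example.

```python
from typing import Dict, List, Optional, Tuple

# (deployment id, agent session id or None for an interactive submit, $/hr)
Deployment = Tuple[str, Optional[str], float]


def rate_by_scope(deployments: List[Deployment]) -> Dict[str, float]:
    totals: Dict[str, float] = {"account": 0.0}
    for _dep_id, session, rate in deployments:
        totals["account"] += rate  # account scope counts every deployment
        if session is not None:
            key = f"agent-session:{session}"
            totals[key] = totals.get(key, 0.0) + rate  # one counter per session
    return totals


deployments = [
    ("d1", "s1", 3.0),   # agent session s1
    ("d2", "s2", 2.0),   # agent session s2
    ("d3", None, 4.0),   # human shell: no session
]
print(rate_by_scope(deployments))
# {'account': 9.0, 'agent-session:s1': 3.0, 'agent-session:s2': 2.0}
```

The interactive deployment `d3` appears only in the account total, which is why an account-wide cap is needed to bound it.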

Auto-detection of agent sessions lives in the CLI. See Agents.

Inspect Headroom

Use anycloud spend show to see throttle and budget headroom in one view:

anycloud spend show
anycloud spend show --json

Throttle rows show the live estimated $/hr and the remaining hourly headroom. Budget rows show total spend, remaining budget, and, when estimates are present, a settled-versus-estimated breakdown.

Details And Edge Cases

Throttle pre-charges the candidate. The next VM's estimated $/hr is added before the scheduler decides whether to dispatch it. This prevents a burst of queued jobs from all dispatching against the same pre-burst rate.
Throttle is estimated. It uses catalog hourly prices for running VMs and the candidate. If the scheduler cannot price the candidate, throttle falls back to the current live rate.
Budget is reactive. It does not pre-charge the candidate. Calendar windows are large enough that a tick's overshoot is usually negligible.
Running jobs continue. Both caps only gate the queued-to-provisioning transition.
Polling, not events. The scheduler rechecks queued jobs every few seconds and dispatches automatically once the blocking cap clears.
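The pre-charge and fallback rules above can be illustrated by simulating one scheduler tick over a burst of queued jobs. This Python sketch is not the real scheduler: `dispatch_tick` is a hypothetical function, `None` stands in for a candidate the scheduler cannot price, and continuing past a blocked job to try later ones is an assumption.

```python
from typing import List, Optional, Tuple


def dispatch_tick(
    live_rate: float, queued: List[Optional[float]], cap: float
) -> Tuple[List[int], float]:
    """Walk the queue once; return (indexes dispatched, live rate afterwards)."""
    dispatched: List[int] = []
    for i, candidate_rate in enumerate(queued):
        # Unpriced candidate: fall back to checking the current live rate only.
        extra = 0.0 if candidate_rate is None else candidate_rate
        projected = live_rate + extra  # pre-charge the candidate before deciding
        if projected >= cap:
            continue  # stays Queued; rechecked on a later tick
        dispatched.append(i)
        live_rate = projected  # later candidates see the updated rate
    return dispatched, live_rate


# $10/hr already running, four $4/hr jobs queued, $20/hr cap.
print(dispatch_tick(10.0, [4.0, 4.0, 4.0, 4.0], 20.0))  # ([0, 1], 18.0)
```

Without the pre-charge, all four jobs would be compared against the same $10/hr and the burst would reach $26/hr.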

What's Not Included Yet

Threshold alerts at 50%, 75%, 90%, or 100%.
Forecasting.
Webhooks on threshold.
Pause or kill of running jobs.
Scopes beyond agent-session such as per-credential, per-region, or per-agent-name.
Per-job duration estimates or a required --max-duration.
