Job Lifecycle
   anycloud submit
          │
          ▼
   ┌──────────────┐
   │    queued    │  ⏳ Waiting for a provisioning slot or spend cap
   └──────┬───────┘
          │
          ▼
   ┌──────────────┐
   │ provisioning │  🏗️ Creating VMs
   └──────┬───────┘
          │
          ▼
   ┌──────────────┐
   │ initializing │  ⚙️ SSH + packages + setup
   └──────┬───────┘
          │
          ▼
   ┌──────────────┐
   │ downloading  │  📥 Docker image pull
   └──────┬───────┘
          │
          ▼
   ┌──────────────┐
   │   syncing    │  💾 Bucket download (if configured)
   └──────┬───────┘
          │
          ▼
   ┌──────────────┐
   │   starting   │  🚀 Container start + health check (+ git clone for @function)
   └──────┬───────┘
          │
          ▼
   ┌──────────────┐
   │   running    │  ⚡ Job is executing
   └──────┬───────┘
          │
      ┌───┴───┐
      │       │
      │    exit ≠ 0
      │       ▼
      │  ┌─────────┐
      │  │ errored │  🪲
      │  └─────────┘
      │
   exit = 0
      ▼
┌───────────┐
│ completed │  ✅
└───────────┘
For jobs using the @anycloud.function() decorator, the container's startup command clones your GitHub repo at the submitted commit before running your function. This happens during the starting phase — your Docker image only needs dependencies, not your code. See Deployment Workflows.
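The lifecycle diagram can also be read as a small state machine. The following is a minimal sketch of the happy path only, not anycloud's actual implementation; the `State` enum and `next_state` helper are hypothetical names introduced for illustration, and the failure, retry, and recovery states described later are left out.

```python
from enum import Enum


class State(Enum):
    # State names are taken from the lifecycle diagram.
    QUEUED = "queued"
    PROVISIONING = "provisioning"
    INITIALIZING = "initializing"
    DOWNLOADING = "downloading"
    SYNCING = "syncing"
    STARTING = "starting"
    RUNNING = "running"
    COMPLETED = "completed"
    ERRORED = "errored"


# Happy-path order, top to bottom in the diagram.
PIPELINE = [
    State.QUEUED,
    State.PROVISIONING,
    State.INITIALIZING,
    State.DOWNLOADING,
    State.SYNCING,
    State.STARTING,
    State.RUNNING,
]


def next_state(current, exit_code=None):
    """Return the state that follows `current` (hypothetical helper).

    `exit_code` is only consulted when leaving RUNNING.
    """
    if current is State.RUNNING:
        if exit_code is None:
            raise ValueError("exit_code is required to leave RUNNING")
        return State.COMPLETED if exit_code == 0 else State.ERRORED
    if current in (State.COMPLETED, State.ERRORED):
        raise ValueError(f"{current.value} is a terminal state")
    return PIPELINE[PIPELINE.index(current) + 1]
```

For example, `next_state(State.STARTING)` returns `State.RUNNING`, and `next_state(State.RUNNING, exit_code=1)` returns `State.ERRORED`.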
Provisioning Concurrency
anycloud limits the deploy pipeline to 50 deployments per local API instance that are setting up VMs at the same time. A deployment counts against this cap while it is in provisioning, initializing, downloading, syncing, or starting, that is, from dispatch until it reaches running.
Submit more than 50 at once and the excess sits in queued until a slot frees. Slots free as soon as a deployment reaches running (not when the job completes), so steady-state throughput is roughly 50 / pipeline_duration new VMs per minute, where pipeline_duration is the time in minutes a deployment takes to get from dispatch to running.
This cap throttles the deploy pipeline, not the running fleet. There is no upper bound on how many jobs can be in running simultaneously.
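As a rough worked example of the throughput estimate, the sketch below turns the 50-slot cap into numbers. The function names are hypothetical, and the model assumes every deployment spends the same time in the pipeline and that freed slots are refilled immediately, which real deployments will only approximate.

```python
import math

PIPELINE_SLOTS = 50  # cap per local API instance, from the text above


def max_dispatch_rate(pipeline_minutes, slots=PIPELINE_SLOTS):
    """Upper bound on deployments reaching `running` per minute."""
    if pipeline_minutes <= 0:
        raise ValueError("pipeline_minutes must be positive")
    return slots / pipeline_minutes


def minutes_until_all_running(jobs, pipeline_minutes, slots=PIPELINE_SLOTS):
    """Rough time for `jobs` simultaneous submissions to all reach `running`.

    Simplification: deployments move through the pipeline in full waves.
    """
    waves = math.ceil(jobs / slots)
    return waves * pipeline_minutes


# With a 5-minute pipeline, at most 10 deployments reach running per minute,
# and 200 simultaneous submissions need about 4 waves of 5 minutes each.
rate = max_dispatch_rate(5)
total = minutes_until_all_running(200, 5)
```

Under these assumptions, shortening the pipeline (for example through image caching) raises throughput more directly than anything else, since the slot count is fixed.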
A deployment can also sit in queued because a spend control — throttle or budget — is at its cap. The block reason shows up in anycloud status and anycloud list; the deployment auto-dispatches on the next scheduler tick once the cap clears.
Image Caching
The first time you deploy an image to a region, anycloud pulls it normally. After the pull completes, a snapshot of the VM disk is captured in the background — creating a pre-baked VM image with Docker layers already on disk.
On subsequent deploys of the same image, anycloud launches VMs from the pre-baked image. The download step completes instantly — no pull needed. This also saves on Docker Hub and GitHub Container Registry costs, since you only pull once per image per region regardless of how many times you deploy.
Images are keyed by Docker manifest digest, not tag. Pushing new content behind the same tag produces a different digest, so you always get the correct version. Old images for the same tag are automatically cleaned up.
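To illustrate why keying by digest rather than tag gives the correct version, here is a minimal sketch of a per-region snapshot lookup. `ImageCache` and its methods are hypothetical and do not reflect anycloud's internals; in particular, doing the stale-image cleanup per region is an assumption of this sketch.

```python
class ImageCache:
    """Hypothetical model of pre-baked VM image lookup."""

    def __init__(self):
        self._snapshots = {}   # (region, digest) -> snapshot id
        self._tag_digest = {}  # (region, tag) -> digest last seen for that tag

    def lookup(self, region, digest):
        """Return the snapshot id for this exact content, or None on a miss."""
        return self._snapshots.get((region, digest))

    def record(self, region, tag, digest, snapshot_id):
        """Store a new snapshot and drop the stale one for the same tag."""
        old = self._tag_digest.get((region, tag))
        if old is not None and old != digest:
            # Same tag now points at new content: clean up the old image.
            self._snapshots.pop((region, old), None)
        self._tag_digest[(region, tag)] = digest
        self._snapshots[(region, digest)] = snapshot_id


cache = ImageCache()
first = cache.lookup("us-east-1", "sha256:aaa")          # miss: normal pull
cache.record("us-east-1", "app:latest", "sha256:aaa", "snap-1")
second = cache.lookup("us-east-1", "sha256:aaa")         # hit: no pull needed
cache.record("us-east-1", "app:latest", "sha256:bbb", "snap-2")
stale = cache.lookup("us-east-1", "sha256:aaa")          # old content removed
```

The region names, tags, digests, and snapshot ids above are placeholder values.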
Image caching is fully automatic and requires no configuration.
Error States
Any step can fail along the way:
- ❌ Failed: infrastructure error (VM never started). Automatically retried up to 3 times via the 🛠️ Retrying state (see below).
- 🪲 Errored: your app exited non-zero.
- 🚫 Invalid: bad config (wrong bucket name, VM type, etc.). Never retried; fix and resubmit.
- 🪦 Terminated: user-initiated via anycloud terminate.
Auto-Retry
When an infrastructure error occurs during any setup step (provisioning through starting), anycloud automatically retries. The failed VM is cleaned up, and the job is re-queued for a fresh attempt:
(any setup step fails) → 🛠️ Retrying → ⏳ Queued → 🏗️ Provisioning → ... → ⚡ Running
After 3 failed attempts, the job is marked ❌ Failed.
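The retry flow can be sketched as a bounded loop. This is an illustrative model rather than anycloud's code: `InfrastructureError`, `run_setup_with_retry`, and the `setup`/`cleanup` callbacks are hypothetical names, and the sketch reads the limit as three attempts in total, following "After 3 failed attempts".

```python
MAX_ATTEMPTS = 3  # from "After 3 failed attempts, the job is marked Failed"


class InfrastructureError(Exception):
    """Stand-in for any infrastructure failure during a setup step."""


def run_setup_with_retry(setup, cleanup, max_attempts=MAX_ATTEMPTS):
    """Run the setup steps, retrying on infrastructure errors.

    `setup(attempt)` covers provisioning through starting and raises
    InfrastructureError on failure; `cleanup(attempt)` removes the failed VM.
    Returns (final_state, attempts_used).
    """
    for attempt in range(1, max_attempts + 1):
        try:
            setup(attempt)
            return ("running", attempt)
        except InfrastructureError:
            # Retrying: clean up the failed VM, then re-queue for a fresh try.
            cleanup(attempt)
    return ("failed", max_attempts)
```

Only infrastructure errors are caught here; a bad config or a non-zero exit from your app would follow the Invalid and Errored paths instead and is not retried.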
Quota Recovery
When the cloud denies capacity for the requested VM family (a quota or capacity error), the behavior depends on whether the region is pinned:
- Multi-region config (optimizer picks the region): anycloud blocks that (cloud, vmType, region) target for 30 minutes and keeps the deployment in 🛠️ Retrying. The optimizer skips the blocked target and tries a different region on the next attempt.
- Pinned region (region: set in config): the deployment enters 🚫 Invalid immediately, since retrying the same exhausted region won't help.
Regions where the quota is literally 0, or where the subscription can't deploy at all (AWS region not opted in; Azure sponsorship, student, or policy-restricted locations), are blocked for 6 hours instead of 30 minutes. These don't come back without a support ticket, so the optimizer keeps them out of rotation for longer.
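The blocking behavior for multi-region configs can be modeled as a blocklist with expiry times. The sketch below is an assumption-laden illustration: `TargetBlocklist`, its methods, and the `hard` flag are hypothetical names, and the VM type and regions in the usage lines are placeholders.

```python
from datetime import datetime, timedelta

SOFT_BLOCK = timedelta(minutes=30)  # ordinary quota or capacity denial
HARD_BLOCK = timedelta(hours=6)     # quota of 0, or region cannot deploy at all


class TargetBlocklist:
    """Hypothetical blocklist keyed by (cloud, vmType, region)."""

    def __init__(self):
        self._until = {}  # target -> datetime when the block expires

    def block(self, target, now, hard=False):
        self._until[target] = now + (HARD_BLOCK if hard else SOFT_BLOCK)

    def is_blocked(self, target, now):
        expiry = self._until.get(target)
        return expiry is not None and now < expiry

    def pick(self, candidates, now):
        """Return the first candidate that is not blocked, or None."""
        for target in candidates:
            if not self.is_blocked(target, now):
                return target
        return None


now = datetime(2024, 1, 1, 12, 0)
east = ("aws", "example.large", "us-east-1")
west = ("aws", "example.large", "us-west-2")

blocklist = TargetBlocklist()
blocklist.block(east, now)                    # capacity denied in us-east-1
choice = blocklist.pick([east, west], now)    # optimizer skips the blocked target
later = blocklist.pick([east, west], now + timedelta(minutes=31))
```

A pinned region has only one candidate, which is why that case goes straight to Invalid instead of waiting for the block to expire.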
To raise the limit yourself, use the CLI:
anycloud quota request <vmType> --cloud <aws|azure>
anycloud quota status --cloud <aws|azure>
Dedup is built in: re-running quota request against a region with an existing open case returns SKIPPED along with the prior case's URL. Once the case is approved (or partially approved), the next retry — or a fresh resubmit — picks up the new limit.
Credential Recovery
When the cloud rejects a credential mid-provisioning (expired token, revoked key, auth-class error), the behavior depends on whether credentialsName is a single value or a fallback list:
- Single credential: the deployment enters 🚫 Invalid immediately. Retrying with the same broken credential won't help.
- Fallback list (credentialsName: [aws-prod, gcp-prod]): anycloud blocks the failing credential for 30 minutes and rotates to the next entry in the list, keeping the deployment in 🛠️ Retrying. Only auth-class failures trigger credential rotation; quota and capacity errors continue to rotate at the (cloud, vmType, region) level.
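The two credential cases can be sketched as a single decision function. This is an illustration under assumptions, not anycloud's implementation: `on_auth_failure` is a hypothetical name, the 30-minute expiry of a block is omitted for brevity, and the text does not say what happens once every entry in the list is blocked, so the sketch simply raises at that point.

```python
def on_auth_failure(credentials_name, failing, blocked):
    """Decide what happens after an auth-class failure (hypothetical helper).

    `credentials_name` is either a single name or a fallback list, mirroring
    the credentialsName setting. `blocked` is a set of currently blocked names
    and is updated in place. Returns (state, next_credential).
    """
    if isinstance(credentials_name, str):
        # Single credential: nothing to rotate to.
        return ("invalid", None)

    blocked.add(failing)
    for name in credentials_name:
        if name not in blocked:
            return ("retrying", name)

    # Not covered by the documentation; raising is a placeholder choice.
    raise LookupError("every credential in the fallback list is blocked")


single = on_auth_failure("aws-prod", "aws-prod", set())
rotated = on_auth_failure(["aws-prod", "gcp-prod"], "aws-prod", set())
```

Here `single` is `("invalid", None)` and `rotated` is `("retrying", "gcp-prod")`.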
Spot Recovery
Spot instances can be preempted by the cloud provider at any time during initializing, downloading, syncing, starting, or running. anycloud detects preemption automatically and re-provisions from scratch:
(preempted) → 🔁 Recovering → ⏳ Queued → 🏗️ Provisioning → ... → ⚡ Running