Changelog

0.1.50

GPU jobs retry a not-ready host instead of failing

When a GPU machine's NVLink fabric hasn't finished initializing, a job can crash at startup with CUDA error 802 ("system not yet initialized") before your code runs. anycloud now treats this as a problem with the machine rather than a fault in your container, and retries the job on a fresh machine — which almost always clears it. It most often shows up on multi-GPU machines such as H100 pairs.
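The host-versus-container distinction described above can be sketched roughly as follows. This is only an illustration: the function names, the marker strings, and the returned action labels are hypothetical and are not anycloud's internals.

```python
# Hypothetical sketch: marker strings and action names are illustrative only.
HOST_NOT_READY_MARKERS = ("cuda error 802", "system not yet initialized")


def classify_startup_failure(log_text: str) -> str:
    """Return 'host' when the log points at an uninitialized GPU host, else 'container'."""
    lowered = log_text.lower()
    if any(marker in lowered for marker in HOST_NOT_READY_MARKERS):
        return "host"
    return "container"


def next_action(log_text: str) -> str:
    """Retry on a fresh machine for host faults; surface container faults to the user."""
    if classify_startup_failure(log_text) == "host":
        return "retry_on_fresh_machine"
    return "report_failure"
```

The point of the split is that a host fault says nothing about the user's image, so retrying elsewhere is safe, while a container fault would fail the same way on any machine.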

Lambda deployments reach newer regions and fall back across instance types

Lambda GPU deployments can now use regions that Lambda offers but that weren't yet in anycloud's catalog (for example us-southeast-1). anycloud reads Lambda's own catalog directly instead of a slower third-party snapshot, so newly available regions show up within hours. A GPU request such as H100:1 also falls back across equivalent instance types automatically, so a deployment lands on whichever is available instead of failing when the cheaper one is unavailable.

Lazy image loading now uses eStargz

Large-image lazy loading now uses eStargz instead of SOCI v2. Publish your image in eStargz format — one docker buildx flag, or a nerdctl image convert, with no separate index artifact to manage — and anycloud starts it lazily where the runtime supports it, running the container while its layers are fetched on demand. anycloud status still marks the running event [lazy-loaded] or [full pull]; small or ordinary images pull as before.

0.1.49

Large images start sooner with SOCI v2

Large GPU and machine-learning images can spend most of their launch time just downloading. If your image is pushed to GHCR in SOCI v2 format, anycloud can start it lazily where the runtime supports it: the container runs while its layers are fetched on demand, so it doesn't wait for the whole image to download first. anycloud status marks the running event [lazy-loaded] when this happens, or [full pull] otherwise. Small or ordinary images pull as before.

Azure bucket sync data now avoids temporary VM storage

Azure VM-host deployments now place host-side input, output, and checkpoint bucket directories under /var/lib/anycloud/deployments on the managed OS disk instead of Azure's temporary /mnt resource disk, so bucket data survives VM deallocation during in-place spot recovery. Other providers keep the existing /mnt layout. The host paths are stored per VM so already-running deployments keep their original bucket directories after an API upgrade.

Region-pinned deployments retry capacity sooner

When a cloud reports insufficient capacity for an explicitly pinned region, anycloud schedules another launch attempt after about one minute instead of waiting behind the shared ten-minute capacity observation. Unpinned deployments keep the longer observation window so they can fail over without repeatedly probing an exhausted target, and regional and account quota protections remain unchanged. While waiting, anycloud status confirms that an automatic capacity retry is scheduled.

Lambda launch retries clean up duplicate instances

Lambda deployments now tag launched instances and sweep stale same-deployment instances around launch retries. This prevents duplicate or orphaned Lambda instances when the provider accepts a launch request but the client only sees a transient failure.

Workload image baking has been removed

Jobs and servers now always boot from a provider base VM and pull the resolved Docker image. The CLI and Python SDK no longer expose workload-image baking, cache reuse, or image warm-up controls, and status no longer includes a baking lifecycle phase. The entire anycloud baked command group, including its list, copy, and prune operations, has also been removed. Earlier changelog entries below describe the behavior of those historical releases.

Existing baked workload AMIs and their snapshots are not automatically discovered or deleted by this release. They remain in the owning cloud account until an account owner removes them with provider-native tooling; generic AnyCloud base/runtime images must be retained.

0.1.48

Preempted spot deployments no longer stay falsely `running`

If the API restarted while a cloud provider was bringing a preempted VM back online, the deployment could remain recorded as running even though its only VM was already recorded as preempted. Those stale deployments then appeared in anycloud list and anycloud terminate indefinitely. Recovery is now committed before waiting on the cloud restart, interrupted attempts resume through the normal recovery queue, and upgraded API servers automatically repair existing running deployments whose live VM records are all preempted.

Failing bucket syncs back off between retries

When a continuous output or checkpoint sync failed (expired storage token, revoked bucket access, network trouble), it retried immediately in a tight loop instead of waiting for the normal sync interval — hammering the storage API and burning VM CPU next to the workload until the problem cleared. Failed attempts now wait the same interval as successful ones. Applies to deployments launched after this release; already-running VMs keep the sync script they launched with.
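A minimal sketch of the fixed retry behavior, assuming a loop that calls a sync function and sleeps between attempts. The real sync script is not shown in this changelog, so `run_sync_loop`, `sync_once`, and the injectable `sleep` are hypothetical names used only for illustration.

```python
import time
from typing import Callable, List


def run_sync_loop(
    sync_once: Callable[[], bool],
    interval_seconds: float,
    max_attempts: int,
    sleep: Callable[[float], None] = time.sleep,
) -> List[bool]:
    """Run sync_once repeatedly, pausing the same interval after every attempt."""
    results: List[bool] = []
    for attempt in range(max_attempts):
        try:
            ok = bool(sync_once())
        except Exception:
            ok = False  # e.g. expired token, revoked access, network trouble
        results.append(ok)
        if attempt < max_attempts - 1:
            # Key point: a failure waits exactly as long as a success,
            # so a broken bucket cannot cause a tight retry loop.
            sleep(interval_seconds)
    return results
```

Passing `sleep` as a parameter keeps the loop testable without real delays.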

Bucket sync health now comes from rclone's own errors

anycloud status derived bucket-sync health by keyword-matching service logs, so a harmless systemd warning could make a healthy output sync render as failed — both live and in a past deployment's event history. Sync health is now read from the rclone log itself and only rclone's explicit errors count as sync failures; the harmless warning is also stripped when showing older deployments' history.
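To illustrate the difference between keyword-matching and reading rclone's own error level, here is a rough sketch. It assumes rclone's default log line shape (a timestamp followed by a level such as `ERROR :`); the function name and the sample lines are hypothetical, and anycloud's actual parser may differ.

```python
import re
from typing import Iterable, List, Tuple

# Assumed rclone default log prefix: "YYYY/MM/DD HH:MM:SS ERROR : ..."
RCLONE_ERROR = re.compile(r"^\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2} ERROR\s*:")


def sync_health(log_lines: Iterable[str]) -> Tuple[str, List[str]]:
    """Count only explicit rclone ERROR lines as sync failures."""
    errors = [line for line in log_lines if RCLONE_ERROR.match(line)]
    if errors:
        return "failed", errors
    return "healthy", []
```

A line from another source that merely contains the word "Failed" does not match the pattern, so it no longer flips a healthy sync to failed.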

Clearer terminal status for jobs with output buckets

Terminal events now show the final output sync result separately from the workload's own exit code (e.g. errored [exit: 137] [final sync: succeeded, attempt: 1]), a stopped sync service now shows as stopped instead of disappearing from status entirely, and sync diagnostics are no longer duplicated into termination events.

0.1.47

`anycloud update` actually restarts the local API server again

Since 0.1.46, anycloud update silently failed to restart a running local API container onto the new version: it printed the target version and exited as if it had succeeded, but the container was left on its old image. anycloud update now restarts it correctly. If a local API server has been stuck on an old version, run anycloud update again after upgrading.

Cross-cloud input buckets no longer get stuck in syncing

Deployments with a cross-cloud input bucket (e.g. compute on one provider, an S3 input bucket on another) could get stuck retrying in syncing indefinitely. The pre-sync connectivity check now verifies read access for input buckets instead of attempting a write, matching the read-only credentials those buckets are already granted.

0.1.46

Bake-hit boots wait for the warm-up, and it targets the right files

On a cache-hit boot from a baked image, the container now waits (up to 60 seconds) for the boot-time page-cache warm-up to finish before starting, so the workload's first reads land warm instead of racing an in-progress warm-up. The warm-up itself also skips large files a running container never actually reads — copies and caches left over from pulling and building the image, along with libraries the image ships but doesn't use at runtime — so its limited time budget goes to the files that matter. Re-bake existing images to pick up both changes.

0.1.45

Faster initialization on baked-image boots

VM initialization now confirms Docker readiness through the init system's native ready signal instead of repeatedly querying the daemon, and reaches new VMs with a connection-level probe before attempting SSH logins. The boot-time warm-up also reads the container runtime's bookkeeping first. Together these decouple initialization from the background warm-up: multi-minute initialization on large baked images drops to under a minute. The readiness and connection improvements apply to all deployments immediately; re-bake to pick up the warm-up ordering change.

0.1.44

Boot-stable warm-up for large baked images

The boot-time warm-up introduced in 0.1.42 now reads boot-critical files first and large image data last, reducing contention with VM startup. For large images, pair --bake with --disk-tier high: baked-image boots are fastest there and model loads start warm. On the default disk tier, booting very large baked images can still delay or retry provisioning — prefer a normal pull (--no-bake-cache) or the high tier for those. Re-bake existing images to pick up the new warm-up ordering.

Target selection

When the interactive submit and serve VM picker can't fetch quota limits in time, it now says so and shows the full catalog, instead of silently dropping the quota-aware ranking. Quota lookups are also cached longer, so repeat picks and anycloud quota status checks return faster.

0.1.43

Baked-image warm-up now covers GPU base images

The boot-time warm-up introduced in 0.1.42 read only Docker's classic storage directory. GPU base images store cached layers through Docker's containerd image store, which keeps its data in a separate location — so GPU-family bake hits warmed almost nothing and still started cold. The warm-up now reads both storage locations. Re-bake existing images to pick up the fix; the bake e2e suite now also verifies that the warm-up did proportional work on a live bake-hit VM.

0.1.42

Faster container starts on baked-image cache hits

VMs booted from a baked image now read the cached Docker image into memory in the background at boot, restoring the warm state a fresh Docker pull leaves behind. Large images (model weights, CUDA libraries) previously paid cold-disk reads on first access, which could make a cache-hit start slower than a fresh pull. Applies to images baked from this version onward; older baked images start cold until re-baked.

AWS disk tiers

anycloud submit, anycloud serve, cloud config, and the Python SDK now support an AWS diskTier / --disk-tier option for choosing gp3 root-volume performance independently from root disk capacity. The initial tiers are medium, high, and ultra.

On baked-image cache hits, high and ultra also request the fastest supported EBS volume initialization rate for the root volume. Workloads that still start faster with a normal Docker pull should use --no-bake-cache; disk tiers are a root-disk performance knob, not a guarantee that prebaked AMIs beat registry pulls.

gp3 root volumes by default

AWS deployments now always launch with a gp3 root volume. Previously, deployments that did not set a disk size or disk tier could fall back to the base image's gp2 default, which is slower — especially at small disk sizes — and 25% more expensive per GB.

Baked-image cache opt-out

anycloud submit, anycloud serve, anycloud serve upgrade, and the Python SDK now support a no-bake-cache option for runs that should force a normal base VM boot and Docker pull even when a matching baked image exists. It can be combined with legacy --bake to pull normally and refresh the baked image after the pull.

Release hardening

Unit test runtime was reduced for status and Azure gateway coverage, keeping the release validation path faster without reducing behavioral checks.

0.1.40

AWS baked-image startup reliability

AWS bake-hit launches now request a bounded EBS volume initialization rate for the root volume, reducing unpredictable snapshot lazy-load latency when booting from prebaked AMIs. If AWS rejects that optimization because of regional quota or capacity pressure, provisioning retries without it instead of failing the deployment.

The AWS serve-bake integration test now verifies that bake-hit VMs launch from the baked image and request the bounded initialization rate while the instance is still live.

Local Kubernetes release hardening

The local Kubernetes provider contract now uses a Never pull policy for the locally built SSH sidecar image after importing it into k3d, preventing CI from falling back to Docker Hub for a test-only image tag.

0.1.39

Bake reliability

Explicit bake now flushes Docker filesystem state before snapshotting and records whether bake failures happened during flush or image creation. Failed bakes no longer fall through into workload execution, and same-tag baked images with different digests are kept until an explicit prune removes them.

Quota and status

anycloud quota status now prompts for spot or on-demand quota in interactive mode and supports explicit --spot / --on-demand filters while keeping JSON output script-friendly. Quota integration coverage now checks AWS and Azure provider parity against cloud API data.

Status telemetry is now throttled per user and deployment so watch and wait polling loops do not flood analytics.

Updates and release hardening

anycloud upgrade is now an alias for anycloud update, with clearer output when the CLI is current and there is no local API server to restart. Homebrew-managed installs now update through Homebrew from anycloud update; when Homebrew is already on its latest formula, the command still restarts a stale local API server to match the installed CLI.

Nightly bake e2e jobs have longer timeout headroom and stricter phase checks, so release-adjacent bake regressions surface before publishing.

0.1.38

Quota and status internals

Quota resource resolution now uses shared AWS and Azure resolvers, improving quota lookup consistency across provider gateways, API handlers, and the CLI.

Internal provider status domains now use enums across API and CLI paths, keeping status handling consistent as quota and deployment flows evolve.

0.1.37

Local Docker-control

AnyCloud can now run Local workloads through a Docker-control container on the API host, using the same workload path as VM deployments.

anycloud logs works for Local deployments, streaming through the API without exposing SSH material. Local missing-GPU failures now stop instead of retrying the same fixed host.

Serve upgrades

anycloud serve upgrade <id> <image> [command...] replaces a running serve deployment while preserving its existing https://<id>.anycloud.sh URL.

Target selection

The interactive submit and serve VM picker now shows quota-aware ranking and spot pricing while keeping region selection unpinned.

Image inspection now filters CUDA/ROCm-incompatible GPU fallback targets and rejects images whose compressed layers cannot fit on the selected disk.

Baked images and registries

Private GHCR manifest inspection now reports real auth failures for --bake instead of falling back to misleading digest errors.

anycloud baked list has shorter age formatting, more stable picker columns, and safer regional caching that avoids reusing partial scans.

Local API reliability

The CLI now discovers local API containers on non-default host ports. anycloud update checks /v1/version, verifies the target API image before stopping the old container, and can recover a failed restart on the next update.

Operational hardening

Catalog refresh falls back to usable cache when available. Lambda serve rotates away from regions without required firewall rulesets. Container startup health now catches unhealthy, restarting, and missing-GPU states before publishing endpoints.

0.1.36

Choose which API server the CLI talks to

The CLI now remembers a preferred API server, so you don't need to set API_URL on every command.

anycloud api use <url> saves a hosted API as the active server (stored in ~/.anycloud/api-url). Pass a deployment ID to pick one of your hosted APIs, local to return to http://localhost:8080, or run it with no argument for an interactive picker of healthy, compatible servers.
anycloud api list shows your local and hosted API servers with their health, version, and which one is active. --json emits the same data for scripting.
anycloud api info shows the active server, where the setting came from, and whether its version matches your CLI.
anycloud api serve --use makes a newly created hosted API active once it's healthy.

API_URL still works as a one-command override and takes precedence over the saved setting.
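The precedence between the API_URL override and the saved setting can be sketched like this. The function names are hypothetical, and two details are assumptions rather than documented behavior: that ~/.anycloud/api-url holds the URL as plain text, and that the CLI falls back to http://localhost:8080 when nothing is saved.

```python
from pathlib import Path
from typing import Mapping, Optional, Tuple

DEFAULT_API_URL = "http://localhost:8080"  # assumed fallback


def read_saved_url(path: Path) -> Optional[str]:
    """Read the saved URL, assuming the file stores it as plain text."""
    try:
        text = path.read_text(encoding="utf-8").strip()
    except OSError:
        return None
    return text or None


def resolve_api_url(env: Mapping[str, str], saved_url: Optional[str]) -> Tuple[str, str]:
    """Return (url, source); the environment override wins over the saved value."""
    if env.get("API_URL"):
        return env["API_URL"], "API_URL"
    if saved_url:
        return saved_url, "saved"
    return DEFAULT_API_URL, "default"
```

A caller would combine the two, for example `resolve_api_url(os.environ, read_saved_url(Path.home() / ".anycloud" / "api-url"))`. Returning the source alongside the URL mirrors the kind of "where the setting came from" detail that anycloud api info reports.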

CLI and API versions must match

The CLI and the API server it talks to must now be the same version. If they differ, commands fail fast with a clear message — run anycloud update for a local API, or anycloud api list then anycloud api use <target> to switch to a matching server.

`anycloud logs` for VM-host deployments

anycloud logs [id] prints workload container stdout/stderr logs for VM-host deployments, and anycloud logs [id] --watch streams them. Provider-run-image deployments such as Vast are reported as unsupported for now.

`anycloud list` distinguishes jobs and servers

anycloud list now includes a TYPE column and a type-aware summary, so mixed job/server history is easier to scan. Use anycloud list --type job or anycloud list --type server to filter. --json includes deploymentType, and --csv keeps the existing columns in order while appending deploymentType as the final column.

Credential prompts are more consistent

Commands that need saved credentials now share the same resolver behavior: one eligible credential is selected automatically, multiple eligible credentials open the same autocomplete picker on a TTY, and non-interactive runs fail with a message naming the flag to pass.
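The three resolver outcomes above can be summarized in a short sketch. All names here are hypothetical, the `pick` callback stands in for the autocomplete picker, and the flag name is a parameter because it varies by command.

```python
from typing import Callable, Sequence


class CredentialSelectionError(Exception):
    """Raised when no credential can be chosen without user input."""


def resolve_credential(
    eligible: Sequence[str],
    is_tty: bool,
    pick: Callable[[Sequence[str]], str],
    flag: str = "--credential",
) -> str:
    if not eligible:
        raise CredentialSelectionError("no eligible saved credentials")
    if len(eligible) == 1:
        return eligible[0]  # a single match is selected automatically
    if is_tty:
        return pick(eligible)  # interactive: open the picker
    # Non-interactive with several matches: fail and name the flag to pass.
    raise CredentialSelectionError(f"multiple credentials match; pass {flag} <name>")
```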

Bare interactive anycloud submit, anycloud serve, and anycloud api serve now prompt for saved credentials before opening the compute target picker, then pin the selected credential on the deployment. Bucket commands can also omit --credentials in interactive flows and still send the resolved credential name to the bucket API.

anycloud baked list uses the same credential selection behavior. Its default table now includes a Cloud column, while both --by-region and --json remain available.

0.1.35

`anycloud baked` manages baked VM images

A new anycloud baked command group manages the prebaked VM images that --bake creates:

anycloud baked list shows the baked images discovered in a selected named credential/account, grouped by the container image they were baked from.
anycloud baked copy (AWS) copies a baked image to other regions so deployments there boot warm without re-baking. Copies are idempotent, and --wait blocks until each copy is ready.
anycloud baked prune removes baked images you no longer need from a selected named credential/account, with --unused-for to keep recently-used ones and --region, --image, and --digest filters to scope the sweep. This replaces anycloud images prune.

`anycloud list` filters by credential

anycloud list --provider <cloud> has been removed. Use anycloud list --credential <name> to narrow deployment history to a specific saved credential. The filter matches both deployments pinned to that credential and unpinned deployments whose VMs were later provisioned with it.

Quota commands require an explicit credential

anycloud quota request and anycloud quota status now operate against one named credential at a time. Interactive runs still pick or auto-select a matching credential, but scripted runs should pass --credential <name> so quota reads and requests cannot accidentally scan the wrong cloud account.

`anycloud status --verbose` shows VM diagnostics

anycloud status --verbose now includes container exit details, container logs, and bucket-sync diagnostics when a VM health check finds a problem. The default status view stays concise and points you to --verbose when logs are available. Read-only status checks also no longer append noisy same-state deployment events or API log errors just because a user inspected a deployment.

Clearer GPU runtime handling

AnyCloud now resolves the runtime a GPU VM needs as CUDA, ROCm, or unsupported. NVIDIA CUDA VM-host targets still default Docker GPU access to --gpus all, but AMD/ROCm and other unsupported accelerator targets now fail clearly instead of silently booting a CUDA base image. The GPU docs were updated to match the automatic GPU access behavior.

Hosted API owner checks run before lookups

Hosted API requests now enforce the owner policy before named credential, secret, catalog, or image lookups. Denied callers receive the authorization failure without being able to probe saved server state first.

0.1.34

Local API Docker access is opt-in

anycloud api start now binds the API port to 127.0.0.1, and it mounts the host Docker socket only when explicitly started with --enable-local-docker. To run anycloud submit --local, restart the local API with anycloud api start --enable-local-docker; that opt-in is preserved by anycloud update.

`anycloud ssh` restored

anycloud ssh <id> is available again for VM-host jobs and servers. It opens an interactive shell in the workload container by default, with --vm for host VM diagnostics and --snapshot for exited-container inspection. SSH keys are not returned from status; the CLI requests explicit, audited SSH access from the API when the command runs.

Configurable `anycloud serve` listen port

Server deployments still default to PORT=8088, but you can now override it with --env PORT=<port> (CLI) or env={"PORT": "<port>"} (Python SDK). The public URL and HTTPS routing are unchanged.
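On the workload side, a server can honor this setting by reading PORT at startup. The sketch below assumes the container is expected to listen on the port named by PORT; the handler and function names are illustrative only.

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Mapping

DEFAULT_PORT = 8088  # the default named in this release note


def listen_port(env: Mapping[str, str] = os.environ) -> int:
    """Use PORT when set, otherwise fall back to the default."""
    return int(env.get("PORT", DEFAULT_PORT))


class Hello(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b"ok\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve() -> None:
    """Start the server; call this from the container's entrypoint."""
    HTTPServer(("0.0.0.0", listen_port()), Hello).serve_forever()
```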

`anycloud serve` always runs servers on-demand

Server deployments (anycloud serve and anycloud api serve) no longer ask you to choose between on-demand and spot pricing when you pick a spot-capable VM type — they always run on-demand, so picking a target is one fewer prompt. Passing --spot to a server deployment is now rejected with a clear error instead of being applied.

Interactive target picker shows disk sizes

When you pick a compute target interactively for anycloud submit or anycloud serve, each option now lists the instance's local disk size alongside its vCPU, memory, and price. The picker aligns every option into stable columns, so machine type, compute, CPU, memory, disk, and rate line up down the list and are easier to scan and compare.

`anycloud status` shows spot and disk size

anycloud status now lists the spot setting and disk size as their own rows in a deployment's summary, so you can confirm both choices without digging through your deploy configuration. Previously a spot deployment was only hinted at by a small marker next to the VM type, and the disk size was not shown at all.

`anycloud status` shows why an image pull failed

When a VM-side Docker image pull fails, anycloud status now surfaces the underlying Docker daemon output — for example manifest unknown, or an authentication or disk-space error — in a "Docker output" block beneath the concise failure line, instead of a bare "Failed to pull image". This applies to both remote-VM and direct pulls, and to failures recorded earlier, so you can tell why a pull failed without digging into the VM.

`--bake` accepts single-arch and digest-pinned amd64 images

anycloud submit --bake and anycloud serve --bake now work with more kinds of images. Previously --bake could only resolve a linux/amd64 digest from a multi-arch image; it now also handles plain single-manifest linux/amd64 images and images referenced by digest (image@sha256:...), inspecting the image's config to confirm it really targets linux/amd64 before baking. When an image genuinely has no linux/amd64 build, the rejection message is now clearer about what to do.

`anycloud exec` captures complete output more reliably

anycloud exec no longer drops trailing standard error: it waits for all buffered error output to flush before finishing, so the last lines of a failing command's diagnostics are reported instead of being cut off, and a slow output consumer no longer leaves the stream hanging. The Python SDK's Job.exec() now keeps streamed output as raw bytes and decodes it once at the end, so multibyte characters (emoji, accented or non-Latin text) that land on a chunk boundary are no longer corrupted.
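The chunk-boundary problem is easy to reproduce in isolation. The sketch below is not the SDK's code; it only contrasts decoding each chunk separately with joining the raw bytes and decoding once, using hypothetical helper names.

```python
from typing import Iterable


def decode_per_chunk(chunks: Iterable[bytes]) -> str:
    """Old-style behavior: decode every chunk on its own."""
    return "".join(chunk.decode("utf-8", errors="replace") for chunk in chunks)


def decode_once(chunks: Iterable[bytes]) -> str:
    """Keep raw bytes, then decode a single time at the end."""
    return b"".join(chunks).decode("utf-8", errors="replace")


payload = "café ✓".encode("utf-8")
# Split inside the two-byte encoding of "é" to mimic an unlucky chunk boundary.
chunks = [payload[:4], payload[4:]]
```

Here `decode_once(chunks)` recovers the original text, while `decode_per_chunk(chunks)` yields replacement characters where the split character used to be.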

Faster failures and cleaner failover for GPU and Lambda serve

A GPU deployment whose container image needs a newer driver than the host provides now fails fast with the underlying driver-mismatch message, instead of silently retrying the same doomed target until the launch budget runs out. On Lambda, hitting your account instance limit is now reported as a quota issue (so the CLI points you at anycloud quota) rather than rotating through regions that can't help, and tearing down a Lambda server that failed over across regions now removes the firewall rules it created in every region, not just the last one.

0.1.33

`anycloud serve` on Lambda

Server deployments can now run on Lambda, alongside AWS, GCP, and Azure, with the same public https://<id>.anycloud.sh URL and HTTPS routing as the other providers.

`anycloud serve --bake`

Server deployments can now opt into baked VM images. Serve uses the same server-resolved image digest as jobs, so a baked image created by a job can be reused by a server, and a baked image created by a server can be reused by a job when the image digest, cloud account, region, and base image family match.

Hosted API restricted to its owner

A hosted AnyCloud API (anycloud api serve) now serves only the account that deployed it and rejects requests from anyone else, instead of being reachable by any caller who has the URL.

0.1.32

`anycloud api serve` deploys a hosted AnyCloud API

anycloud api serve runs the AnyCloud API as a long-running server on your own cloud compute instead of locally, printing the public URL and API_URL once it's healthy (pass --no-wait to skip the wait). Separately, anycloud api start now refuses to start when a database exists without its key, instead of generating a mismatched one.

Python SDK `get_or_submit()` for restart-safe workflows

With get_or_submit(), a restarted workflow step reattaches to the deployment it already submitted instead of duplicating it, using a stable ID derived from the workflow id, step name, and submission spec.
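One way such a stable ID could be derived is by hashing a canonical serialization of the three inputs. The SDK's actual derivation is not documented here, so the function name, the use of SHA-256, and the 16-character truncation are all assumptions made for illustration.

```python
import hashlib
import json
from typing import Any, Mapping


def stable_deployment_id(workflow_id: str, step_name: str, spec: Mapping[str, Any]) -> str:
    """Hash the inputs in a canonical form so the same step always maps to the same ID."""
    canonical = json.dumps(
        {"workflow_id": workflow_id, "step": step_name, "spec": spec},
        sort_keys=True,            # key order must not change the ID
        separators=(",", ":"),     # no whitespace differences
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]
```

Because the ID depends only on the inputs, a retried step computes the same value and can look up the existing deployment before submitting a new one.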

`anycloud quota status` shows new requests immediately

A request you just made with anycloud quota request now appears right away with its state, instead of disappearing until the provider's own history catches up.

`anycloud serve` origins locked to Cloudflare

Server HTTP ports are now firewalled to Cloudflare's edge ranges instead of open to the whole internet; existing servers are tightened on their next firewall reconcile.

`anycloud serve` deployments on Azure are reachable again

anycloud serve runs on any supported provider. On Azure specifically, some HTTP requests to a server's https://<id>.anycloud.sh URL could be dropped before reaching it, so the deployment looked unreachable even when healthy — now fixed for both IPv4 and IPv6.

GCP skips regions that can't see the requested machine type

Unpinned GCP deployments drop candidate regions where the requested machine type isn't visible before provisioning, so they fail over to a working region faster.

Azure terminations clean up leftover network resources more reliably

If an Azure termination fails partway through, the deployment's network interface and public IP are now marked for cleanup up front, so the next cleanup pass reclaims them rather than leaving them behind in your subscription.

0.1.31

CLI installs now use public release assets and Homebrew

The anycloud CLI now ships as platform tarballs in the public anycloud-sh/releases repo instead of through npm. The installer downloads the matching tarball and verifies it against checksums.txt before installing.
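For readers who want to verify a download by hand, the check can be sketched as below. This assumes checksums.txt uses the common sha256sum layout (hex digest, whitespace, file name); the changelog does not state the hash algorithm or file format, and the function names are hypothetical.

```python
import hashlib
from typing import Dict


def parse_checksums(text: str) -> Dict[str, str]:
    """Parse lines of the assumed form '<hex digest>  <file name>'."""
    entries: Dict[str, str] = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            digest, name = parts
            entries[name.lstrip("*")] = digest.lower()
    return entries


def verify_asset(name: str, data: bytes, checksums_text: str) -> bool:
    """Return True only when the asset is listed and its SHA-256 matches."""
    expected = parse_checksums(checksums_text).get(name)
    if expected is None:
        return False
    return hashlib.sha256(data).hexdigest() == expected
```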

macOS and Linuxbrew users can install with brew install anycloud-sh/tap/anycloud and upgrade with brew upgrade anycloud-sh/tap/anycloud. anycloud update now checks GitHub Releases for newer versions and points Homebrew-managed installs back to Homebrew instead of overwriting them.

`anycloud quota request` GPU picker starts empty

The interactive anycloud quota request GPU picker no longer preselects curated GPU instance types. Curated options still appear first, but users now explicitly choose every GPU VM type they want to request.

`anycloud status` can watch a deployment until it finishes

anycloud status <id> --watch now polls and redraws the status view until the deployment reaches a terminal state. Use --interval <seconds> to adjust the poll cadence. For scripts, anycloud status <id> --watch --json emits one compact status object per line.
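A script consuming the one-object-per-line output might look like the following sketch. The helper names are hypothetical, and the changelog does not document the fields inside each status object, so none are assumed here.

```python
import json
from typing import Any, Dict, Iterable, Iterator, Optional


def iter_status_objects(lines: Iterable[str]) -> Iterator[Dict[str, Any]]:
    """Yield one parsed object per non-empty line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def final_status(lines: Iterable[str]) -> Optional[Dict[str, Any]]:
    """Return the last status object seen, or None if the stream was empty."""
    last: Optional[Dict[str, Any]] = None
    for obj in iter_status_objects(lines):
        last = obj
    return last
```

In a pipeline, `final_status(sys.stdin)` would read the stream produced by anycloud status <id> --watch --json until the watch ends.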

0.1.30

`anycloud list` gets a time window and a result limit; `-a/--all` is removed

anycloud list now shows your most recent deployments by default, newest first, instead of only the last 24 hours — so a bare anycloud list no longer comes up empty when your latest work is a few days old. Two new flags adjust the view, on both list and the status picker:

-p, --period <window> — only deployments started within a window, e.g. anycloud list -p 7d or -p 24h.
-n, --limit <count> — how many to show, newest first.

The -a, --all flag has been removed. To reach older deployments, widen the window with -p or raise the count with -n. Every list query is now bounded, so listing stays responsive on accounts with long histories.
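The window values shown above (7d, 24h) follow a simple number-plus-unit shape. A sketch of parsing them, assuming only the h and d units that appear in this entry (the CLI may accept others), with a hypothetical function name:

```python
import re
from datetime import timedelta

_UNITS = {"h": "hours", "d": "days"}  # only the units shown in the examples


def parse_period(window: str) -> timedelta:
    """Turn a window such as '7d' or '24h' into a timedelta."""
    match = re.fullmatch(r"(\d+)([hd])", window.strip())
    if match is None:
        raise ValueError(f"unsupported period: {window!r}")
    amount, unit = int(match.group(1)), match.group(2)
    return timedelta(**{_UNITS[unit]: amount})
```

Every accepted value maps to a finite duration, which is what keeps each list query bounded.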

The --status, --provider, and -f/--filter filters now search your full history rather than only the page being shown, so filtering for an older deployment no longer comes up empty when it falls outside the recent window.

`anycloud cost --period all` is removed

anycloud cost no longer accepts --period all. Cost is always computed over a recent window of deployments rather than your entire history, so "all-time" was misleading. Pass an explicit window instead — e.g. anycloud cost --period 90d (the default is still 30d).

Azure bucket sync no longer loses storage access after cleanup

On Azure, deployments in a subscription share a single identity that grants their VMs read/write access to bucket storage. Routine cleanup of orphaned resources could mistakenly remove that shared access, after which bucket sync failed with permission errors on every running and new deployment until the API was restarted. Cleanup now preserves the shared access grant, and each deployment restores it automatically if it is ever missing — so bucket sync recovers on its own.

Invalid cloud credentials are flagged instead of failing on every deployment

When a saved cloud credential starts failing authentication — for example after its secret is rotated or expires — anycloud now marks it invalid instead of retrying the same failing call on every deployment and cleanup pass. New work that would use it is skipped, and cleanup for affected deployments pauses (so it stops re-hitting the provider and flooding logs) until the credential is fixed. anycloud credentials list shows which credential is failing and since when, and with notifications enabled anycloud sends a one-time alert so a broken credential surfaces instead of failing silently. Re-entering a valid secret clears the flag; affected deployments can then be resubmitted.

0.1.29

`anycloud list` stays responsive on accounts with long histories

The list and cost commands now select the deployments to display first, then total their VM usage, cost, and latest status only for those — instead of aggregating across every deployment an account has ever created and trimming afterward. Accounts with thousands of past deployments no longer pay whole-history cost on each list. Results are unchanged except that deployments sharing an exact start time now sort in a stable order.

Cleanup fails fast on broken cloud credentials

When a cloud credential is rotated, expired, or missing required permissions, resource teardown now gives up immediately instead of retrying the same failing delete for several minutes, and Azure requests that would otherwise hang now time out. This clears stuck cleanup work sooner; a genuinely broken credential still needs to be corrected before its leftover resources can be removed.

API containers use vanilla Node memory defaults again

The API Docker image no longer starts through a shell entrypoint that auto-sizes V8 heap memory from the host or container memory limit. Local API containers now use Node.js's default memory behavior unless operators explicitly provide NODE_OPTIONS.

0.1.28

Large queued batches no longer overwhelm the API server

Before dispatching a queued deployment, the server checks cloud quota and availability across candidate regions. Each queued deployment used to run that check independently, so a deep queue of similar jobs could issue thousands of simultaneous cloud API calls every few seconds — enough to starve the server and slow every command. Identical checks are now shared: a batch of similar deployments performs one sweep together, and a failed sweep logs a single warning instead of flooding the logs. Region filtering behavior is unchanged.
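A minimal sketch of the sharing idea, not anycloud's actual server code: concurrent callers asking for the same check await one in-flight task instead of each starting their own. `SharedChecks`, `fake_sweep`, and the `("aws", "H100:1")` key are hypothetical names used only for illustration.

```python
import asyncio


class SharedChecks:
    """Coalesce identical in-flight checks so concurrent callers share one result."""

    def __init__(self, run_check):
        self._run_check = run_check  # hypothetical coroutine function: key -> result
        self._in_flight = {}         # key -> asyncio.Task currently running

    async def check(self, key):
        task = self._in_flight.get(key)
        if task is None:
            # The first caller for this key starts the sweep; later callers await it.
            task = asyncio.create_task(self._run_check(key))
            self._in_flight[key] = task
            # Forget the task once it finishes so the next batch re-checks.
            task.add_done_callback(lambda _t: self._in_flight.pop(key, None))
        return await task


async def demo():
    calls = []

    async def fake_sweep(key):
        calls.append(key)       # record each real sweep that runs
        await asyncio.sleep(0)  # stand-in for the cloud API round-trips
        return f"ok:{key}"

    shared = SharedChecks(fake_sweep)
    key = ("aws", "H100:1")
    results = await asyncio.gather(*(shared.check(key) for _ in range(100)))
    return len(calls), set(results)


sweeps, distinct = asyncio.run(demo())
```

In this sketch, 100 queued deployments with the same key trigger a single sweep and all receive the same result.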

0.1.27

Releases now ship as one forward-only public line

The release flow no longer has separate latest and stable channels. Each release publishes the matching API image before publishing public CLI assets, then publishes the Python SDK and agent skill from the same version-bump commit. This prevents an updated CLI from pointing at an API image that has not been published yet.

anycloud update now moves forward to the current public release and restarts a running local API server automatically. Version arguments, channel selectors, and --no-restart have been removed so CLI/API installs stay in lockstep. Local API databases also record the newest app version that touched them, and older API containers refuse to start against newer forward-migrated data.

0.1.26

0.1.25

The local API server restarts itself after crashes and reboots

anycloud api start now runs the server with Docker's unless-stopped restart policy: if the process crashes or the machine reboots, the API comes back on its own instead of staying down until someone notices. This matters when jobs run unsupervised, because a dead API means completed VMs keep billing. An explicit anycloud api stop still keeps it stopped.

Crashes also stay diagnosable now: the server's recent logs are saved to ~/.anycloud/api-last.log before the container is replaced or removed (they were previously destroyed with it), container logs rotate instead of growing without bound, and anycloud api status shows a restart count when the server has been crashing.

Deployment listings no longer include secrets

The deployment list API response included each deployment's decrypted SSH private key, environment variables, and GitHub token — none of which anycloud list ever displayed. Listings are now redacted. Flows that genuinely need the key, like anycloud exec, are unaffected: they fetch it through the single-deployment status endpoint.

The API server stays responsive with a large deployment history

A background check that runs every minute used to scan the server's entire deployment history; once that history reached hundreds of thousands of records (large, long-running fleets) it could stall the server for seconds at a time, slowing every command. The scan is now indexed — about 60× faster on a real customer database — so anycloud list and anycloud status stay snappy during big runs.

The API server uses your machine's memory instead of a fixed ~4GB cap

The server previously ran with Node.js's default heap ceiling (~4GB) no matter how much memory the machine had, which could crash it under heavy fleet load even on large machines. It now sizes its heap to 75% of available memory.

`anycloud status` separates billable time from wall-clock duration, and shows cost

The status header used to show a single Duration measured from submission to completion, including time spent queued or retrying before any VM existed. It now also shows Billable, the time VMs actually existed (summed across spot replacements), which is the time your cloud charges you for, and a Cost row with the deployment's spend and hourly rate. Cost is marked "(est.)" while it's estimated from catalog prices rather than reconciled against your cloud bill. All three update live for running deployments.
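The difference between the two figures can be sketched as follows. This is an illustration under assumptions, not the CLI's implementation: the `startedAt` field name and the sample timestamps are hypothetical, and all times are in milliseconds.

```python
def duration_ms(submitted_at, completed_at):
    """Wall-clock time from submission to completion, queueing included."""
    return completed_at - submitted_at


def billable_ms(vms, now_ms):
    """Sum the time each VM existed; a still-live VM counts up to now_ms."""
    total = 0
    for vm in vms:
        ended = vm["endedAt"] if vm["endedAt"] is not None else now_ms
        total += ended - vm["startedAt"]  # startedAt is an assumed field name
    return total


# Hypothetical deployment: queued for 10 minutes, first spot VM reclaimed,
# replacement VM runs to completion.
vms = [
    {"startedAt": 600_000, "endedAt": 1_800_000},
    {"startedAt": 2_000_000, "endedAt": 3_000_000},
]
wall_clock = duration_ms(0, 3_000_000)
billable = billable_ms(vms, now_ms=3_000_000)
```

Here the wall-clock duration is 50 minutes, while only the roughly 37 minutes during which a VM existed count as billable.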

0.1.24

Fixed: image validation no longer depends on local Docker

anycloud submit used to verify your image with a local Docker pull, which could wrongly reject a valid image on Apple Silicon Macs (a "no matching manifest for linux/arm64/v8" error) and depended on tools not present on a default macOS install. Image validation now happens server-side, so it behaves the same from the CLI, the Python SDK, and CI, and no longer needs Docker on your machine.

Clearer errors for bad images, before your job starts

Before provisioning, anycloud now checks your image and fails fast with an actionable message when it has no linux/amd64 build, doesn't exist, or isn't accessible, instead of failing partway through the deployment. anycloud VMs run x86_64, so the missing-platform message includes docker buildx build --platform linux/amd64 … --push guidance.

`anycloud update` restarts a running API server

If a local API server is running when you run anycloud update, it now offers to restart it on the new version (on the same port) so you pick up the update right away. At a terminal you're asked to confirm (default yes); in non-interactive contexts it restarts automatically. Previously you had to remember to run anycloud api stop and anycloud api start yourself, or the server kept serving the old version.

`anycloud submit` is now fully interactive

Running anycloud submit in a terminal without the cloud flags used to exit with a "Cloud config required" error. It now guides you through the whole job: it prompts for credentials (auto-selecting when you only have one), then for a compute type from a searchable list of your cloud's instances — GPU options first, then CPU instances labeled with vCPUs, memory, and price — and, when the instance supports it, whether to run on-demand or spot. Pass --credentials, --vm-type, --gpu-type, or --spot to skip the matching prompt.

Fixed: clearer error when requesting spot on a cloud that doesn't support it

Submitting a spot job to a provider without spot instances now returns a clear client error instead of a generic server error.

0.1.23

Fixed: slow failover when a region is out of capacity

When a region had no capacity for the requested instance type, a deployment could spend several minutes on each retry before trying another region. Deployments now fail over to the next region with availability promptly.

Fixed: deployments could get stuck while downloading the image

If the SSH connection dropped while the container image was being pulled, a deployment could wedge in the downloading state. Image pulls now run detached on the VM, so a dropped connection no longer stalls progress.

0.1.22

Fixed: CLI failed to start on fresh installs

A packaging bug in 0.1.21 caused the anycloud CLI to exit immediately on startup with a Cannot find module error. The CLI now starts normally. If you're on 0.1.21, reinstall with:

curl -fsSL https://get.anycloud.sh | bash

0.1.21

`anycloud ssh` and `anycloud logs` removed

The anycloud ssh and anycloud logs commands have been removed. Use anycloud exec <id> "<command>" to run a command in your job's execution environment, and anycloud status <id> --verbose for lifecycle, errors, and captured output. On providers that run your image directly as the workload, anycloud exec now runs your command in that environment rather than wrapping it in a container exec.

Vast spot (interruptible) instances

anycloud submit --spot now works on Vast. Spot runs bid on Vast's interruptible marketplace at the live offer price, and anycloud detects when an instance is reclaimed and reprovisions a fresh one to keep the job going.

Vast hosts have no storage of their own, so spot on Vast requires a checkpoint bucket on a cloud that does. Point it at storage credentials with the new flags:

anycloud submit my-image --spot \
  --checkpoint-storage-credentials my-aws \
  --checkpoint-storage-region us-east-1

Checkpoints your job writes sync to that bucket, so a reprovisioned instance restarts with them already in place. The bucket is named after the deployment and is removed when the deployment ends — keep it with --persist-bucket.

Vast also now appears in anycloud gpus and anycloud pricing, including the cross-cloud anycloud gpus --type comparison.

Bring your own Docker image

anycloud build, Client.build(), and the Python anycloud.Image builder API have been removed. Build and push Docker images locally or in CI, then submit the resulting image reference with anycloud submit or Client.submit(). See Docker — Building and Pushing for local and GitHub Actions examples.

anycloud login now also attempts a best-effort local docker login ghcr.io when Docker is installed, using the same GitHub token, so GHCR pushes work without setting up a separate registry credential.

Explicit image baking and pruning

anycloud submit --bake and Client.submit(..., bake=True) now explicitly opt in to creating baked VM images. Normal submits can still reuse a matching baked image when one exists, but they no longer create new baked images by default. Use anycloud images prune to remove stale baked images from cloud accounts.

Stricter git source checks for `@anycloud.function()`

The Python function decorator now rejects dirty working trees and commits that are not pushed to origin before submission. Remote execution clones from GitHub at the submitted commit, so local-only changes cannot be reproduced on the VM.

Agent session labels use Claude Code titles

Agent-scoped CLI output now shows the readable Claude Code session title when one is available, instead of only showing the raw session identifier in list and spend-control views.

Interactive credential setup improvements

anycloud credentials new now better guides AWS and Azure setup from local CLI state, including browser login paths, profile selection, and clearer cleanup when deleting credentials.

`anycloud quota request` now raises Azure public-IP limits

Azure caps public IP addresses per subscription per region (100 by default), and every VM consumes one — so a fleet could stall on the IP cap even with plenty of vCPU quota. Two changes: anycloud quota request now files a public-IP increase alongside the vCPU request for each region, and hitting the public-IP cap is now treated as a quota error (the region backs off and the anycloud quota hint is shown) instead of being retried indefinitely.

Guided `anycloud quota request` picker

Running anycloud quota request with no arguments in a terminal now walks through credential → GPU quota target (choose specific types or all GPU families) → GPU instance type(s) when needed (searchable multi-select, with Ctrl-A to toggle all) → quota type(s) (on-demand, spot, or both) → region (or all regions), filing a request per selected type. Passing --spot or --region skips the matching prompt; explicit vmType/--gpu invocations and non-TTY/--json usage are unchanged.

0.1.20

Vast.ai provider support

Vast.ai is now available as a GPU marketplace provider. Add a Vast credential with anycloud credentials new, submit jobs against Vast GPU offers, and use the normal logs, exec, ssh, and bucket-sync flows through Vast's NAT'd SSH endpoints.

Config profiles removed in favor of inline flags

anycloud config and ANYCLOUD_CONFIG profiles have been removed. Pass submit settings directly with flags such as --credentials, --gpu-type, --vm-type, --region, and bucket options. Existing docs and tutorials now use inline submit flags.

`anycloud images` browses GHCR images

anycloud images now lists and filters images from GitHub Container Registry, with interactive selection in a TTY and --json / --only-refs output for scripts. The build flow also gained a credential picker so image builds can target the intended cloud account.

AWS, Azure, and prebake reliability

AWS provisioning now handles multi-VPC security group selection correctly and avoids default-VPC security group APIs. Azure capacity failures rotate more cleanly, and prebake image lookup now keys on image digest so reused tags resolve to the right baked image.

0.1.19

Agent-scoped list, cost, and spend controls

anycloud list and anycloud cost now scope to the detected agent session when appropriate, matching the per-session throttle and budget controls. Spend-cap block alerts are debounced so queued deployments do not spam repeated notifications while waiting for capacity or budget.

Quota and capacity retry routing

Quota failures are now classified by region versus account scope, so the optimizer blocks the right retry target instead of over- or under-rotating. AWS provisioning also handles shared IAM bootstrap retries and non-default networking paths more reliably.

Cross-cloud bucket and checkpoint credential fixes

Bucket sync, checkpoint restore, and cross-cloud storage credentials now use the credential scoped for the actual storage operation. This avoids leaking AWS S3 credentials across provider boundaries and makes cross-cloud checkpoint flows more predictable.

0.1.18

Auto-retry jobs whose container can't see the GPU

If a GPU job's container exits with CUDA_ERROR_NO_DEVICE in its logs — the host booted without visible GPUs — the deployment now retries on a fresh VM instead of being marked errored. The previous behavior required a manual anycloud resubmit for every occurrence. The retry uses the standard failed-deploy backoff and exhaustion limits.

Notifications digest skips the first-run backfill

The first time the digest monitor runs after enabling notifications on a host, it marks today as the starting point and posts nothing. Previously the next 15-minute tick would post a half-empty digest of whatever happened to be in the DB before notifications were configured. Your first real digest now lands the morning after your first full UTC day with notifications enabled.

Interactive prompts for `budget`, `throttle`, `notifications`

Running anycloud budget set, anycloud throttle set, or anycloud notifications enable slack with no arguments now prompts for the missing values (cap, window, webhook URL) when stdin is a terminal. Flag-only invocations behave the same as before, and piped or CI usage (no TTY, or CI=true) still exits with the previous error instead of hanging.

0.1.17

`status` now returns the full VM history (breaking)

/v1/status and anycloud status --json now return vms (every VM ever associated with the deployment, terminated rows included) instead of vmStatuses (active only). Each entry has a new endedAt field — null for live VMs, a millisecond timestamp for terminated ones. Live VMs still carry the container and rclone health-check fields; terminated rows are provisioning metadata only. This lets you read prebakedImageId, region, and cloudId for VMs after they've been cleaned up — previously those were only visible during the deployment's running window.

The human-formatted anycloud status output is unchanged by default (terminated VMs are hidden). Pass --verbose to see them, marked (terminated). The Python SDK's status_response.vm_statuses is now status_response.vms and includes terminated rows; SDK helpers like Job.exec() and Job.logs() filter to live VMs automatically.

If you have scripts that grep for vmStatuses in status --json output, rename to vms and add select(.endedAt == null) if you want the old "active VMs only" semantics.
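For scripts written in Python rather than jq, the same migration might look like the sketch below. The sample payload is made up for illustration and contains only the `vms`, `cloudId`, and `endedAt` fields named above; a real response carries more.

```python
import json


def active_vms(status_json):
    """Return only live VMs, matching the old vmStatuses semantics."""
    status = json.loads(status_json)
    # endedAt is null (None) for live VMs and a millisecond timestamp otherwise.
    return [vm for vm in status["vms"] if vm.get("endedAt") is None]


# Hypothetical payload standing in for `anycloud status <id> --json` output.
sample = json.dumps({
    "vms": [
        {"cloudId": "vm-a", "endedAt": None},
        {"cloudId": "vm-b", "endedAt": 1700000000000},
    ]
})
live = active_vms(sample)
```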

Spend controls: `anycloud throttle` and `anycloud budget`

Two new commands cap spend by blocking new dispatches when usage exceeds a limit. Running jobs are never killed — only queued work pauses.

anycloud throttle set 20 — burn-rate cap ($/hr across in-flight VMs, plus the candidate). Set --agent-session to cap each agent run independently instead of account-wide.
anycloud budget set 100 --per day — calendar-window cap (day, week, or month). Same --agent-session toggle.

Blocked deployments stay queued with a reason shown inline in anycloud list and anycloud status, and auto-dispatch once running VMs end (throttle), the calendar resets (budget), or you raise the cap. See Spend Controls for the full surface.
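The throttle check can be pictured roughly as below. This is a sketch under assumptions rather than the server's logic: the function name is hypothetical, and treating a total exactly at the cap as allowed is a guess based on "exceeds a limit".

```python
def throttle_allows(in_flight_rates, candidate_rate, cap_per_hour):
    """Allow dispatch unless in-flight burn rate plus the candidate exceeds the cap."""
    projected = sum(in_flight_rates) + candidate_rate  # $/hr if the candidate starts
    return projected <= cap_per_hour


# With a $20/hr cap and two VMs at $8/hr each already in flight:
fits = throttle_allows([8.0, 8.0], candidate_rate=4.0, cap_per_hour=20.0)
blocked = not throttle_allows([8.0, 8.0], candidate_rate=6.0, cap_per_hour=20.0)
```

In the second case the candidate would stay queued until a running VM ends or the cap is raised; the running VMs themselves are untouched.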

`anycloud db query` / `db schema` for read-only DB inspection

Two new commands let you (or an agent) inspect the local API database without leaving the CLI. anycloud db query "<sql>" runs a read-only SELECT / WITH / EXPLAIN / PRAGMA — writes are refused at the SQLite engine level — and emits rows as JSON with --json. anycloud db schema [table] introspects tables, columns, foreign keys, and indexes; pair the JSON form with db query to answer "which deployments are stuck and why" without poking the SQLite file directly.

`anycloud cost` switches to real billed cost

anycloud cost <id> now shows the real cost pulled from your cloud provider's billing API once the bill settles (~24 hours after a deployment ends). Until then — or if your account doesn't have billing-read permissions — it falls back to the catalog estimate as before, marked (est.).

To enable this, the cloud credentials anycloud uses now need a small additional permission:

AWS: add ce:GetCostAndUsage to the IAM policy. Activate the anycloud_deployment cost-allocation tag once (Billing → Cost Allocation Tags → activate, or aws ce update-cost-allocation-tags-status --cost-allocation-tags-status TagKey=anycloud_deployment,Status=Active).
Azure: grant the service principal the built-in Cost Management Reader role on the subscription.
GCP: enable Cloud Billing BigQuery export with resource-level detail, grant the service account roles/bigquery.dataViewer on the billing dataset, and pass the dataset to anycloud setup gcp.

Without these, nothing breaks — anycloud cost keeps showing the estimate.

`anycloud notifications enable slack` — daily Slack digest of usage

The API container can now post a once-a-day summary to a Slack channel you choose, aggregated from your local deployment history (~/.anycloud/api.db) and sent directly from your machine to the webhook.

anycloud notifications enable slack --webhook https://hooks.slack.com/... posts a test message before saving so a typo'd webhook is caught immediately. The digest covers total spend, count by terminal state, preemption rate, median runtime, and active users. anycloud notifications status / test slack / disable slack round out the surface. Setup details in docs/guides/notifications.md.

0.1.16

MCP secrets tools

Three new MCP tools — secrets_list, secrets_create, secrets_delete — let an agent manage named secrets (env-var bundles used via secrets=[...] on submit). Mirrors the existing CLI secrets new/list/delete. secrets_delete accepts force=True to override the safeguard that blocks delete while non-terminal deployments still reference the secret.

CLI catalog introspection

Four new commands surface the same cloud catalog the Python SDK already exposed, so you can pick a region or VM type before you submit:

anycloud regions <provider> [--vm-type <type>] [--spot] — list available regions. With --vm-type, narrows to regions that offer that VM type, sorted cheapest first.
anycloud vm-types <provider> <region> [--accelerator <gpu>] — list VM types in a region, optionally filtered to ones with a specific accelerator.
anycloud gpus <provider> [--type <gpu>] — list available GPUs, or available counts for a specific GPU when --type is given.
anycloud pricing <provider> <vm-type> [--region <r>] [--spot] — on-demand or spot price per region.

All four accept --json for scripting and accept the provider name case-insensitively (aws, AWS, Aws all work).

`anycloud bucket upload` / `download`

New CLI commands stream files in and out of an S3, GCS, or Azure Blob bucket using a credential you've already registered with anycloud credentials new:

anycloud bucket upload my-bucket ./data.bin data/in.bin --credentials prod-aws
anycloud bucket download my-bucket data/out.bin ./out.bin --credentials prod-aws --region us-east-1

The same routes back the Python SDK's Bucket.upload() / Bucket.download(), so the CLI and SDK now go through one code path. As part of that switch, the Python SDK no longer needs cloud filesystem libraries — the [aws], [gcp], and [azure] extras have been removed from anycloud-sdk because s3fs, gcsfs, and adlfs are no longer used. Bucket I/O now requires the credential to be registered with anycloud credentials new (it could previously work with inline CloudConfig credentials passed to Client(...)); construct your client with Client(credentials="<name>") instead.

`anycloud cost --json` for scripts and agents

anycloud cost now accepts --json. Without an id, it emits the aggregate report { periodMs, totalCost, totalDurationMs, totalJobs, deploymentsWithoutPrice, byProvider }; with an id, it emits the per-deployment fields (id, image, cloudProvider, region, vmType, spot, totalCost, durationMs, ratePerHour). Previously the only output was a formatted table, so anything calling anycloud cost from a script had to grep currency strings out of stdout.

`anycloud list` gains exact-match filters and machine-friendly output modes

anycloud list now supports --status <state>, --provider <cloud>, --csv, and --only-ids in addition to the existing substring -f, --filter and --json. Multiple predicates AND together. --csv emits a header row plus one row per deployment (id,state,cloudProvider,region,vmType,image,spot,startedAt,completedAt,totalCost,deploymentType), useful for piping into spreadsheets or awk. --only-ids emits one deployment ID per line and skips the table, useful for anycloud list --only-ids --status running | xargs anycloud terminate and similar batch operations.
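A short sketch of consuming the `--csv` output from Python. The header matches the column list above; the two data rows are invented placeholders (with the timestamp, cost, and type columns left empty), not real output.

```python
import csv
import io

COLUMNS = ("id,state,cloudProvider,region,vmType,image,spot,"
           "startedAt,completedAt,totalCost,deploymentType").split(",")

# Hypothetical rows standing in for `anycloud list --csv` output.
SAMPLE = "\n".join([
    ",".join(COLUMNS),
    "demo-1,running,aws,us-east-1,g6e.48xlarge,ghcr.io/example/app:1,false,,,,",
    "demo-2,running,azure,eastus,Standard_NC24ads_A100_v4,ghcr.io/example/app:1,true,,,,",
])


def ids_for_provider(csv_text, provider):
    """Return deployment IDs whose cloudProvider column equals provider."""
    reader = csv.DictReader(io.StringIO(csv_text))  # header row supplies the keys
    return [row["id"] for row in reader if row["cloudProvider"] == provider]


aws_ids = ids_for_provider(SAMPLE, "aws")
```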

`--json` output is now compact

Every --json flag in the CLI (list, status, cost, credentials list, config list, quota) now emits a single line of compact JSON instead of pretty-printed JSON with two-space indents. Equivalent payload, ~30% fewer bytes. jq . and any other JSON parser handle it identically.

0.1.15

Failed provisioning attempts no longer orphan cloud resources or pollute `anycloud list` / `cost`

When a VM provisioning attempt failed partway through (for example, Azure created the NIC and Public IP but the VM create itself errored), the rows were marked "ended" in anycloud's bookkeeping even though the cloud-side resources were still there — slowly eating the 100-Public-IP-per-region Azure quota until new submits started failing on quota errors. The same rows also showed up as live VMs in anycloud list and accrued duration in anycloud cost. Cleanup now waits for the cloud to confirm each resource is gone before marking it ended (the cleanup monitor keeps retrying anything the cloud rejected), and failed-attempt rows are filtered out of the list and cost aggregates.

`status` always renders the deployment header

anycloud status <id> no longer falls back to ❓ unknown with a "deployment not found in deployments list" hint when the id is older than the picker window or otherwise missing from the recent list. The metadata (image, cloud, region, VM type, age) now comes back with the status response itself, so the header is correct on the first call and the second round-trip is gone.

Security: deployment IDs are no longer probeable across users

Knowing another user's deployment id no longer lets you fetch their SSH key (status), terminate their job (terminate), or resubmit it (resubmit). All three now reject cross-user ids with the same "deployment not found" response they return for ids that genuinely don't exist, so an authenticated caller can't enumerate other users' deployments either. Existing calls against your own deployments are unchanged.

0.1.14

`anycloud cost` now works for jobs submitted without a pinned region

anycloud cost reported $0.00 for jobs submitted without an explicit region — the common case, since omitting a region enables multi-region failover. The hourly rate is now captured on each VM at provision time (when the region is finally known) and cost is summed across the deployment's VMs, instead of relying on a single rate stored at submit. Multi-region retries that land in different-priced regions are now reflected accurately in the total.
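The per-VM summation can be sketched as follows. Field names and prices are hypothetical; the point is only that each VM contributes its own rate multiplied by its own lifetime.

```python
MS_PER_HOUR = 3_600_000


def deployment_cost(vms):
    """Sum cost across VMs, each priced at the rate captured when it was provisioned."""
    return sum(vm["rate_per_hour"] * vm["duration_ms"] / MS_PER_HOUR for vm in vms)


# Hypothetical multi-region retry: 30 minutes at $2/hr, then 20 minutes at $3/hr.
vms = [
    {"region": "region-a", "rate_per_hour": 2.0, "duration_ms": 1_800_000},
    {"region": "region-b", "rate_per_hour": 3.0, "duration_ms": 1_200_000},
]
total = deployment_cost(vms)
```

A single rate stored at submit time could not represent this deployment, since the two VMs ran at different prices.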

0.1.13

Multi-region Azure submits stop retrying regions the subscription can't use

When Azure rejects a region with a "not permitted for subscription" or "not available for resource group" error (e.g. Jio India regions on most commercial subscriptions), the deployment now skips that region for the rest of its lifetime instead of cycling back to it every few minutes. Previously, multi-region jobs could exhaust their retry budget bouncing between forbidden regions and never reach the regions the subscription actually has access to.

0.1.12

`--persist-bucket` retains the spot checkpoint bucket without keeping the VM

anycloud submit --spot --persist-bucket skips the auto-deletion of the checkpoint bucket on job completion. The flag is independent of --persist: the common shape is an ephemeral VM (no --persist) plus a retained bucket, so a later anycloud resubmit <id> brings a fresh VM up against the same checkpoint contents without paying for an idle one in between. The flag is a no-op (with a warning) on non-spot deployments since they have no checkpoint bucket. Cleanup is manual: aws s3 rb s3://<id> --force or the equivalent for your provider. The existing orphan reaper leaves retained buckets alone because they aren't tagged as ended.

0.1.11

Legacy credential arrays

Earlier builds briefly accepted a list of named compute credentials. That shape has been superseded by pinned-or-unpinned compute credentials: pass one --credentials <name> to pin compute, or omit it so anycloud chooses from saved named credentials.

Legacy array rows are normalized to unpinned compute during migration.

0.1.10

Ordered fallback pools for `vmType` and `gpuType`

Both fields now accept an array as well as a single string. The provisioning scheduler expands every entry into the candidate pool — for gpuType, each alias is resolved into all interchangeable instance types — and falls over in order when the primary is quota-blocked or unavailable. CLI flags --vm-type / --gpu-type are now repeatable on submit, config new, and config edit; anycloud.yaml profiles accept a list under either field. The config new wizard now uses a multi-select picker (Space to mark, Enter to confirm — selection order is fallback order). The two fields are still mutually exclusive (only one of vmType / gpuType may be set, but each can itself be a list). The original list is preserved on the deployment row for display and retries.

`quota request` always asks for one more instance of headroom

anycloud quota request <vmType> now files a ticket every time you run it (unless an in-flight request is already pending for the same family + region). The previous "only when exhausted" trigger missed the common fresh-subscription case: limit=0, usage=0 looked fine, but the user couldn't launch a single VM. The new target is max(currentUsage + vCPUsPerVM, currentLimit × 2, vCPUsPerVM): at minimum, one more instance's worth of capacity, with a doubling floor on top so re-running keeps growing the ceiling. AWS applies the same rule with currentUsage treated as 0 (Service Quotas doesn't expose live usage).
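The target rule works out as in this small sketch. The function name and the sample numbers are illustrative, with 192 vCPUs per VM borrowed from the g6e.48xlarge example elsewhere in this changelog.

```python
def quota_target(current_usage, current_limit, vcpus_per_vm):
    """Requested vCPU limit: one more instance of headroom, or double the limit."""
    return max(current_usage + vcpus_per_vm, current_limit * 2, vcpus_per_vm)


# Fresh subscription: limit=0, usage=0 still asks for one instance's worth.
fresh = quota_target(0, 0, 192)
# One VM already running against a 192-vCPU limit: doubling wins.
growing = quota_target(192, 192, 192)
# AWS-style call, with usage treated as 0.
aws_style = quota_target(0, 100, 192)
```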

If every surveyed region already has a pending ticket for the family, the tail reads "No new tickets filed — every surveyed region already has a pending quota request for <vmType> on <cloud>." instead of "No actions taken."

`quota` picks credentials, not clouds

quota request and quota status now prompt for a credential rather than a cloud — credential implies cloud, so the second picker round-trip is gone. With a single AWS or Azure credential the picker is silent; with multiple, an autocomplete labelled <name> (<cloud>) appears. --credential <name> still bypasses the picker. Credentials for clouds without quota support (GCP, Lambda, Local) are filtered out and surface a clear error instead of running the picker only to fail later.

Breaking: `--cloud` removed from `quota` subcommands

The request and status subcommands of anycloud quota no longer accept --cloud <aws|azure>. Cloud is now derived from the resolved credential or, for request, the vmType prefix. Pass --credential <name> to pin to a specific account.

0.1.9

Picker fixups

Restored -a/--all on status (it regressed when the picker came back) so the picker can widen past the last 24h.
--json now refuses the picker rather than dumping its stdout chrome into the JSON pipe.

Multi-region `quota request` / `quota status`

--region is now optional on both subcommands. Omit it to fan out across every region in the cloud's catalog (AWS narrows to regions where the vmType is offered; Azure narrows to regions where the family is exhausted). Per-region calls run in parallel via Promise.allSettled, so wall-clock matches a single-region invocation. Re-runs stay idempotent thanks to the existing per-region dedup.
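The fan-out pattern, sketched in Python rather than the JavaScript `Promise.allSettled` the CLI uses: `asyncio.gather(..., return_exceptions=True)` plays the same role, letting one region fail without cancelling the rest. `request_one` and the region names are hypothetical.

```python
import asyncio


async def fan_out(regions, request_one):
    """Run one request per region concurrently and keep every outcome."""
    results = await asyncio.gather(
        *(request_one(region) for region in regions),
        return_exceptions=True,  # a failing region yields its exception as a value
    )
    return dict(zip(regions, results))


async def demo():
    async def request_one(region):
        await asyncio.sleep(0)  # stand-in for the provider API call
        if region == "region-b":
            raise RuntimeError("quota API unavailable")
        return "SUBMITTED"

    return await fan_out(["region-a", "region-b", "region-c"], request_one)


outcomes = asyncio.run(demo())
```

Because the calls overlap, total time tracks the slowest region rather than the sum of all regions.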

When the fan-out hits an AWS region that is disabled-by-default (e.g. me-central-1, ap-east-1, af-south-1), anycloud submits the AWS EnableRegion call for it. Opt-in is asynchronous — the quota request for that region is deferred and the entry comes back as SUBMITTED with optInAction: enabling. Pass --region if you'd rather not opt anything new in.

Interactive GPU instance picker

Run anycloud quota request with no vmType and no --gpu in a TTY to get an interactive cloud picker followed by an autocomplete list of GPU instance types from the catalog (e.g. g6e.48xlarge — 8× L40S (192 vCPUs)). Non-TTY / --json flows are unchanged: an explicit vmType or --gpu is still required.

0.1.8

Smarter blocks for disabled regions

When a provision fails because the region's quota is 0 or the subscription can't deploy there at all (AWS OptInRequired, Azure ProvisioningDisabled / LocationIsOfferRestricted), the optimizer blocks that region for 6 hours instead of the 30-minute generic quota block — so retries stop hammering regions that won't come back without a support ticket.
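As a rough sketch of the rule, assuming a simplified classifier (the function and its arguments are hypothetical, and the real optimizer likely inspects richer error data):

```python
from datetime import timedelta

# Error codes named above that indicate the region is unusable for this account.
DISABLED_REGION_CODES = {
    "OptInRequired",              # AWS
    "ProvisioningDisabled",       # Azure
    "LocationIsOfferRestricted",  # Azure
}


def quota_block_duration(error_code, region_quota=None):
    """How long to block a region after a quota-type provisioning failure."""
    if error_code in DISABLED_REGION_CODES or region_quota == 0:
        return timedelta(hours=6)   # won't recover without a support ticket
    return timedelta(minutes=30)    # generic quota block


disabled = quota_block_duration("OptInRequired")
zero_quota = quota_block_duration("QuotaExceeded", region_quota=0)
generic = quota_block_duration("QuotaExceeded", region_quota=8)
```

`QuotaExceeded` is a placeholder code for the example, not a specific provider error.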

Named Secrets

New primitive for env vars you don't want visible in API responses. anycloud secrets new|list|delete manages named bundles that are encrypted at rest, never returned by the API, and injected into containers at run time via --secret <name> (CLI) or secrets=[...] (Python SDK). Update a secret and resubmit to rotate — no need to edit every call site. See the Secrets guide.

Sensitivity warning on `--env`

When you pass a likely-sensitive key via --env (e.g. HF_TOKEN, API_KEY, anything matching token / secret / password / key), the CLI prints a one-line warning pointing at anycloud secrets new. The submit still proceeds; the warning is advisory.
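One way such a check could work is a case-insensitive substring match on the key name. This is an assumption for illustration; the CLI's exact matching rule isn't specified here.

```python
import re

# Assumed pattern: any of these words appearing anywhere in the env var name.
SENSITIVE_PATTERN = re.compile(r"token|secret|password|key", re.IGNORECASE)


def looks_sensitive(env_key):
    """Return True when an env var name suggests it holds a credential."""
    return SENSITIVE_PATTERN.search(env_key) is not None


flags = {name: looks_sensitive(name) for name in ("HF_TOKEN", "API_KEY", "BATCH_SIZE")}
```

A substring rule like this also flags harmless names that happen to contain "key", which fits the warning being advisory rather than blocking.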

Version warning

The CLI checks its own version against the current public release and prints a warning when you're behind. This keeps users from debugging issues that were already fixed upstream.

Quota dashboard links

anycloud quota status now surfaces structural console URLs (AWS Service Quotas, Azure portal) for each case, so you can jump straight to the provider's UI.

Smarter `quota request`

anycloud quota request detects partial grants and skips re-requesting quota that's already been submitted or granted, so re-runs are idempotent.

Broader Azure Region Coverage

The Azure catalog now pulls the full list of physical regions via the subscription-wide /locations endpoint instead of per-SKU locationInfo, which was silently gated on per-SKU enrollment. Measured on one subscription: 43 → 58 regions, ~33k → ~55k SKU rows.

`--json` on read-shaped commands

status, list, config list, and credentials list now accept --json. Chrome output routes to stderr; stdout is the pure JSON payload, so anycloud list --json | jq just works. credentials list --json emits only {name, cloudProvider} — secret fields never cross the boundary. Empty results return [] instead of the prior "No X found" strings.

Interactive deployment picker (TTY only)

Deployment ID is now optional on status, logs, exec, ssh, terminate, and resubmit: omit it in a TTY for an interactive picker (multi-select on terminate and resubmit). This walks back part of the "Non-Interactive CLI" change from 0.1.7 — non-TTY/CI still requires an explicit ID or ANYCLOUD_DEPLOYMENT_ID, so scripts and agents are unaffected.

Azure-safe deployment ID cap

The maximum deployment ID length drops from 40 to 28 characters to leave room for the -${index} suffix Azure appends to VM/NIC/public-IP names. Auto-generated IDs are also shorter so they fit under the new cap.
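The cap amounts to a length check applied before the suffix is added. The sketch below uses the 28-character limit and the -${index} suffix shape from the entry above; the function name and error message are hypothetical.

```python
MAX_DEPLOYMENT_ID_LEN = 28  # cap stated in this release


def suffixed_resource_name(deployment_id: str, index: int) -> str:
    """Validate the ID against the cap, then build the suffixed resource name."""
    if len(deployment_id) > MAX_DEPLOYMENT_ID_LEN:
        raise ValueError(
            f"deployment ID is {len(deployment_id)} characters; "
            f"the maximum is {MAX_DEPLOYMENT_ID_LEN}"
        )
    return f"{deployment_id}-{index}"
```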

0.1.7

Batch Submission in the Python SDK

Submit many jobs in parallel from a single call, with first-class partial-failure handling. The decorator path gets a map helper for fan-out workloads.
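The general shape of parallel submission with partial-failure handling looks like the sketch below, written with the standard library only. It is not the SDK's API: `submit_batch` and the `submit_one` callback are hypothetical stand-ins for whatever call actually submits a single job.

```python
from concurrent.futures import ThreadPoolExecutor


def submit_batch(submit_one, specs, max_workers=8):
    """Submit every spec in parallel and report successes and failures separately.

    Returns (results, failures), each a list of (index, value) pairs so callers
    can tell exactly which inputs succeeded and which raised.
    """
    results, failures = [], []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(submit_one, spec) for spec in specs]
        for index, future in enumerate(futures):
            try:
                results.append((index, future.result()))
            except Exception as exc:  # one bad job must not sink the batch
                failures.append((index, exc))
    return results, failures
```

Keeping the input index next to each outcome lets a caller retry only the failed entries instead of resubmitting the whole batch.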

More Resilient Prebakes

Cached images survive upstream base-image rotation, and bake outcomes now show up in deployment status instead of disappearing into logs. CLI status also surfaces when a deploy hits a prebake cache.

Quota Requests From the CLI

Request and check cloud quota directly from the CLI, with AWS now supported alongside Azure. Deployments that exhaust retries on a quota error point you at the new commands.

Credentials Through the API

The CLI and MCP no longer touch plaintext credentials on disk. New MCP tools cover importing existing cloud credentials and generating least-privilege ones.

Non-Interactive CLI

Every CLI command is now flag- and env-driven — no prompts, no pickers, no wizards. Better for scripts, CI, and agents.

Steadier Region Rotation

Region or SKU mismatches from a cloud no longer fail a deployment outright; the optimizer rotates instead.

0.1.6

More Resilient Image Pulls

Faster retries on transient pull failures and gentler post-provisioning backoff.

Expanded MCP Server

Added list_gpus, list_gpu_counts, credentials_get, bucket_upload, and bucket_download tools.

Smarter Region Rotation

When every target is temporarily blocked, the optimizer rotates across regions instead of stalling on the cheapest one. Block TTLs are also preserved at their maximum, and InvalidTarget blocks are capped at 6 hours.
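The sketch below shows one possible reading of these rules, and several parts of it are assumptions. It reads "preserved at their maximum" as "a new block never shortens an existing one", and it rotates round-robin by attempt number when everything is blocked; the optimizer's real policy may differ. Only the 6-hour InvalidTarget cap is taken directly from the entry above.

```python
from datetime import datetime, timedelta

INVALID_TARGET_CAP = timedelta(hours=6)  # cap stated in this release


def add_block(blocks, target, reason, ttl, now):
    """Record a block, capping InvalidTarget TTLs and keeping the later expiry."""
    if reason == "InvalidTarget":
        ttl = min(ttl, INVALID_TARGET_CAP)
    expires = now + ttl
    current = blocks.get(target)
    blocks[target] = expires if current is None else max(current, expires)


def pick_target(targets_by_price, blocks, now, attempt):
    """Prefer the cheapest unblocked target; if all are blocked, rotate."""
    open_targets = [t for t in targets_by_price if blocks.get(t, now) <= now]
    if open_targets:
        return open_targets[0]
    # Everything is blocked: cycle through targets instead of retrying the cheapest.
    return targets_by_price[attempt % len(targets_by_price)]
```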

Strict CloudConfig Validation (Python SDK)

CloudConfig now rejects unknown keyword arguments instead of silently ignoring them.
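The practical effect is that a misspelled option fails immediately instead of being dropped. The sketch below reproduces that behavior with a plain dataclass; `StrictConfig` and its two fields are hypothetical stand-ins, not the SDK's CloudConfig or its real parameters.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StrictConfig:
    """Hypothetical stand-in: the generated __init__ accepts only declared fields."""
    region: Optional[str] = None
    gpu_type: Optional[str] = None


def build_config(**kwargs) -> StrictConfig:
    """Build a config, turning an unknown keyword into a readable error."""
    try:
        return StrictConfig(**kwargs)
    except TypeError as exc:
        raise ValueError(f"invalid config option: {exc}") from exc
```

With silent ignoring, a typo such as `regoin=` would leave the job running with default settings; strict validation surfaces the mistake before anything is submitted.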

0.1.5

Per-Bucket Storage Credentials

Input and output buckets can now use different cloud credentials and regions. The shared storageCredentials/storageRegion fields have been replaced with per-bucket equivalents (inputStorageCredentials, outputStorageCredentials, etc.).


