anycloud

AI research has two bottlenecks: compute cost and iteration speed. anycloud solves both — pool your cloud credits across AWS, Azure, GCP, and Lambda, and run Docker workloads on spot GPUs from a single CLI or Python SDK.

All you need is Docker and cloud credentials. Pick a GPU type (e.g. H100, A100) and anycloud finds the cheapest available VM across clouds and regions.

import anycloud
from anycloud.types import CloudConfig

# Run train() inside the given Docker image on a spot GPU VM.
@anycloud.function(
    image="ghcr.io/acme/my-training:latest",
    gpu="h100:all",
    cloud_config=CloudConfig(credentials="my-aws", spot=True),
)
def train(learning_rate: float):
    ...

job = train.submit(0.001)  # submit with learning_rate=0.001
job.wait()

See the full Python and CLI examples in Getting Started →
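To make the "cheapest available VM" idea concrete, here is a minimal sketch of that kind of selection. It is not anycloud's implementation: the Offer class, the cheapest_offer helper, and every price below are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Offer:
    """One hypothetical VM offer; not an anycloud type."""
    provider: str
    region: str
    gpu: str
    hourly_usd: float
    available: bool


def cheapest_offer(offers: Iterable[Offer], gpu: str) -> Optional[Offer]:
    """Return the lowest-priced available offer for a GPU type, or None."""
    candidates = [o for o in offers if o.gpu == gpu and o.available]
    return min(candidates, key=lambda o: o.hourly_usd, default=None)


# Made-up prices, for illustration only.
offers = [
    Offer("aws", "us-east-1", "h100", 4.10, True),
    Offer("gcp", "us-central1", "h100", 2.95, True),
    Offer("azure", "eastus", "h100", 2.40, False),  # cheapest, but no capacity
    Offer("aws", "us-west-2", "a100", 1.20, True),
]

best = cheapest_offer(offers, "h100")
print(best.provider, best.region)  # gcp us-central1
```

Availability is filtered before price is compared, so an offer that is cheaper on paper but has no capacity never wins.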

Why anycloud

💰 Dramatically cheaper. Spot instances save up to 80%. The same GPU can be 2x cheaper in a different region — anycloud picks the best option automatically. No managed-service markup. Built-in bucket sync replaces EBS at 4-7x lower cost at TB scale. These savings stack.

🔄 Iterate faster. The faster you run the experiment loop — run, analyze, repeat — the faster you make progress. That velocity compounds. anycloud lets you run hundreds of experiments a day across clouds.

🔒 You keep control. VMs run in your cloud account. Credentials stay on your machine, encrypted at rest, never sent anywhere. No vendor lock-in — switch clouds by changing one flag.

🤖 Agent-operable. Compact JSON output, filter flags on list (status, provider, agent, session), per-session spend caps, and read-only DB queries. Sessions from Claude Code, Codex, Cursor, and Aider are auto-tagged so you can filter and cap spend by session. See Agents.
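As a rough illustration of what "these savings stack" can mean for compute cost, the sketch below multiplies the two best-case figures quoted above (spot at up to 80% off, and a region at half the price). It assumes the discounts are independent and multiply, which is a simplification. The baseline cost and the stacked_cost helper are hypothetical, and real savings depend on the market at the time.

```python
def stacked_cost(baseline_usd: float, *discount_fractions: float) -> float:
    """Apply each fractional discount in turn (0.80 means 80% off)."""
    cost = baseline_usd
    for d in discount_fractions:
        if not 0.0 <= d <= 1.0:
            raise ValueError(f"discount must be between 0 and 1, got {d}")
        cost *= 1.0 - d
    return cost


baseline = 100.0        # hypothetical on-demand cost of one experiment, in USD
spot_discount = 0.80    # best case quoted above ("up to 80%")
region_discount = 0.50  # "2x cheaper in a different region"

cost = stacked_cost(baseline, spot_discount, region_discount)
print(f"${cost:.2f} instead of ${baseline:.2f}")  # $10.00 instead of $100.00
```

The bucket sync figure is left out on purpose, since it applies to storage rather than GPU hours.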

How it works

  1. Install the CLI or Python SDK
  2. Connect your cloud accounts
  3. Submit a job — we provision bare VMs, pull your image, and start your container
  4. Your job runs — stream logs, SSH in, or monitor from your IDE
  5. When it finishes, the VM gets cleaned up automatically

For jobs that need data, anycloud syncs cloud storage buckets into your container as local directories — your code just reads and writes files. See Bucket Sync.

Checkpointing is just writing files to /mnt/checkpoint; anycloud syncs them to a bucket automatically. If your VM gets preempted, we spin up a new one, restore your files, and restart your container, so your code should look for an existing checkpoint at startup and resume from it. Most ML frameworks can already save and restore state as files (PyTorch torch.save, TensorFlow tf.train.Checkpoint). See Spot Instances.
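A minimal sketch of a preemption-tolerant loop follows. It uses only the standard library so it runs anywhere; a real job would call its framework's save and load functions instead. The state.json file name, the train_with_resume function, and the fake loss value are assumptions made for this example. The demo uses a temporary directory as a local stand-in for /mnt/checkpoint.

```python
import json
import os
import tempfile
from pathlib import Path


def train_with_resume(total_steps: int, ckpt_dir: str = "/mnt/checkpoint") -> dict:
    """Run up to total_steps, resuming from the last saved step if one exists."""
    ckpt_path = Path(ckpt_dir)
    ckpt_path.mkdir(parents=True, exist_ok=True)
    ckpt_file = ckpt_path / "state.json"  # hypothetical file name

    state = {"step": 0, "loss": None}
    if ckpt_file.exists():
        # A restarted container finds the restored file and picks up here.
        state = json.loads(ckpt_file.read_text())
    print(f"starting at step {state['step']}")

    for step in range(state["step"], total_steps):
        # Stand-in for one real training step.
        state = {"step": step + 1, "loss": 1.0 / (step + 1)}
        # Write to a temp file, then rename, so an interruption during the
        # write cannot leave a truncated state.json on local disk.
        tmp = ckpt_file.with_suffix(".tmp")
        tmp.write_text(json.dumps(state))
        os.replace(tmp, ckpt_file)
    return state


# Local demo: a temporary directory stands in for /mnt/checkpoint.
with tempfile.TemporaryDirectory() as d:
    first = train_with_resume(3, d)    # pretend the VM is preempted here
    resumed = train_with_resume(5, d)  # the "restarted" run continues at step 3
    print(first["step"], resumed["step"])  # 3 5
```

Saving after every step keeps the example short. In practice you would checkpoint every N steps or every few minutes, trading lost work against write overhead.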

Your container gets:

  • /mnt/input — your input data (read-only)
  • /mnt/output — write results here, synced to your bucket continuously
  • /mnt/checkpoint — state that survives spot preemption
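Because these mounts are ordinary directories, job code can stay plain file I/O. The sketch below assumes nothing about anycloud beyond the paths listed above. The summarize function and the line_counts.tsv file name are made up for the example, and the demo substitutes temporary directories for /mnt/input and /mnt/output so it runs locally.

```python
import tempfile
from pathlib import Path


def summarize(input_dir: str = "/mnt/input", output_dir: str = "/mnt/output") -> dict:
    """Count lines in every .txt file under input_dir and write a TSV report."""
    src, dst = Path(input_dir), Path(output_dir)
    counts = {}
    for path in sorted(src.rglob("*.txt")):
        with path.open() as f:
            counts[str(path.relative_to(src))] = sum(1 for _ in f)

    # Only the output directory is written to; the input is treated as read-only.
    dst.mkdir(parents=True, exist_ok=True)
    report = dst / "line_counts.tsv"  # hypothetical report name
    report.write_text("".join(f"{name}\t{n}\n" for name, n in counts.items()))
    return counts


# Local demo with temporary directories standing in for the mounts.
with tempfile.TemporaryDirectory() as demo_in, tempfile.TemporaryDirectory() as demo_out:
    Path(demo_in, "notes.txt").write_text("first\nsecond\nthird\n")
    demo_counts = summarize(demo_in, demo_out)
    print(demo_counts)  # {'notes.txt': 3}
```

Taking the directories as parameters with the mount paths as defaults lets the same function run unchanged in the container and in local tests.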

Get Started →