anycloud
AI research has two bottlenecks: compute cost and iteration speed. anycloud solves both — pool your cloud credits across AWS, Azure, GCP, and Lambda, and run Docker workloads on spot GPUs from a single CLI or Python SDK.
All you need is Docker and cloud credentials. Pick a GPU type (e.g. H100, A100) and anycloud finds the cheapest available VM across clouds and regions.
import anycloud
from anycloud.types import CloudConfig

# Run train() in the given image on a spot H100 VM.
@anycloud.function(
    image="ghcr.io/acme/my-training:latest",
    gpu="h100:all",
    cloud_config=CloudConfig(credentials="my-aws", spot=True),
)
def train(learning_rate: float):
    ...

# Submit the job, then wait for it to finish.
job = train.submit(0.001)
job.wait()
See the full Python and CLI examples in Getting Started.
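To make the "cheapest available VM" idea concrete, here is a minimal sketch of that kind of selection. It is not anycloud's implementation: the `Offer` class, the `cheapest_offer` helper, and every price below are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    """One hypothetical price quote for a GPU VM (not an anycloud type)."""
    cloud: str
    region: str
    gpu: str
    spot: bool
    usd_per_hour: float
    available: bool


def cheapest_offer(offers, gpu, spot=True):
    """Return the lowest-priced available offer for `gpu`, or None if none match."""
    candidates = [
        o for o in offers
        if o.gpu == gpu and o.spot == spot and o.available
    ]
    return min(candidates, key=lambda o: o.usd_per_hour, default=None)


# Made-up prices, for illustration only.
offers = [
    Offer("aws", "us-east-1", "h100", True, 4.10, True),
    Offer("gcp", "us-central1", "h100", True, 3.20, False),  # cheapest, but no capacity
    Offer("azure", "westeurope", "h100", True, 3.75, True),
    Offer("aws", "us-west-2", "a100", True, 1.90, True),
]

best = cheapest_offer(offers, gpu="h100")
print(best.cloud, best.region, best.usd_per_hour)
```

The point of the sketch is that availability is filtered before price is compared, so an offer with no capacity never wins on price alone.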
Why anycloud
💰 Dramatically cheaper. Spot instances save up to 80%. The same GPU can be 2x cheaper in a different region — anycloud picks the best option automatically. No managed-service markup. Built-in bucket sync replaces EBS at 4-7x lower cost at TB scale. These savings stack.
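As a rough worked example of how these savings can stack, the sketch below multiplies the two best-case figures quoted above. The $10.00/hour on-demand price and the `stacked_hourly_price` helper are hypothetical, and it assumes the two discounts multiply cleanly, which real prices may not do.

```python
def stacked_hourly_price(on_demand_usd, spot_discount, region_factor):
    """Apply a spot discount, then a cheaper-region factor, to an hourly price."""
    return on_demand_usd * (1.0 - spot_discount) * region_factor


# Hypothetical $10.00/hour on-demand GPU.
# 0.80 = "up to 80%" spot saving; 0.5 = "2x cheaper in a different region".
price = stacked_hourly_price(10.00, spot_discount=0.80, region_factor=0.5)
print(f"${price:.2f}/hour")
```

Under those assumptions the best case is roughly a tenth of the on-demand price. Actual savings depend on the GPU, the region, and spot availability at the time.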
🔄 Iterate faster. The faster you run the experiment loop — run, analyze, repeat — the faster you make progress. That velocity compounds. anycloud lets you run hundreds of experiments a day across clouds.
🔒 You keep control. VMs run in your cloud account. Credentials stay on your machine, encrypted at rest, never sent anywhere. No vendor lock-in — switch clouds by changing one flag.
🤖 Agent-operable. Compact JSON output, filter flags on list (status, provider, agent, session), per-session spend caps, and read-only DB query. Claude Code, Codex, Cursor, and Aider get auto-tagged so you can filter and cap by session. See Agents.
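For a sense of what an agent might do with compact JSON output, here is a small sketch that filters jobs by session and checks spend against a cap. The JSON shape, the field names (`session`, `cost_usd`, and so on), and both helper functions are assumptions made for this example, not anycloud's documented output format.

```python
import json

# Hypothetical job listing; field names are assumptions, not anycloud's schema.
jobs_json = """
[
  {"id": "job-1", "status": "running",  "provider": "aws", "agent": "claude-code", "session": "s-42", "cost_usd": 12.50},
  {"id": "job-2", "status": "finished", "provider": "gcp", "agent": "claude-code", "session": "s-42", "cost_usd": 30.00},
  {"id": "job-3", "status": "running",  "provider": "aws", "agent": "cursor",      "session": "s-77", "cost_usd": 8.25}
]
"""


def session_spend(jobs, session):
    """Total cost in USD of all jobs tagged with the given session."""
    return sum(j["cost_usd"] for j in jobs if j["session"] == session)


def within_cap(jobs, session, cap_usd):
    """True if the session's total spend does not exceed the cap."""
    return session_spend(jobs, session) <= cap_usd


jobs = json.loads(jobs_json)
spend = session_spend(jobs, "s-42")
print(spend, within_cap(jobs, "s-42", cap_usd=40.0))
```

In practice the built-in filter flags and per-session caps would do this work for you. The sketch only shows why structured, session-tagged output is easy for an agent to reason about.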
How it works
- Install the CLI or Python SDK
- Connect your cloud accounts
- Submit a job — we provision bare VMs, pull your image, and start your container
- Your job runs — stream logs, SSH in, or monitor from your IDE
- When it finishes, the VM gets cleaned up automatically
For jobs that need data, anycloud syncs cloud storage buckets into your container as local directories — your code just reads and writes files. See Bucket Sync.
Checkpointing is just writing files to /mnt/checkpoint — anycloud syncs them to a bucket automatically. If your VM gets preempted, we spin up a new one, restore your files, and restart your container. Most ML frameworks already support this (PyTorch torch.save, TF tf.train.Checkpoint). See Spot Instances.
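A framework-free sketch of the resume pattern this implies: load the latest state if it exists, continue from there, and save after every step. The helper names, the `state.json` filename, and the `stop_after` parameter (which simulates a preemption) are all hypothetical. The write-to-temp-then-rename step is a general precaution against half-written files, not something anycloud is known to require. With PyTorch you would call `torch.save` and `torch.load` in place of the JSON calls.

```python
import json
import os
import tempfile
from pathlib import Path


def save_checkpoint(ckpt_dir, state):
    """Write state to a temp file, then rename it into place in one step."""
    ckpt_dir = Path(ckpt_dir)
    ckpt_dir.mkdir(parents=True, exist_ok=True)
    fd, tmp_path = tempfile.mkstemp(dir=ckpt_dir, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp_path, ckpt_dir / "state.json")


def load_checkpoint(ckpt_dir):
    """Return the saved state, or a fresh state if no checkpoint exists yet."""
    path = Path(ckpt_dir) / "state.json"
    if path.exists():
        return json.loads(path.read_text())
    return {"step": 0, "loss": None}


def train(ckpt_dir, total_steps, stop_after=None):
    """Run from the last saved step; `stop_after` simulates a preemption."""
    state = load_checkpoint(ckpt_dir)
    for step in range(state["step"], total_steps):
        if stop_after is not None and step >= stop_after:
            return state  # simulated preemption: exit without finishing
        state = {"step": step + 1, "loss": 1.0 / (step + 1)}  # stand-in for real work
        save_checkpoint(ckpt_dir, state)
    return state


# Local demo in a temp directory; inside a job you would pass "/mnt/checkpoint".
with tempfile.TemporaryDirectory() as ckpt_dir:
    first = train(ckpt_dir, total_steps=10, stop_after=4)
    resumed = train(ckpt_dir, total_steps=10)

print(first["step"], resumed["step"])
```

The second call picks up at step 4 instead of step 0, which is the behaviour you want when a restarted container finds restored files in its checkpoint directory.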
Your container gets:
- /mnt/input: your input data (read-only)
- /mnt/output: write results here, synced to your bucket continuously
- /mnt/checkpoint: state that survives spot preemption
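Because these are ordinary directories, job code needs no storage SDK. The sketch below reads every `.txt` file from an input directory and writes a summary to an output directory. The `process` function and the `line_counts.tsv` filename are hypothetical, and the demo uses temporary directories so it runs anywhere.

```python
import tempfile
from pathlib import Path


def process(input_dir, output_dir):
    """Count lines in each .txt file under input_dir and write one summary file."""
    input_dir, output_dir = Path(input_dir), Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    counts = {}
    for path in sorted(input_dir.glob("*.txt")):
        counts[path.name] = len(path.read_text().splitlines())
    summary = output_dir / "line_counts.tsv"
    summary.write_text("".join(f"{name}\t{n}\n" for name, n in counts.items()))
    return counts


# Local demo; inside a job you would call process("/mnt/input", "/mnt/output").
with tempfile.TemporaryDirectory() as root:
    inp, out = Path(root, "input"), Path(root, "output")
    inp.mkdir()
    (inp / "a.txt").write_text("x\ny\n")
    (inp / "b.txt").write_text("z\n")
    counts = process(inp, out)
    summary_text = (out / "line_counts.tsv").read_text()

print(counts)
```

The function only reads from the input directory and only writes to the output directory, which matches the read-only and write-here roles listed above.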