How Ångström used Anycloud to beat Meta in AI crystal structure prediction

June 17, 2026 · 7 min read

Co-founder, anycloud

Co-founder & CTO, Ångstrom AI

Ångstrom AI (YC S24), with the University of Cambridge (the Csanyi group) and AstraZeneca, released the paper "DFT Accuracy on Crystal Structure Prediction with Machine Learning Interatomic Potentials". The paper presented CSP-MACE-Å, a machine learning model designed to replace DFT, the expensive quantum mechanical calculation at the heart of crystal structure prediction, with the same accuracy but a 10,000x speedup.

CSP-MACE-Å also significantly outperformed UMA-OMC on crystal-structure prediction benchmarks. UMA is Meta's general purpose model for atoms and molecules; UMA-OMC is the version adapted for organic molecular crystals.

Ångstrom built CSP-MACE-Å on anycloud, a CLI that runs GPU workloads across your own cloud accounts. The team ran more than 100,000 GPU hours, almost entirely on multi-cloud spot. Researchers also used agents through the same anycloud CLI to launch and monitor batches, retrieve results, and help drive the experiment loop.

Why crystal structure prediction matters

Crystal structure prediction (CSP) answers a deceptively simple question: given a molecule, what solid crystal structures can it form? It matters because one molecule can pack into different crystal structures (known as polymorphs) with different physical characteristics. This creates a major risk for pharmaceutical development, especially when late-appearing forms emerge during manufacturing or storage and alter product performance. In 1998, that nearly sank the HIV drug ritonavir. The drug had to be pulled and reformulated when an unexpected, more stable crystal form of the same molecule appeared 2 years after market release. This cost Abbott more than $250 million. Veritasium tells the story well in its video "The Crystal That Could Destroy All Medicine". Drugmakers therefore need to map all the possible crystal forms of a molecule before release, to reduce the risk that an unexpected shift to a more stable form renders the drug unusable once it has been distributed.

The workhorse of CSP is DFT (density functional theory). DFT is a quantum-mechanical calculation that serves as the gold standard for CSP in industry and academia. However, DFT is extremely expensive and slow. The calculations for one molecule can take days to weeks, which slows down the scientists using it, and caps how many structures they can explore.

Ångstrom’s machine learning model, CSP-MACE-Å, is 10,000 times faster than DFT. Calculations go from taking weeks with DFT to minutes with CSP-MACE-Å. Not only does this save scientists time, but it ultimately means that far more candidate crystal structures may be evaluated, providing greater confidence when derisking crystal forms.
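As a rough sanity check on the "weeks to minutes" claim, the arithmetic can be sketched as below. This assumes the 10,000x figure applies uniformly to wall-clock time, which is a simplification; the function name and the example durations are illustrative and not taken from the paper.

```python
# Back-of-the-envelope conversion, assuming a uniform 10,000x wall-clock speedup.
SPEEDUP = 10_000
MINUTES_PER_WEEK = 7 * 24 * 60  # 10,080 minutes in one week


def accelerated_minutes(dft_weeks: float, speedup: int = SPEEDUP) -> float:
    """Minutes the same workload would take after the assumed speedup."""
    return dft_weeks * MINUTES_PER_WEEK / speedup


# Illustrative DFT durations only; real runtimes vary by molecule.
for weeks in (1, 2, 4):
    print(f"{weeks} week(s) of DFT -> about {accelerated_minutes(weeks):.1f} min")
```

Under that assumption, each week of DFT time maps to roughly one minute.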

CSP-MACE-Å was also shown to outperform Meta's UMA-OMC model across Ångstrom’s and AstraZeneca's evaluation suites. Meta's UMA-OMC was the previous state-of-the-art machine learning interatomic potential for CSP, but its accuracy was inferior to gold-standard DFT. CSP-MACE-Å is the first model to demonstrate the accuracy of DFT for CSP, delivering a massive speed improvement without sacrificing accuracy.

The agent-driven experiment loop

The bottleneck in developing CSP-MACE-Å was how quickly Ångstrom could iterate on the loop that underlies the work of many AI research orgs: forming a hypothesis, deciding which computational experiments would test it, launching the GPU jobs, pulling results back, analyzing them, and deciding on the next hypothesis to test. All the while, the team also had to keep GPU costs down and manage hardware failures (and bugs!).

Ångstrom researchers used agents to help execute that loop. Researchers decided what computational experiments to run, which batches of jobs to launch, what outputs to compare, and what plots and metrics would answer the current question. Agents handled the work between those decisions: launching batches of anycloud jobs, monitoring status, downloading results, and generating plots and summaries for the next research decision.

This used the same local anycloud CLI and cloud configuration the team used by hand. The researchers stayed focused on the experiment plan and interpretation while agents handled the fan-out and bookkeeping between decisions. However, the same fan-out that made the loop fast also made it dangerous: the wrong batch of GPU jobs could become thousands of dollars of real spend before anyone noticed.

How anycloud kept the AI research experiment loop under control

“anycloud gives me the confidence to really let my agents loose without stressing that they will burn through all our compute. These days they continue to work throughout night, autonomously managing my research experiments, while I sleep.”

Laurence Midgley, Co-founder & CTO, Ångstrom AI

One feature Ångstrom particularly values is spend controls scoped to the agent session. This proved to be the primitive researchers needed to comfortably let their agents manage research experiments autonomously. The permission was not "give the agent cloud access" but "let this session spend up to X today / Y per hour." That matters because a runaway agent could keep launching GPU work, which could turn into thousands of dollars of wasted cloud spend.

anycloud let the team set two independent guardrails for each agent session: a throttle that limits concurrent GPU jobs by capping estimated live spend in $/hr, and a budget that caps total spend over a calendar window. When either cap is hit, new jobs wait in the queue; running jobs keep going.

Throttle

Limits concurrent jobs

Measures: estimated live $/hr
Blocks at: running + next VM >= cap
Clears: when running VMs finish

$ anycloud throttle set 300 --agent-session

Throttle  $/hr right now
  per agent-session  cap $300/hr each
    csp-rank            ██████████  96%   $289.25 / $300/hr     $10.75/hr headroom

Budget

Limits total spend

Measures: settled + estimated spend
Blocks at: period spend >= cap
Clears: at UTC reset

$ anycloud budget set 4000 --per day --agent-session

Budget  day  (resets in 19h)
  per agent-session  cap $4,000 each
    csp-rank            ██████████  96%    $3,848 / $4,000      $152 left
    free-energy-rerank  ████████░░  78%    $3,136 / $4,000      $864 left
    blind-test-eval     █░░░░░░░░░   9%      $368 / $4,000      $3,632 left
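The two blocking rules on the cards above can be sketched as a small admission check. This is a hypothetical model for illustration only, not anycloud's implementation: the class, field, and method names are invented here, and only the two comparisons ("running + next VM >= cap" and "period spend >= cap") come from the cards.

```python
from dataclasses import dataclass


@dataclass
class SessionGuardrails:
    """Hypothetical model of one agent session's caps (not anycloud's code)."""

    throttle_cap_per_hr: float         # e.g. 300.0, as in `throttle set 300`
    budget_cap: float                  # e.g. 4000.0, as in `budget set 4000 --per day`
    running_rate_per_hr: float = 0.0   # estimated live $/hr of running VMs
    period_spend: float = 0.0          # settled + estimated spend this period

    def can_start(self, next_vm_rate_per_hr: float) -> tuple[bool, str]:
        """Decide whether a new job may start; running jobs are never touched."""
        # Budget rule from the card: block when period spend >= cap.
        if self.period_spend >= self.budget_cap:
            return False, "waiting on budget (clears at UTC reset)"
        # Throttle rule from the card: block when running + next VM >= cap.
        projected = self.running_rate_per_hr + next_vm_rate_per_hr
        if projected >= self.throttle_cap_per_hr:
            return False, "waiting on throttle (clears when running VMs finish)"
        return True, "ok"


# The csp-rank session as shown in the sample output above.
csp_rank = SessionGuardrails(
    throttle_cap_per_hr=300.0,
    budget_cap=4000.0,
    running_rate_per_hr=289.25,
    period_spend=3848.0,
)
print(csp_rank.can_start(10.0))  # a $10/hr VM fits inside the $10.75/hr headroom
print(csp_rank.can_start(12.0))  # a $12/hr VM would cross the throttle, so it waits
```

Because both checks only gate new work, a blocked job in this model simply stays queued until the throttle frees up or the budget window resets.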

Slack makes spend and blocked work visible without watching a terminal. Once enabled with anycloud notifications enable slack --webhook ..., anycloud posts a digest for the prior UTC day with compute spend and VM usage, initial submissions and resubmissions, the outcomes of ended deployments, and how many of those deployments were preempted at least once during their run. If a budget or rate cap starts blocking new jobs, anycloud posts a waiting-on-spend-cap alert. A daily budget block clears at the next daily reset; a rate-cap block clears when live spend falls back under the configured ceiling. Caps block only new jobs; already-running jobs keep running.

anycloud APP 9:00 AM

anycloud usage on 2026-07-22 (UTC)

👤 User(s): angstrom
💰 Compute spend: $29.47 (est.)
🖥️ VM usage: 132.5h across 202 provisioned VM attempts
📦 Deployments
🚀 824 initial submissions
🔁 37 resubmissions
🏁 813 ended
✅ 31 completed · ❌ 0 failed · ⚠️ 2 errored · ⏹️ 780 terminated
⚡ 40 were preempted at least once during their run (5%)
⏱️ Median duration: 2.6h
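To show how the digest fields relate, here is a short sketch that re-derives two of them from the sample message above. The function and variable names are illustrative; the assumption that the percentage is the preempted count divided by ended deployments is inferred from the sample figures, not from anycloud documentation.

```python
def digest_checks(ended: int, outcomes: dict[str, int], preempted: int) -> tuple[bool, int]:
    """Return (outcomes add up to ended, preempted share as a whole percent)."""
    outcomes_match = sum(outcomes.values()) == ended
    preempted_pct = round(100 * preempted / ended)
    return outcomes_match, preempted_pct


# Figures copied from the sample digest.
sample_outcomes = {"completed": 31, "failed": 0, "errored": 2, "terminated": 780}
matches, pct = digest_checks(ended=813, outcomes=sample_outcomes, preempted=40)
print(f"outcomes sum to ended: {matches}; preempted share: {pct}%")
```

With the sample figures, the four outcome counts add up to the 813 ended deployments, and 40 preempted deployments round to the 5% shown in the digest.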

What this unlocked for Ångstrom

"Our monthly compute spend is often more than 2x higher than our cash burn - so compute cost is a serious problem for us. anycloud has been critical for letting us use our credits across all major providers efficiently. We run our experiments almost exclusively on spot, which has significantly extended our compute runway. The bottleneck for an AI research company is the rate at which we iterate on the run experiments -> analyse results -> plan next experiments loop - anycloud lets us orchestrate hundreds of experiments each day."

Laurence Midgley, Co-founder & CTO, Ångstrom AI

Two problems sit behind Laurence's quote: cost and iteration speed. Cloud credits are the cheapest GPUs a startup will ever touch, but they're stranded - spread across providers, each with its own quotas, regions, and spot pools. And the rate of research is capped by how fast you can run the next batch of experiments. anycloud schedules work across every connected account, selects the cheapest available capacity, and runs on spot without requiring the workload to care which cloud it lands on. At Ångstrom, agents drove the research loop by calling the anycloud CLI directly against the team's own clouds.

At Ångstrom, agents accelerated the experiment loop while anycloud provided the multi-cloud, spot, and spend-control layer underneath it.
