Five months ago, Jensen Huang and Sam Altman announced a $100 billion partnership.

Ten gigawatts. Millions of GPUs. The biggest AI infrastructure deployment in history. This week, Huang told reporters in Taipei it was "never a commitment."

Who could possibly have foreseen that non-binding LOIs/MOUs counted for so little?

I'm Ben Baldieri, and every week I break down the moves shaping GPU compute, AI infrastructure, and the data centres that power it all.

Here's what's inside this week:

  • Was NVIDIA and OpenAI's $100B deal always PR?
  • SpaceX absorbs xAI in the largest corporate merger ever
  • CoreWeave and Lambda target production workloads
  • Anthropic ships Opus 4.6
  • Positron and Cerebras raise an aggregate $1.23B
  • AWS, Google, and Microsoft commit $600B+ in 2026 capex
  • Edinburgh unanimously rejects a 210MW data centre

Let's get into it.

The GPU Audio Companion Issue #87

Want the GPU breakdown without the reading? The Audio Companion does it for you, but only if you’re subscribed. If you can’t see it below, click here to fix that.

Was NVIDIA and OpenAI's $100B Deal Always PR?

The non-binding LOI is "on ice." Turns out "never a commitment" and "$100 billion partnership" don't mean the same thing.

Last week, the WSJ reported that NVIDIA's September 2025 LOI with OpenAI has stalled. Why? Internal doubts about OpenAI's business discipline and broader competitive concerns. And this week? OpenAI, according to Reuters reporting, is “unsatisfied” with NVIDIA’s chips and is “looking for alternatives.” While Huang and Altman sought to quell any anxieties this week, per CNBC reporting, the parallels with equally public celebrity breakups are hard to ignore. The difference is that those relationships aren’t typically responsible for the overwhelming majority of GDP growth in a 36-month period.

Why this matters:

  • When the biggest player in AI chips publicly downgrades its biggest infrastructure commitment, the signal travels through every backstop, lease, and bond sale downstream. The Bloomberg deep-dive we referenced in October now reads like a warning label.

  • Both sides are diversifying hard. Cerebras, AMD, and Broadcom give OpenAI optionality they didn't have a year ago. Anthropic, hyperscalers, and multiple neocloud equity positions likely mean NVIDIA doesn't need any single customer to define its future.

  • The questions, therefore, are how much of the circular financing web will unravel, how quickly, and what that means not just for the AI ecosystem but for the economy at large.

SpaceX Absorbs xAI in Largest Corporate Merger Ever

$1.25 trillion, orbital ambitions, and a company that needed a bigger balance sheet.

SpaceX absorbed xAI via an all-stock deal on February 2nd. Pre-merger, xAI was burning ~$1B/month. Now, SpaceX unlocks a more favourable capital structure and a path to a combined IPO. The deal consolidates xAI's Memphis-area infrastructure under SpaceX, including a new $20B campus in Southaven, Mississippi, branded "MACROHARDRR" - its third facility in the region. SpaceX has separately filed with the FCC for up to one million solar-powered satellites as an "orbital data-centre system."

Why this matters:

  • The orbital compute thesis has a real logic problem to solve: terrestrial power and cooling constraints are gating every major AI infrastructure buildout on the ground. We covered the power-constraint angle extensively in Issue #52 and Issue #78. This is Musk's proposed solution.

  • Space offers near-unlimited solar power and passive thermal dissipation, though overall viability is a divisive topic - AWS CEO Matt Garman has been publicly sceptical about the economics, whereas Google's Project Suncatcher is testing radiation-hardened AI chips for a 2027 prototype.

  • SpaceX has an effective monopoly on commercial rocket launches. Starlink already operates ~7,000 satellites. No other entity on Earth can put hardware in orbit at this cost or cadence. If orbital compute works, SpaceX is the only company that can build it at scale, and xAI will be the only beneficiary. If it doesn't, they still own the AI lab, the data centres, and the connectivity layer.

CoreWeave and Lambda Target Production Workloads

The leading neoclouds are done being mere compute marketplaces.

CoreWeave launched ARENA, a production-scale testing environment where customers validate real workloads on purpose-built infrastructure, including early GB300 NVL72 access, before committing. Early results show 2x performance on GB300 versus prior gen, ~30% lower TCO, and 10x training time improvement versus a competing cloud on the same GPU.

Also this week, Lambda and Oumi partnered for end-to-end custom model development and deployment, automating the data synthesis, evaluation, and fine-tuning pipeline on Lambda's GPU infrastructure. The partnership also targets production deployments, though more from an enterprise standpoint.

Why this matters:

  • The neocloud value proposition is shifting from "we have GPUs" to "we run your AI in production." Raw compute is commoditising. Managed inference, workload optimisation, and pre-production testing are where margins live. We've tracked this progression since the CoreWeave profile in Issue #2 and Lambda in Issue #12 (both of which I’m working on updating). Both are now competing for enterprise production contracts, not researcher GPU reservations.

  • ARENA creates switching costs before the contract starts. If a customer validates 30% lower TCO on CoreWeave infrastructure, that's a sales tool and a lock-in mechanism in one.

  • Lambda's play targets enterprises that need a fine-tuned 7B model for their specific use case, not GPT-5. Most don't have the ML engineering team to build it. But with Oumi providing the automation, and Lambda providing the rack, they do.

Anthropic Ships Opus 4.6: Another Smartest Model

A 1M token context window, a lead over GPT-5.2 on real work tasks, and pricing that hasn't moved.

Claude Opus 4.6 is live, and it’s powerful. It can hold and reason across 1M tokens, and it leads every frontier model on agentic coding (Terminal-Bench 2.0), hard information retrieval (BrowseComp), and multidisciplinary reasoning (Humanity's Last Exam). New capabilities include adaptive reasoning depth, context compaction for long-running agents, and developer controls for the intelligence-speed-cost tradeoff. All while pricing remains $5/$25 per million tokens.
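To make the architectural stakes of that 1M-token window concrete, here's a rough sketch: if an entire corpus fits in context, retrieval pipelines become an optional fallback rather than a requirement. Everything here (`count_tokens`, `retrieve_top_k`, the headroom factor) is a hypothetical stand-in, not any vendor's actual API.

```python
# Sketch only: route between direct long-context prompting and a RAG
# fallback based on whether the whole corpus fits in the context window.
MAX_CONTEXT_TOKENS = 1_000_000
HEADROOM = 0.8  # leave room for the question and the model's answer

def build_context(question, documents, count_tokens, retrieve_top_k):
    total = sum(count_tokens(d) for d in documents)
    if total <= MAX_CONTEXT_TOKENS * HEADROOM:
        # Long-context path: no chunking, no embeddings, no vector DB.
        return "\n\n".join(documents)
    # Corpus too large: fall back to retrieving the most relevant chunks.
    return "\n\n".join(retrieve_top_k(question, documents))
```

The simplification is the point: every component on the RAG branch (chunkers, embedding models, vector stores) is infrastructure an enterprise no longer has to run for corpora under the threshold.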

Why this matters:

  • 1M tokens of reliable context changes the build. Complex RAG pipelines exist because models lose the plot over long documents. If that failure mode disappears, enterprises can simplify their AI architecture and cut infrastructure costs. That's a compute story as much as a model story.

  • The number that matters most for enterprise buyers is GDPval-AA, which measures performance on actual knowledge work in finance, legal, and professional services. Opus 4.6 shows +144 Elo over GPT-5.2.

  • If Anthropic holds this lead, it could begin to reshape which model enterprise procurement teams default to. And it may happen just as OpenAI falls back on ads as a business model while sama blows up at Anthropic for pointing this out. Get the popcorn.
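For context on what that +144 figure means in practice: assuming GDPval-AA follows the standard Elo convention (an assumption, not something the benchmark docs confirm here), a rating gap converts directly into a head-to-head preference rate.

```python
# Standard Elo expected-score formula. We assume GDPval-AA's ratings
# follow this convention; the function name is our own.
def elo_win_probability(delta: float) -> float:
    """Expected rate at which the higher-rated model is preferred pairwise."""
    return 1.0 / (1.0 + 10 ** (-delta / 400))

# A +144 Elo edge implies Opus 4.6's output is preferred roughly 70%
# of the time in direct comparison.
```

By the same formula, a 0 Elo gap is a coin flip (50%) and a +400 gap is roughly a 91% preference rate, which is why a triple-digit lead on real work tasks is hard for procurement teams to ignore.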

Positron and Cerebras Raise an Aggregate $1.23B

Alt-compute continues having a moment.

Positron AI closed a $230M Series B at a $1B+ valuation, co-led by ARENA Private Wealth, Jump Trading, and Unless, with participation from QIA, Arm, and Helena. A decent-sized round, for sure, but what’s more interesting is that Jump Trading went from customer to co-lead investor, with Jump's CTO saying Atlas delivered "roughly 3x lower end-to-end latency than a comparable H100-based system." Positron's next-gen Asimov chip targets tape-out in late 2026 with 2TB+ memory per accelerator and 5x more tokens per watt than incumbents.

If this isn’t validation of GPU alternatives, I don’t know what is.

Why this matters:

  • $1.23B in a single week prices alt-compute as a real category. Positron targets memory-bound inference. Cerebras targets raw throughput. Both benefit from the same dynamic: enterprises want options beyond NVIDIA. We've been tracking this trend since the inference silicon coverage in Issue #69.

  • Trading firms don't forgive latency or cost overruns. Jump deploying Atlas, validating the results, then writing a co-lead cheque is the customer-to-investor pipeline every hardware startup wants and almost none achieve.

  • AMD investing in Cerebras while serving as OpenAI's second-source GPU partner suggests they see wafer-scale as complementary, not competitive. Or it's an intelligence investment. Possibly both.

AWS, Google, Microsoft Commit $600B+ in 2026 Capex

Amazon doubled its guidance. Google nearly doubled its spend. Microsoft kept writing cheques.

Amazon guided $200B of capex for 2026 - double last year. Google guided $175-185B, up from $91.4B in 2025. Microsoft spent $37.5B in Q2 alone, putting it on track for $150B+ in annualised spend. Combined, the three hyperscalers are heading for $600B+ in 2026 capex - a 36% YoY increase. Why? Unblocking constraints is expensive at the best of times, and doubly so when you’re locked in a three-way race among some of the most profitable businesses on the planet.

Why this matters:

Edinburgh Unanimously Rejects 210MW Data Centre

Same country. Same month. A tale of two data centres.

In Issue #86, we covered DataVita's North Lanarkshire designation as a UK AI Growth Zone, along with £8.2B investment, 500MW capacity, 1GW+ private-wire renewables, sub-10p/kWh power, and a £543M community fund. This week, Edinburgh went the opposite direction. The city’s development committee voted unanimously to reject a 210MW data centre proposed by Shelborn Drummond Ltd for the former RBS headquarters at South Gyle. The grounds? No total emissions analysis, no alignment with mixed-use planning, and sustainability measures dismissed as greenwashing.

Why this matters:

  • The difference is the approach. DataVita brought its own power, tackled emissions head-on with grid-positive design, and committed £543M in community benefits with independent local oversight. Shelborn Drummond relied on the grid and offered measures that the committee found inadequate.

  • The "green data centre" label has no standard definition. Councillors said so explicitly. Until regulators define what qualifies, planning committees will keep rejecting applications that don't stack up.

  • Edinburgh won't be the last rejection. The moratorium argument is gaining political traction. The DataVita model from Issue #86 is the template. Anything less is a planning risk.

The Rundown

Two things we took as given for years: OpenAI would win the model race, and NVIDIA would design the infrastructure beneath it.

Neither assumption looks as solid as it did six months ago. Anthropic just shipped a model that beats GPT-5.2 where it counts. The $100B deal that was supposed to cement the NVIDIA-OpenAI axis is on ice. Cerebras and Positron raised $1.23B from investors betting that NVIDIA's grip is loosening. And the neoclouds that built a business around renting NVIDIA GPUs by the hour are moving up the stack because raw compute alone is not enough. The $600B+ in hyperscaler capex reflects the same pressure: GPUs are nothing if the grid can't keep up. And Musk's answer to that problem involves a million satellites, which is either visionary or unhinged. Possibly both.

See you next week.
