Sixty people. $6.3 billion. No model.
First payment is Wednesday. Nobody knows what runs on top. Or what the path to revenue looks like.
Given OpenAI’s leaked financials, there is a roughly 50% chance that does not matter.
Because who cares about a path to profitability in this space, right?
I'm Ben Baldieri. Every week I break down the moves shaping GPU compute, AI infrastructure, and the data centres that power it all.
Here’s what’s inside this week:
Let's get into it.
Today's issue is brought to you by Deliverance AI
The GPU runs on sponsorships. If you value independent, no-fluff analysis of the AI infrastructure market, the sponsors are the ones who make it possible. I'd appreciate you checking them out.
Agentic AI dominates demos and pilots. In regulated industries like financial services, defence, telecoms, and government, almost none of it runs in production with real data, real decisions, and a full audit trail. The evening examines that gap, and how sovereign, governed agentic AI gets from pilot to production at scale, turning trust into measurable P&L impact.
Tuesday 30 June at Sea Containers London. A closed-room peer session for senior operators in regulated industries. Confirmed in the room:
BlackRock and Blackstone (financial services)
Datadog, AT&T, and Hitachi Energy (enterprise and technology)
The Cabinet Office and the MOD (government)
WWT, HPE, and NVIDIA (strategic partners)
Microsoft and AWS (hyperscalers)
Questions on the table:
Beyond the pilot: what changes when an agent has to outlive six months in production.
Sovereignty in practice: keeping models and sensitive data inside the enterprise boundary.
From cost centre to P&L line: what governed agents actually return, and how fast.
Governance by design, or after the fact: how you grade agents that grade themselves.
Keynote, practitioner panel, closing remarks from Mick McNeil, co-founder and CEO of Deliverance AI. Discussion and networking over food and drinks before and after.
Limited spots reserved for The GPU community.
Reflection Bets $6.3bn on a Model That Doesn't Exist
The most speculative tenant in SpaceX's new compute book is a sixty-person lab with nothing shipped.
Reflection AI agreed to pay SpaceX $150 million a month from 1 July 2026 through 2029 for GB300 access at Colossus 2, the Memphis supercomputer SpaceX's AI division runs as a commercial platform. Full term, around $6.3 billion, with a 90-day exit after the first three months.
Why this matters:
Reflection is a sixty-person New York lab founded in 2024 by Misha Laskin (ex-DeepMind Gemini) and Ioannis Antonoglou (co-creator of AlphaGo), last valued at $25 billion in a round NVIDIA is backing.
What it sells today is Asimov, a code-comprehension agent behind a waitlist; the valuation prices an unreleased frontier open-weight model pitched as America's answer to DeepSeek.
SpaceX has turned Colossus into a multi-tenant platform: Anthropic at ~$1.25bn/month, Google at ~$920m, now Reflection at $150m. The marginal tenant is now a pre-product bet, which tells you how scarce GB300 supply is.
NVIDIA wins either way: it backs Reflection and sells the GB300s into Colossus, booking revenue whether the model ships or not. The chip company is now the largest LP in the model companies buying its chips.
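For anyone who wants to check the headline number, here is a rough back-of-envelope sketch in Python. It assumes "through 2029" means the last monthly payment falls in December 2029, and it treats the three tenants named above as the whole book, which is almost certainly a simplification. The variable and function names are illustrative only.

```python
from datetime import date

def months_inclusive(start: date, end: date) -> int:
    """Count monthly payments from the start month through the end month, inclusive."""
    return (end.year - start.year) * 12 + (end.month - start.month) + 1

# Figures quoted in this issue (USD millions per month).
MONTHLY_FEES_USD_M = {"Anthropic": 1250, "Google": 920, "Reflection": 150}

first_payment = date(2026, 7, 1)
last_payment = date(2029, 12, 1)  # ASSUMPTION: "through 2029" = December 2029

n_payments = months_inclusive(first_payment, last_payment)
full_term_usd_m = n_payments * MONTHLY_FEES_USD_M["Reflection"]

# Reflection's share of monthly fees among the three named tenants only.
named_total = sum(MONTHLY_FEES_USD_M.values())
reflection_share = MONTHLY_FEES_USD_M["Reflection"] / named_total

print(f"Payments: {n_payments}")
print(f"Full term: ${full_term_usd_m / 1000:.1f}bn")
print(f"Reflection share of named monthly fees: {reflection_share:.1%}")
```

Under those assumptions the full term works out to 42 payments, which is where the roughly $6.3 billion comes from. The sketch deliberately ignores the 90-day exit, since the mechanics of that clause are not public.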
Ornn Turns Compute Into a Tradeable Commodity
The company that built the first transacted price for GPU compute just built the place to trade it.
Ornn raised a $33 million second seed round co-led by a16z crypto and Galaxy Ventures. The raise landed with Ornn Compute, a marketplace aggregating GPU capacity from public clouds and neoclouds, with a secondary market for transfers and on-demand sublets. Operators sell into a basket of tenants under one offtake contract instead of underwriting each one individually. It pairs with the Ornn Compute Price Index, the transaction-based GPU benchmark on the Bloomberg Terminal since April. ICE has signed on to clear futures and options against the index, pending regulatory approval.
Why this matters:
GPU compute is being priced like a commodity, not a cloud contract.
Oil took a century to build a transparent price, a forward curve, and a hedging layer; Ornn is assembling all three in two years.
A basket offtake turns a fragmented book of bilateral deals into something a lender can model. Same structural shift that let IREN issue A-rated securitised GPU debt at 3.31% (Issue #109), pushed down to the operators who cannot yet access that market.
Groq Raises $650M to Run NVIDIA's Version of Its Own Chip
Six months after NVIDIA paid $20 billion to license Groq's chip and hire its founder, Groq raised $650 million to run that chip as a cloud.
Groq raised $650 million in growth capital led by Disruptive and Infinitum, the two board investors. In December 2025, NVIDIA signed a non-exclusive licence for Groq's LPU technology at around $20 billion and hired founder Jonathan Ross, president Sunny Madra, and most of the senior engineers. At GTC in March, NVIDIA shipped the LPX inference platform built on the licensed Groq tech. What is left is a cloud business: 13 data centres, five million developers, scaling toward 200MW by end-2027 on NVIDIA's LPX, under CEO Adam Winter. Last valued at $6.9 billion; no new valuation disclosed.
Why this matters:
NVIDIA absorbed a competitor without buying it. The licence took the IP, the founders, and the architecture; the cash payout left investors free to fund a neocloud now running that architecture as a tenant (Issue #111, Issue #105).
The bet is that inference demand is large enough to make an operator out of a chip company.
200MW by end-2027 is a fraction of the gigawatt builds going up around it, and Groq is scaling on hardware it licensed away.
Baseten Raises $1.5B Running Other People's Models
The week the chipmakers fought over their own silicon, the company that runs anyone's raised $1.5 billion.
Baseten raised $1.5 billion in a Series F at a $13 billion valuation, led by Altimeter, Conviction, and Spark, with Sands and Wellington as co-leads. Two tranches at $13bn and $11bn, the split-price structure standard for this year's mega-rounds. Fourth raise in 18 months. Revenue up 20x year on year, inference volume up 40x, more than a billion calls a day across 87 clusters on 18 different clouds. Baseten does not own GPUs and does not train models; customers include Cursor, Notion, Lovable, Harvey, HubSpot, OpenEvidence, Abridge, and Decagon.
Why this matters:
The inference economics under last week's biggest deal. Cursor, which SpaceX bought for $60 billion (Issue #111), is a Baseten customer. When the application layer consolidates, the serving layer underneath gets more valuable.
Sands and Wellington as co-leads are the signal. Crossover investors come in before a public listing, not after a Series A. A fourth round in 18 months at this price reads as pre-IPO positioning.
NVIDIA backed Baseten's Series E in January. The same company shipping LPX through Groq, selling Blackwell into AWS, and watching OpenAI design around it also holds a stake in the platform-agnostic layer routing workloads across all of it.
OpenAI Built Its Own Inference Chip in Nine Months
OpenAI announced custom inference silicon on Tuesday.
OpenAI and Broadcom announced Jalapeño, OpenAI's first custom accelerator, designed from a blank slate for LLM inference. Engineering samples are running workloads in the lab at production frequency and power, including GPT-5.3-Codex-Spark.
Why this matters:
Google has TPUs, Amazon has Trainium, Meta has MTIA, Microsoft has Maia.
OpenAI was the last frontier-scale buyer without its own chip. No longer.
The headline is the timeline: nine months from design to tape-out, which OpenAI says is the fastest cycle ever for an advanced-node, high-performance ASIC, achieved partly by using its own models to accelerate the design.
Alibaba Crosses 105 AZs With a New Paris Region
Alibaba quietly opened its third European cloud region this week.
Alibaba Cloud launched a France region in Paris with two availability zones, built to European data-sovereignty standards. With a new fifth Japanese data centre, the global footprint is now 105 availability zones across 32 regions, up from 92 across 29 at end-2025. It sits inside the standing $52.7 billion global cloud programme CEO Eddie Wu set out in early 2025.
Why this matters:
This is a hedge against export politics. The blacklist question keeps resurfacing; Alibaba's answer is to build sovereign-compliant regions inside markets that might otherwise close. Same sovereign-compute logic as Evroc and the UK funds, pointed in the opposite direction.
105 AZs is a different kind of scale from a single gigawatt campus. US AI concentrates into a handful of enormous sites with captive power. Alibaba distributes across 32 regions, the cloud-era playbook rather than the AI-factory one. For inference and sovereign workloads, the more defensible shape.
The capital figure is real but old. $52.7 billion is a standing programme from early 2025, not fresh money, and modest against US hyperscaler 2026 capex. The story is where the zones are going, not the cheque.
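The footprint numbers are easy to sanity-check. A minimal sketch, using only the zone and region counts quoted above; the names are illustrative, and availability zones per region is a crude density measure rather than anything Alibaba reports.

```python
# (availability zones, regions) as quoted in this issue.
end_2025 = (92, 29)
now = (105, 32)

added_azs = now[0] - end_2025[0]
added_regions = now[1] - end_2025[1]
az_growth = added_azs / end_2025[0]

# Crude density measure: average availability zones per region.
azs_per_region_before = end_2025[0] / end_2025[1]
azs_per_region_now = now[0] / now[1]

print(f"Added: {added_azs} AZs across {added_regions} regions ({az_growth:.1%} AZ growth)")
print(f"AZs per region: {azs_per_region_before:.2f} -> {azs_per_region_now:.2f}")
```

The average stays at a little over three zones per region, which fits the reading that this is breadth across markets rather than concentration in a few sites.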
Microsoft Buys 2.67GW of Gas to Skip the Grid
Microsoft signed a 20-year deal for its own gas plant, because the grid cannot move fast enough for AI.
Chevron subsidiary Energy Forge One signed a 20-year PPA with Microsoft to build a 2.67GW co-located gas plant in West Texas, first power expected 2028. The project, Kilby, uses GE Vernova turbines and Caterpillar's Solar Turbines, built in phased modular stages. It pairs with the 2GW campus planned for Pecos. The structure delivers dispatchable power that does not wait in an interconnection queue.
Why this matters:
The grid-queue problem solved with a chequebook. FERC ordered grid operators to defend or revise large-load interconnection rules within 30 days (Issue #111), but a 20-year PPA for a co-located plant skips the queue entirely.
Gas won the dispatchable-power argument for AI. DOE committed $17.5bn to AP1000 reactors for 11GW of baseload; Microsoft signed for 2.67GW of gas with first power years before any new nuclear arrives.
Chevron is the name to watch. An oil major standing up a subsidiary to sell dispatchable power to a hyperscaler is the energy industry repricing itself around compute demand.








