$2B buys you a lot in this market.

5GW of future capacity. Early Rubin access. Your software baked into every reference architecture the world's largest chip company ships.

Not a bad position to be in for a company that went public less than a year ago.

Debt burden aside, of course.

Here’s what’s inside this week:

  • NVIDIA puts $2B into CoreWeave
  • Anthropic: government contracts and $3B piracy lawsuits
  • The next tier of neoclouds raises serious money
  • The UAE releases a "sovereign" open model
  • FuriosaAI begins shipping inference chips at scale
  • Microsoft unveils Maia 200 for inference at scale
  • Scotland gets its first AI Growth Zone

Let’s get into it.

The GPU Audio Companion Issue #86

Want the GPU breakdown without the reading? The Audio Companion does it for you, but only if you’re subscribed. If you can’t see it below, click here to fix that.

packet·ai · GPU pricing built on utilisation, not fixed slices

The GPU #59 profiled hosted·ai: optimise utilisation, provision on-demand instead of fixed slices, undercut cloud pricing.

packet·ai is that thesis live. Powered by hosted·ai.

See if the numbers hold up → packet.ai

NVIDIA Puts $2B Into CoreWeave

NVIDIA is betting bigger on its favourite neocloud.

Team Green has invested $2B in CoreWeave. A big number, but the money is secondary to the strategic alignment: CoreWeave will build 5+ GW of AI factories by 2030 using NVIDIA's full stack, including early adoption of Rubin GPUs, Vera CPUs, and BlueField storage. In return, NVIDIA will help accelerate CoreWeave's land, power, and shell procurement. The kicker: NVIDIA will test and validate CoreWeave's software (SUNK and Mission Control) for potential inclusion in the reference architectures it offers to other cloud partners and enterprises.

Why this matters:

  • 5GW by 2030 is enormous. CoreWeave now appears to be positioned as NVIDIA's primary channel for AI factory buildouts outside hyperscalers.

  • If CoreWeave's orchestration stack becomes part of NVIDIA's reference architecture, every enterprise and CSP deploying NVIDIA infrastructure could end up running CoreWeave software. That's a distribution moat no other neocloud can match.

  • This deepens an already close relationship while other neoclouds compete for scraps. The gap between CoreWeave and the rest of the market just widened again.

Anthropic: Government Contracts & $3B Piracy Lawsuits

One week, two very different headlines.

Why this matters:

  • Regulated, liability-heavy environments like public services demand a different bar than consumer chatbots. Anthropic landing government work over competitors is a credibility marker, even as copyright litigation piles up.

  • The copyright cases aren't slowing deployment. A $1.5B settlement in Bartz, potentially $3B more from music publishers. For a company valued at $183B, these are costs of doing business, not existential threats.

  • "AI safety company" versus "built on piracy" is the framing war. The lawsuit explicitly calls out Anthropic's safety branding while alleging industrial-scale copyright infringement. That tension will define how regulators, partners, and the public evaluate frontier labs going forward.

The Next Tier of Neoclouds Is Raising Serious Money

This week’s news aside, it’s important to remember that not every GPU infrastructure play is CoreWeave or Nebius - the tier below is filling up fast.

PaleBlueDot AI, a Palo Alto-based compute platform founded in 2024, closed a $150M Series B led by B Capital, crossing the $1B valuation mark. Revenue grew 10x last year. The company operates across North America, Japan, Korea, and Southeast Asia, positioning itself for enterprise inference workloads.

Separately, Zettabyte announced a strategic investment from Headline Asia to fund its expansion in Japan, including its TITAN AI data centre project and partnerships with Japanese telecom carriers. Zettabyte's zWARE software helps GPU operators manage utilisation, visibility, and power-aware operations across large deployments - the operational layer that many infrastructure owners lack.

Why this matters:

  • The biggest players sit at the top of the market with hyperscaler contracts and multi-billion dollar valuations. Below them, a second tier is emerging: regional specialists, software-differentiated players, and infrastructure operators targeting specific verticals or geographies. PaleBlueDot and Zettabyte fit this mould.

  • Japan is the common thread. Both companies are prioritising Japanese expansion. The country is entering a new AI infrastructure build cycle driven by power constraints and enterprise demand, but lacks the neocloud density of the US or Europe. First movers have room to run.

  • Raw GPU capacity is now a commodity, so software is the differentiator. Zettabyte's pitch is operational software for GPU fleets. PaleBlueDot emphasises its "AI Cloud Agent" and multi-tenant architecture.

UAE Releases "Sovereign" Open Model

Abu Dhabi just dropped a fully transparent AI model, and it's competitive.

Why this matters:

  • The open model race is now three-way. Chinese labs (DeepSeek, Alibaba) overtook US rivals in open models last year. Meta has pulled back from publishing research. MBZUAI is positioning the UAE as the transparency alternative: full training recipes, no black boxes.

  • Countries and enterprises alike increasingly don't want to depend on US or Chinese models. K2 Think offers a third option: competitive performance with complete visibility into how it was built.

  • MGX investments in OpenAI and Stargate, infrastructure partnerships with Microsoft and BlackRock, and now a homegrown model competitive at the frontier. If the trend continues, the UAE will show the world just how powerful compounding at national scale can be.

FuriosaAI Begins Shipping Inference Chips at Scale

The Korean chip startup just hit a milestone most alt-compute companies never reach: mass production.

FuriosaAI has received its first 4,000 RNGD chips from manufacturing partners TSMC and ASUS, and is now shipping them to enterprise customers as both PCIe cards and turnkey servers. The pitch: 512 INT8 TOPS at just 180W TDP, compared to 600W+ for high-end GPUs. That means 3.5x greater compute density than H100 systems in standard air-cooled racks. The NXT RNGD Server packs 8 cards into a 4U chassis, drawing just 3kW total, letting customers stack five servers per rack for roughly 20 petaOPS without infrastructure upgrades. LG AI Research validated 2.25x better performance-per-watt than comparable GPUs running their EXAONE model. FuriosaAI also ran OpenAI's 120B parameter GPT-OSS on just two RNGD cards.
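
The rack-level claim checks out on the back of an envelope. A quick sanity check using only the quoted figures (peak INT8 throughput; sustained utilisation and networking overhead ignored):

```python
# Back-of-the-envelope check of FuriosaAI's quoted rack maths.
# Peak figures only - sustained utilisation, networking, and
# cooling overhead are ignored.

tops_per_card = 512       # quoted INT8 TOPS per RNGD card
cards_per_server = 8      # NXT RNGD Server, 4U chassis
server_power_kw = 3       # quoted draw per server
servers_per_rack = 5
rack_budget_kw = 15       # typical air-cooled rack limit

server_pops = cards_per_server * tops_per_card / 1000   # ~4.1 petaOPS
rack_pops = servers_per_rack * server_pops              # ~20.5 petaOPS
rack_kw = servers_per_rack * server_power_kw            # 15 kW

print(f"Per server: {server_pops:.1f} petaOPS")
print(f"Per rack:   {rack_pops:.1f} petaOPS at {rack_kw} kW")
assert rack_kw <= rack_budget_kw  # fits a standard air-cooled rack
```

Five servers land exactly on the 15kW budget that most air-cooled facilities can actually deliver, which is the whole point.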

Why this matters:

  • Most enterprise data centres are air-cooled and capped at around 15kW per rack, and liquid cooling retrofits are expensive and slow. That renders these facilities fundamentally unsuitable for the latest hardware generations.

  • FuriosaAI solves that problem by targeting the gap between "need inference at scale" and "can't rebuild the data centre".

  • But hardware is only part of the puzzle - if the software stack is immature (hi AMD), it doesn’t matter how good the specs are. FuriosaAI mitigates this risk by offering torch.compile support, a vLLM drop-in replacement, and OpenAI API compatibility (sketched below). Impressive on paper, but we’ll have to wait and see how the market responds.
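
That last point matters more than it sounds: OpenAI API compatibility means existing client code gets repointed rather than rewritten. A minimal sketch, assuming a vLLM-style OpenAI-compatible endpoint - the URL and served model name here are hypothetical:

```python
# Hypothetical: repoint the standard OpenAI client at a local
# OpenAI-compatible endpoint (the pattern vLLM-style servers expose).
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # hypothetical RNGD-backed server
    api_key="unused-for-local-endpoints",
)

resp = client.chat.completions.create(
    model="gpt-oss-120b",  # hypothetical name for the served model
    messages=[{"role": "user", "content": "Summarise this week's GPU news."}],
)
print(resp.choices[0].message.content)
```

If that works as advertised, the switching cost for inference workloads drops to a config change - exactly the wedge an alt-compute vendor needs.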

Microsoft Unveils Maia 200 for Inference at Scale

Maia 200 is Microsoft's second-generation AI accelerator, built on TSMC 3nm with 140B+ transistors: 216GB of HBM3e at 7 TB/s, 272MB of on-chip SRAM, 10+ petaFLOPS in FP4 and 5+ petaFLOPS in FP8, all within a 750W envelope. Microsoft claims 3x the FP4 performance of AWS Trainium 3, FP8 throughput above Google's TPU v7, and 30% better performance per dollar than the latest hardware in its current fleet. Maia 200 is already deployed in Des Moines, with Phoenix next. OpenAI's GPT-5.2 models will run on it, and Microsoft's Superintelligence team will use it for synthetic data generation and in-house model development.
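
Taking the quoted envelope at face value, the efficiency arithmetic is straightforward (peak numbers only; sustained throughput will be lower):

```python
# Peak efficiency implied by Microsoft's quoted Maia 200 figures.
fp4_pflops = 10    # quoted FP4 peak, petaFLOPS
fp8_pflops = 5     # quoted FP8 peak, petaFLOPS
power_w = 750      # quoted power envelope

print(f"FP4: {fp4_pflops * 1000 / power_w:.1f} TFLOPS/W")  # ~13.3
print(f"FP8: {fp8_pflops * 1000 / power_w:.1f} TFLOPS/W")  # ~6.7
```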

Why this matters:

  • The hyperscalers are serious about custom silicon. Google has TPUs. Amazon has Trainium. Microsoft now has Maia at a competitive scale.

  • All of these could loosen NVIDIA's grip on the inference market. Even after the Groq deal.

  • If these custom chips achieve the same success as Google’s TPU v7, the market will suddenly have more ways to avoid paying the NVIDIA toll.

Scotland Gets Its First AI Growth Zone in DataVita

The Scottish data centre developer's site has been designated a UK AI Growth Zone, unlocking £8.2B in private investment, one of the largest technology investments ever announced in Scotland. The project will deliver 500MW of AI-ready data centre capacity, 1GW+ of privately wired renewable energy (wind, solar, battery), and purpose-built Innovation Parks for robotics, labs, and advanced manufacturing. A £543M community fund over 15 years comes attached, overseen by an independent local board.

Why this matters:

  • The scale of the project is exciting, but the energy costs are what really matter: below 10p/kWh, carbon intensity 97% lower than the London grid average, near-zero water consumption, and a target PUE of 1.15 (rough numbers after this list).

  • 3,400 jobs and a community fund with local governance also look like the winning political model for AI infrastructure buildout.

  • Data centres typically face NIMBY resistance across Europe. Attaching visible local benefits (apprenticeships, venture capital, community spending) is how you address it, secure planning approval, and build public support.
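
For a sense of what those headline numbers mean: PUE (power usage effectiveness) is total facility power divided by IT power, so a 1.15 target implies only 15% overhead for cooling and distribution. A rough illustration from the announced figures - assuming, generously, the full 500MW built out and running 24/7:

```python
# Rough illustration of DataVita's announced figures.
# Assumes the full 500MW of IT load built and running 24/7 -
# an upper bound, not a forecast.

it_load_mw = 500         # announced AI-ready capacity
pue = 1.15               # target: total facility power / IT power
price_per_kwh = 0.10     # GBP, the "below 10p/kWh" figure

facility_mw = it_load_mw * pue           # 575 MW from the grid
overhead_mw = facility_mw - it_load_mw   # 75 MW for cooling etc.

annual_cost = facility_mw * 1000 * 24 * 365 * price_per_kwh

print(f"Facility draw: {facility_mw:.0f} MW ({overhead_mw:.0f} MW overhead)")
print(f"Energy bill at full load: ~£{annual_cost / 1e6:.0f}M/year")  # ~£504M
```

The sub-10p power price is doing the heavy lifting in that calculation.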

The Rundown

I keep waiting for the week when everything points in the same direction. This wasn't it.

A chipmaker deepened its bet on a single cloud provider, while a hyperscaler shipped silicon specifically to reduce its dependence on that chipmaker. A frontier lab landed government work while fighting billions in copyright claims. An open model from the Gulf matched performance benchmarks set by labs spending 100x more. An inference chip startup crossed from "promising specs" to "units delivered."

The more this market matures, the thicker the hedges seem to become.

See you next week.
