
On the first day of Christmas, Jensen gave to me…a competitor acquisition wrapped in a licensing deal to avoid regulatory scrutiny?

Talk about finishing the year strong!

I’m Ben Baldieri. Every week, I break down what’s moving in GPU compute, AI infrastructure, and the data centres that power it all.


Let’s get into it.

The GPU Audio Companion Issue #80

Want the GPU breakdown without the reading (or the ads)? The Audio Companion does it for you, but only if you’re subscribed. If you can’t see it below, click here to fix that.

Become An AI Expert In Just 5 Minutes

If you’re a decision maker at your company, you need to be on the bleeding edge of, well, everything. But before you go signing up for seminars, conferences, lunch ‘n learns, and all that jazz, just know there’s a far better (and simpler) way: Subscribing to The Deep View.

This daily newsletter condenses everything you need to know about the latest and greatest AI developments into a 5-minute read. Squeeze it into your morning coffee break and before you know it, you’ll be an expert too.

Subscribe right here. It’s totally free, wildly informative, and trusted by 600,000+ readers at Google, Meta, Microsoft, and beyond.

NVIDIA Acqui-Hires Groq LPUs and Execs

NVIDIA has struck a non-exclusive licensing deal for Groq’s inference technology.

Groq frames it as licensing. CNBC calls it a $20bn asset acquisition. The reality likely sits somewhere in the middle:

NVIDIA gets the IP. NVIDIA gets the founders and senior leadership. Groq, the corporate shell, stays alive.

Jonathan Ross, Sunny Madra, and Groq’s core engineering team will join NVIDIA to fold Groq’s ultra-low-latency inference work into the NVIDIA AI Factory architecture. GroqCloud keeps running, but the technical centre of gravity now sits inside NVIDIA. And because this is not a corporate acquisition, the deal sidesteps the antitrust headaches that sank the ARM bid in 2022. No takeover of a competing cloud provider. No control issues.

Just IP and talent.

The timing is not subtle.

Google was finally mounting a credible push to break NVIDIA’s CUDA lock with TorchTPU. NVIDIA has now acquired part of the team that designed the original TPU and, crucially, an inference-specific architecture that doesn’t cannibalise Blackwell.

Why this matters:

  • This is an acqui-hire disguised as licensing. NVIDIA gets the people and the tech without triggering a full regulatory review.

  • Groq’s inference engine becomes part of NVIDIA’s roadmap, plugging a gap in its lineup and closing off an attack vector as the market continues to shift from training to inference.

  • NVIDIA neutralises a challenger at the exact moment Google’s TPU effort was becoming a serious software threat. And it does so with a chip that complements, not competes with, its GPU franchise.

Musk Claims xAI Will Outcompute Everyone

Elon Musk says xAI will have more AI compute than “everyone else combined” within five years.

A bold claim for a slow news period, but not unprecedented. Colossus 2 in Tennessee is already tracking toward 400MW of capacity. xAI has 230,000 GPUs live, including a 100,000-H200 cluster assembled in 19 days, and is targeting 50 million H100-equivalent GPUs over the next half decade. The company is also raising up to $20bn for additional NVIDIA supply.
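For a sense of scale, here is a hedged back-of-envelope on what 50 million H100-equivalents would draw. The power and overhead figures are my assumptions, not xAI's: roughly 700W per H100-class GPU and a 1.3x facility multiplier for cooling, networking, and conversion losses.

```python
# Back-of-envelope sanity check on the 50-million-GPU target.
# Assumptions are mine, not xAI's: ~700W per H100-class GPU and a
# ~1.3x facility overhead factor for cooling, networking, and losses.
H100_TDP_W = 700          # approximate H100 SXM board power
OVERHEAD = 1.3            # hypothetical PUE-style facility multiplier
gpus = 50_000_000         # xAI's stated H100-equivalent target

it_load_gw = gpus * H100_TDP_W / 1e9   # GPU power draw alone, in GW
facility_gw = it_load_gw * OVERHEAD    # with facility overhead applied
print(f"IT load: {it_load_gw:.0f}GW, facility: {facility_gw:.1f}GW")
```

Even before overhead, that works out to roughly 35GW of IT load, nearly ninety times Colossus 2's 400MW trajectory.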

As with all bold claims, execution risk is real.

Still, given Musk’s track record in this space alone, one would be forgiven for not wanting to bet against him.

Why this matters:

  • The scale and velocity put xAI far closer to hyperscaler territory than most startups, even if the “everyone combined” claim is unlikely.

  • A 2GW AI campus rivals the largest planned facilities from OpenAI, Microsoft, and Oracle.

  • Musk is committing capital, power, and industrial capacity at a rate few others can match, reshaping expectations for private AI-infra buildouts.

Nscale Locks in 40MW at WhiteFiber’s NC-1

Nscale has signed a 10-year, $865m agreement for 40MW of capacity at WhiteFiber’s NC-1 campus in Madison, North Carolina.

The site is a retrofit of a former Unifi manufacturing plant, now engineered for ultra-high-density AI loads (up to 150kW per rack) with a 99MW supply from Duke Energy. WhiteFiber has already invested $150m into the campus and expects to lock down debt financing in early Q1 2026. The first 20MW bills from April 2026, with the second 20MW in May. Nscale also gets first refusal on future capacity as NC-1 scales toward 200MW.
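The unit economics are worth a quick, hedged sketch. This is my simplification, not a figure from either company: it treats the full $865m as covering all 40MW evenly across the full 10 years and ignores the April/May ramp.

```python
# Rough unit economics of the Nscale/WhiteFiber agreement, under my
# simplifying assumption that the full contract value covers 40MW
# evenly across all 10 years (the real ramp starts at 20MW).
total_usd = 865_000_000
mw = 40
years = 10

per_mw_year = total_usd / (mw * years)  # implied $ per MW per year
print(f"~${per_mw_year / 1e6:.2f}m per MW per year")
```

That implies roughly $2.16m per MW per year, a useful yardstick when comparing against other long-term AI capacity deals.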

This is Nscale’s latest large anchor move after signing major capacity deals with Microsoft and OpenAI across Norway, Iceland, Portugal, and Texas.

WhiteFiber, meanwhile, is positioning NC-1 as the foundation of a broader US buildout.

Why this matters:

  • Nscale now has a multi-hundred-million-dollar foothold in the US, expanding beyond its European base.

  • WhiteFiber gains immediate commercial validation for its retrofit strategy, plus the leverage needed to close financing for NC-1 and additional sites.

  • All roads lead to the US for UK neoclouds, with first Fluidstack and now Nscale expanding directly on US soil.

Z.ai GLM-4.7 Enters Open Source SOTA Ring

Z.ai has released GLM-4.7, arguably the strongest non-proprietary model release of the quarter.

The headline gains land in coding and tool use: 73.8% on SWE-Bench Verified, 66.7% on SWE-Bench Multilingual, and a sizeable jump on Terminal Bench 2.0. UI generation also gets a lift, and the model introduces “Preserved Thinking,” a mechanism that keeps reasoning state across turns in long agentic workflows. Benchmarks across math, reasoning, coding, and agents put GLM-4.7 on a level with Claude Sonnet 4.5, DeepSeek V3.2, Gemini 3.0 Pro, and GPT-5-tier models, while remaining fully open under an MIT licence.

Why this matters:

  • Another SOTA open model reduces reliance on proprietary agents, especially for coding-heavy workloads.

  • GLM-4.7 strengthens the competitive pressure already created by DeepSeek’s recent releases, raising the floor for OSS model quality.

  • With Meta potentially pulling back on open releases, Zhipu is positioning itself as a credible successor in the open frontier tier.

AI Is About to Eat 20% of Global DRAM

AI workloads could soak up the equivalent of 20% of all DRAM wafer capacity in 2026.

According to a report from the Commercial Times, a Taiwanese publication, the crunch stems not from raw shipment volumes but from the outsized wafer footprint of HBM and GDDR7. Each GB of HBM eats 4x the wafer capacity of standard DRAM; GDDR7 eats 1.7x. Even modest HBM and GDDR7 shipments therefore translate into disproportionate strain on global fabs.

TrendForce puts total DRAM output at 40EB next year.

AI’s “equivalent usage” would take nearly a fifth of it, while DRAM production only grows 10-15% annually. That gap pushes shortages and price tension into DDR5 for PCs, smartphones, and servers.
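The multiplier math is worth making explicit. A sketch: the 4x and 1.7x factors come from the report above, but the shipment volumes below are illustrative placeholders I've chosen, not TrendForce or Commercial Times figures.

```python
# How HBM and GDDR7 shipments convert into standard-DRAM wafer
# equivalents. Multipliers are from the Commercial Times report; the
# shipment volumes below are illustrative placeholders, not real data.
HBM_MULT = 4.0     # 1GB of HBM consumes 4x the wafer area of plain DRAM
GDDR7_MULT = 1.7   # 1GB of GDDR7 consumes 1.7x

def wafer_equivalent_eb(hbm_eb, gddr7_eb, ddr_eb):
    """Total demand expressed in standard-DRAM-equivalent exabytes."""
    return hbm_eb * HBM_MULT + gddr7_eb * GDDR7_MULT + ddr_eb

# Illustrative: 1.5EB of HBM plus 1EB of GDDR7 "costs" the fabs as much
# wafer capacity as 7.7EB of plain DRAM -- about 19% of a 40EB year.
ai_equiv = wafer_equivalent_eb(hbm_eb=1.5, gddr7_eb=1.0, ddr_eb=0)
print(f"{ai_equiv:.1f}EB equivalent = {ai_equiv / 40:.0%} of 40EB output")
```

The point of the sketch: because of the multipliers, AI memory can consume a fifth of wafer capacity while representing a far smaller share of bytes shipped.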

Why this matters:

  • Memory, not compute, is becoming the hard cap on AI scale, with HBM dictating cluster deployment speed.

  • AI’s wafer pull tightens supply for DDR5, pushing up costs and constraining PCs, laptops, and smartphone upgrades.

  • 2026 could see consumer devices start to ship with smaller RAM configs and higher prices as fabs prioritise high-margin AI memory.

Google Buys Intersect Power for $4.75bn

Google is buying Intersect Power for $4.75bn, securing “several gigawatts” of in-development energy and data centre projects across the US.

Alphabet will own the full development pipeline, while Intersect’s existing operating assets in Texas and California stay outside the deal as a separate entity. Intersect will continue to run independently under CEO Sheldon Kimber, but will co-develop new sites with Google’s technical infrastructure team, including the colocated data centre and energy project already under construction in Haskell County, Texas.

Why this matters:

  • Hyperscalers are now buying developers outright to secure energy pipelines, not just signing PPAs.

  • This move also pressures Google’s rivals to secure their own upstream energy positions as grid queues stretch into the 2030s.

  • AI’s power needs are forcing ever more vertically integrated business models, from shovels in the dirt to models and silicon.

Fermi America Moves from IPO to Legal Investigation

Fermi America is facing its first real reckoning.

Robbins Geller has launched a securities investigation after the anchor tenant for Fermi’s proposed 11GW Amarillo campus walked away from a non-binding LOI worth around $150m. The deal expired when exclusivity lapsed, but confusion deepened after Fermi reportedly told Business Insider the tenant was Amazon, then later issued a public denial. Robbins Geller will now review whether investors were misled about the LOI or its role in Fermi’s financing plans.

Why this matters:

  • We’ve been tracking Fermi since June. It’s a bold vision, to say the least. This first major stress event, however, exposes how brittle pre-revenue multi-gigawatt projects can be when offtake evaporates.

  • Regardless of the outcome, a probe like this raises questions over Fermi’s disclosures and its ability to finance Project Matador without a contracted anchor.

  • If the project goes south, and given the number of opportunists we see piling into the space, this likely won’t be the last GW-scale project we see collapse. Think canary-in-the-coalmine.

The Rundown

What a way to finish the year.

So much has happened in these twelve short months that the landscape looks almost unrecognisable. The same can be said for The GPU.

What started as a side hustle has grown into something much bigger.

And none of that would be possible without you: the reader.

Thank you so much for your continued support over this year.

It means a huge amount!

To that end, I’ve been approached by a lot of people about launching a podcast next year, plus others interested in sponsoring one. If you could answer the question below, I’ll be able to gauge interest!

Would you listen to a podcast from "The GPU" in 2026?

I'd be looking to interview those at the sharp end of AI infrastructure. Neoclouds, data centres, hyperscalers, chip providers, etc. But only if my audience thinks it's worth it.
