Jensen’s GTC keynote laid it out: the future is inference-driven AI.
Tokens per second per megawatt. That’s the metric NVIDIA’s banking on with Blackwell Ultra and Rubin Ultra. And it’s not just about performance; it’s about profitability.
Meanwhile, CoreWeave’s IPO is seeking to raise up to $2.7 billion, OpenAI’s Stargate project is scaling up to house 400,000 Nvidia chips, and Oracle just pledged $5 billion to UK cloud infrastructure.
Oh, and Perplexity is chasing an $18 billion valuation.
I’m Ben Baldieri, and every week I break down the moves shaping GPU compute, AI infrastructure, and the data centres that power it all.
Let’s get into it.
Training is dead. Long live inference.
NVIDIA CEO Jensen Huang shared breakthroughs in AI, robotics, and accelerated computing at #GTC25.
See how NVIDIA and partners are tackling humanity's most pressing challenges. 🌎
Watch the keynote replay > nvda.ws/4iK9mvJ
— NVIDIA GTC (@NVIDIAGTC)
5:53 PM • Mar 20, 2025
Jensen Huang didn’t mention AI training once during his Nvidia GTC 2025 keynote. Instead, the focus was on AI factories, tokens/second/MW, and scaling inference workloads efficiently. And by efficiently, he means profitably.
Nvidia Dynamo is the key piece here, unifying inference workloads across entire clusters. Pair that with Nvidia’s tailored CUDA libraries for robotics, physics, finance, healthcare, automotive, and more, and you start to see where Nvidia’s heading. They’re not just playing nice with hyperscalers anymore; they’re going straight for enterprise customers.
Why this matters:
Enterprise adoption is still the elephant in the data centre, and up until now, it’s not been clear where the return would come from.
Industry-specific CUDA libraries optimised for Nvidia hardware change that, and make it easier for enterprises to adopt AI into their workflows.
The subsequent shift to inference means actual ROI for hardware owners and data centre operators.
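To make the tokens/second/MW metric concrete, here is a minimal Python sketch of the arithmetic. Every input is a hypothetical placeholder rather than a figure from the keynote, and the function names are my own.

```python
def tokens_per_second_per_mw(tokens_per_second: float, power_mw: float) -> float:
    """Throughput normalised by the power drawn to produce it."""
    if power_mw <= 0:
        raise ValueError("power_mw must be positive")
    return tokens_per_second / power_mw


def revenue_per_mw_hour(tokens_per_second: float, power_mw: float,
                        price_per_million_tokens: float) -> float:
    """Revenue earned per megawatt per hour at a given token price."""
    tokens_per_hour_per_mw = tokens_per_second_per_mw(tokens_per_second, power_mw) * 3600
    return tokens_per_hour_per_mw / 1_000_000 * price_per_million_tokens


# Hypothetical inputs, chosen only to illustrate the calculation.
cluster_tokens_per_second = 5_000_000  # assumed aggregate throughput
cluster_power_mw = 10.0                # assumed power draw in megawatts
price = 2.0                            # assumed USD per million tokens

efficiency = tokens_per_second_per_mw(cluster_tokens_per_second, cluster_power_mw)
revenue = revenue_per_mw_hour(cluster_tokens_per_second, cluster_power_mw, price)
print(f"{efficiency:,.0f} tokens/s/MW")
print(f"${revenue:,.2f} per MW-hour")
```

At a fixed token price, doubling tokens per megawatt doubles revenue per megawatt, which is why the metric reads as a profitability measure when power is the constrained input.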
The AI Infrastructure Partnership (AIP) just got even bigger.
I discussed with Larry Fink, Satya Nadella, and Jensen Huang our shared commitment to harnessing AI’s transformative potential to accelerate global growth and drive groundbreaking innovation.
With new partners joining the AI Infrastructure Partnership (AIP), we are expanding our
— Tahnoon Bin Zayed Al Nahyan (@hhtbzayed)
8:20 PM • Mar 20, 2025
NVIDIA and Elon Musk’s xAI have joined the heavyweight group of BlackRock, Global Infrastructure Partners (GIP), Microsoft, and MGX, aiming to drive a colossal $100 billion in AI infrastructure investment. They’re not just talking chips and servers. They’re building AI-ready data centres and energy infrastructure at a scale never attempted before.
The immediate goal?
Raising $30 billion from investors to kickstart the projects, with plans to leverage that into $100 billion through debt financing. With Nvidia acting as the technical advisor and MGX’s Sheikh Tahnoon bin Zayed Al Nahyan calling AI “the industry of the future,” the ambition here is clear:
Turn Abu Dhabi into a global AI powerhouse.
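For a sense of the leverage involved, a quick back-of-the-envelope in Python. It assumes the $30 billion raise counts towards the $100 billion total and that debt fills the entire gap; the partnership has not published a breakdown, so treat this as a sketch.

```python
# Figures from the announcement as reported above.
equity_raise_bn = 30.0   # initial raise from investors, USD billions
total_target_bn = 100.0  # target after debt financing, USD billions

# Assumption: everything above the equity raise is debt.
debt_bn = total_target_bn - equity_raise_bn
debt_to_equity = debt_bn / equity_raise_bn
debt_share = debt_bn / total_target_bn

print(f"Implied debt: ${debt_bn:.0f}B")
print(f"Debt-to-equity: {debt_to_equity:.2f}x")
print(f"Debt share of total: {debt_share:.0%}")
```

Under those assumptions, roughly $70 billion of the total would be borrowed, or a little over two dollars of debt for every dollar of equity.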
Why this matters:
Abu Dhabi is establishing itself as the AI capital of the Middle East, focusing on energy-efficient infrastructure at scale.
The partnership model is unprecedented, bringing together global investors, hyperscalers, and AI firms to form an ecosystem with massive financial firepower.
GE Vernova and NextEra Energy are also on board, bringing expertise in renewables, gas-fired plants, and nuclear. Without reliable power, all this AI hardware is just metal.
The IPO is happening.
CoreWeave and its investors are seeking to raise up to $2.7 billion in an initial public offering. The Nvidia-backed company is marketing the shares for $47 to $55 each.
Ann Berry, Threadneedle Ventures managing partner, discusses the potential IPO trib.al/RMUdoA2
— Bloomberg TV (@BloombergTV)
7:16 PM • Mar 21, 2025
CoreWeave has officially set terms for its IPO, seeking to raise up to $2.7 billion at $47 to $55 a share, and the games are about to begin. The AI Hyperscaler’s valuation could hit over $35 billion, making it one of the largest public AI infrastructure companies overnight. All eyes are on whether the hype translates into revenue, especially with Nvidia’s GTC announcements raising the stakes.
Why this matters:
$35 billion is a big ask for a neocloud betting it all on AI workloads.
If CoreWeave can deliver on revenue projections, it sets a new benchmark for AI infrastructure providers.
The success of this IPO is at least partially tied to Nvidia’s hardware dominance, and that could yet become a risk in the future.
Perplexity is pushing for an $18 billion valuation.
Perplexity is reportedly in talks to raise up to $1B at an $18B valuation
— TechCrunch (@TechCrunch)
7:18 PM • Mar 20, 2025
This is double its previous $9 billion figure from last November. But it’s not just the valuation that’s making waves. It's how they're planning to get there.
Perplexity is moving beyond search into inference at scale, and Nvidia Dynamo (mentioned above) is possibly at the heart of it.
Dynamo is Nvidia’s open-source inference-serving framework. It coordinates inference across many GPUs, including splitting the prefill and decode phases of a request across separate hardware, with the aim of making large-scale AI deployments cheaper, faster, and less energy-intensive. Given how many times Jensen mentioned Perplexity in the keynote, they’re likely leaning into this technology to sharpen their conversational AI tools and power their new web browser, Comet.
Why this matters:
Dynamo-optimised infrastructure could help Perplexity capture a major share of the AIaaS market.
With AI agents and applications moving beyond training to inference-heavy workloads, this is a direct shot at OpenAI’s dominance.
Multiple direct mentions from Jensen in the keynote are a strong nod of support.
Ultra was the name of the game at GTC 2025.
Introducing NVIDIA Blackwell Ultra — the next evolution of the #NVIDIABlackwell AI factory platform.
Blackwell Ultra sets a new standard in test-time scaling inference and training, paving the way for the age of AI reasoning. #GTC25 ➡️ nvda.ws/4iTBc8I
— NVIDIA Newsroom (@nvidianewsroom)
6:42 PM • Mar 18, 2025
Blackwell Ultra, Rubin Ultra, and even Feynman are now clearly visible on the horizon, due in the second half of ’25, the second half of ’27, and likely ’28, respectively. And the headline use case isn’t training. It’s all about inference and optimising performance per watt to reduce costs and boost deployment efficiency.
We’re going to need a lot of optimisation considering Rubin Ultra is a 600kW rack.
And that’s not all.
NVIDIA is also moving into silicon photonics, building co-packaged optics into its Spectrum-X and Quantum-X network switches to speed up the links between GPUs, and claiming 3.5x better power efficiency and 63x better signal integrity. The newly revealed DGX Spark and DGX Station personal AI supercomputers also make high-performance AI infrastructure accessible outside of the data centre.
Why this matters:
Jensen just put every other semiconductor manufacturer on notice, and Team Green is clearly steering the market towards more power = more better.
Whether the market can support such high densities is another matter.
These numbers confirm what many have known for a while: power is the bottleneck.
Oracle is throwing $5 billion at the UK cloud market.
Learn more about how we’re supporting the UK Government’s vision for an #AI-driven future: social.ora.cl/60140etHj #CloudWorld
— Oracle (@Oracle)
5:40 PM • Mar 17, 2025
The company plans to open multiple cloud regions across the UK, hoping to challenge the dominance of AWS, Azure, and GCP. This is Oracle’s most aggressive push yet to grab a piece of the enterprise AI market. There are even rumours they’ll bid for the AI growth zones, though it’s difficult to see how a US company could guarantee the sovereignty these zones require.
Why this matters:
Oracle is finally making a serious play to compete with AWS, Azure, and GCP, and $5B is a big deal for the UK.
Oracle’s existing enterprise relationships with UK customers could give it an edge in AI workloads.
This move may be too little, too late, as sovereignty and the geopolitical risks of US cloud hosting are top of mind for everyone in the UK right now.
OpenAI’s Stargate infrastructure project is set to house 400,000 Nvidia chips in Abilene, Texas.
The first data center complex for OpenAI’s $100 billion Stargate infrastructure venture will have space for as many as 400,000 of Nvidia’s powerful AI chips.
🔗💻🤖: bloomberg.com/news/articles/…
— Bloomberg Graphics (@BBGVisualData)
6:45 PM • Mar 18, 2025
The 1.2GW site developed by Crusoe aims to be one of the world’s largest AI clusters, with completion expected by mid-2026. Oracle has already committed to utilising the full capacity of the Abilene facility, and OpenAI plans to expand Stargate to 10 sites across the US.
The venture’s overall price tag?
A staggering $100 billion.
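To put the Abilene numbers in perspective, here is a rough power budget in Python. It assumes the full 1.2GW is spread across all 400,000 chips, so the per-chip figure bundles in cooling, networking, and every other overhead; the real split isn't public.

```python
site_power_mw = 1200  # 1.2GW site, as reported above
chip_count = 400_000  # maximum chip capacity, as reported above
rack_power_kw = 600   # Rubin Ultra rack figure quoted earlier in this issue

# All-in power per chip, including facility overhead (an assumption).
kw_per_chip = site_power_mw * 1000 / chip_count

# How many 600kW racks the same site power could feed, ignoring overhead.
equivalent_racks = site_power_mw * 1000 / rack_power_kw

print(f"{kw_per_chip:.1f} kW per chip, all-in")
print(f"{equivalent_racks:,.0f} racks at {rack_power_kw} kW each")
```

That works out to 3kW of site power per chip, or the equivalent of 2,000 racks at Rubin Ultra density.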
Why this matters:
If OpenAI can secure 10 sites of this magnitude, it will be one of the largest AI compute networks globally.
With competitors like xAI and Meta also building their own compute clusters, we can likely expect more gigascale clusters in the future.
GTC week is always a spectacle.
Rack densities climb ever higher, booth costs do the same, and everyone is left scratching their heads as to where we’re all going next. And that’s before Abu Dhabi builds an investment partnership like no other, neocloud IPO season begins, Perplexity doubles its valuation in less than six months, Oracle goes hard in the UK, and a gigascale Stargate takes shape in Texas.
GTC might feel like a cult, but right now, it’s the only one worth joining.
See you next week.
Good analysis isn’t free. And bad analysis? That’s usually paid for.
I want The GPU to stay sharp, independent, and free from corporate fluff. That means digging deeper, asking harder questions, and breaking down the world of GPU compute without a filter.
If you’ve found value in The GPU, consider upgrading to a paid subscription or supporting it below:
☕ Buy Me A Coffee → https://buymeacoffee.com/bbaldieri
It helps keep this newsletter unfiltered and worth reading.