The balance of power continues to shift.

Not from a single shock, but from a series of moves that all point in the same direction: long-standing advantages are no longer guaranteed. Some of the biggest players in AI are feeling the pressure for the first time, and the strain?

It’s starting to show.

I’m Ben Baldieri. Every week, I break down what’s moving in GPU compute, AI infrastructure, and the data centres that power it all.

Here’s what’s inside this week:

  • OpenAI hits ‘code red’ as Google gains

  • HSBC signs a multi-year AI deal with Mistral

  • Carbon3.ai launches the UK’s first ‘Private AI Lab’

  • DeepSeek V3.2 pushes hard into agent tooling

  • Micron exits consumer memory to pursue AI

  • AWS re:Invent lands a mountain of AI upgrades

  • Deutsche Telekom and Schwarz Group’s sovereign build

Let’s get into it.

The GPU Audio Companion Issue #77

Want the GPU breakdown without the reading? The Audio Companion does it for you, but only if you’re subscribed. If you can’t see it below, click here to fix that.

OpenAI Hits ‘Code Red’ as Google Gains

OpenAI has triggered a full “code red” inside the company.

According to the Wall Street Journal, Sam Altman has told teams to pause other work and focus solely on improving ChatGPT’s speed, reliability, and user experience. Google’s new Gemini 3 model has moved ahead of OpenAI on key benchmarks, and its user base is climbing fast. Anthropic is also closing the gap with Opus 4.5.

For the first time since 2022, OpenAI looks less like the default and more like one of several strong options.

The urgency is real.

Internally, Altman has pushed back work on advertising products, agent pilots, and the planned Pulse assistant. Teams are being shifted toward ChatGPT, and there’s now a daily call for the people responsible for improving the product.

The company says it will debut a new reasoning model next week that it believes is ahead of Google’s latest release.

But, per the WSJ, the real pressure is structural.

OpenAI is private, unprofitable, and dependent on continuous fundraising to support its data centre buildout. Google funds AI from core revenue. OpenAI burns capital. Google is training on its own chips, shipping models faster, and rolling them directly into core products. OpenAI needs ChatGPT to grow faster, retain users for longer, and convert more of its 800m weekly users into paying accounts.

Why this matters:

  • Google is proving its integrated model works: in-house chips, large-scale training, and rapid product deployment running in one loop.

  • OpenAI has to defend a leadership position from a weaker financial posture, with more reliance on external capital, and less control over the hardware stack.

  • Google’s momentum signals a larger shift (that’s echoed by the AWS re:Invent announcements below): the next phase of AI may reward platforms with full-stack control, not standalone labs fighting for margin and distribution.

HSBC Signs Deal with Mistral for Internal AI Rollout

HSBC has signed a multi-year partnership with Mistral to weave the French model maker’s systems into its global operations.

According to the HSBC announcement, Mistral will power HSBC’s productivity stack. That means drafting client communications, speeding up financial analysis, reducing paperwork in lending workflows, improving translation and reasoning across languages, and helping teams prototype and ship new processes faster. Future phases will then push deeper into customer-facing work, including onboarding, fraud checks, and credit processes.

Through this deal, HSBC gains a controllable model layer that meets its security and data-ownership requirements, and Mistral lands credibility as a viable alternative for large enterprises seeking frontier-level capability without ceding control.

But that’s not all from the French AI lab this week.

The company also rolled out its latest model, Mistral 3. The announcement shows “frontier class” benchmarks relative to the open-source MoE competition and deliberately highlights “custom model training services”. Read alongside the HSBC deal, that line hints at the direction of travel for 2026.

Why this matters:

  • Compliance and governance are hot-button issues for AI deployments in regulated industries like financial services. Open-source models are potentially a better fit here because they can give buyers more control over their stack, removing some of the proprietary, black-box issues inherent to closed-platform ecosystems.

  • Given HSBC’s status as a marquee customer, and the competition a bid like this must have attracted, Mistral’s model-licensing strategy is clearly resonating with institutions that need tight data governance and predictable costs.

  • If the project succeeds, and open-source models like Mistral 3 keep applying competitive pressure to their closed-source US counterparts, we could well see the rest of the industry tilt further in this direction.

Carbon3.ai Launches the UK’s First ‘Private AI Lab’

The UK now has a sovereign AI lab programme built to take enterprises from zero to a production-ready AI deployment in two weeks, and it’s free for those who qualify.

Carbon3.ai and HPE have launched the Private AI Lab. Per the announcement, it’s a fully funded, UK-hosted AI adoption track powered by renewable energy and built on HPE Private Cloud AI, co-developed with NVIDIA. The pitch is simple: stop enterprises from stalling at the proof-of-concept stage and give them a governed, compliant, production-grade use case on a tight timeline.

The offer includes a validated architecture, ROI modelling, governance templates, and a ready-to-run workflow built on sovereign infrastructure.

And to top things off, everything sits inside a UK-only, 100 percent off-grid power environment using NVIDIA AI systems and the NVIDIA AI Enterprise software stack.

Why this matters:

  • Carbon3.ai frames this as the missing bridge between the UK's world-class research output and its comparatively slow enterprise deployment.

  • At the same time, HPE’s involvement means enterprises get a real production platform, not just a startup sandbox.

  • Add zero cost, sovereignty, and renewable power, and you have a much lower-risk way for enterprises to get involved with AI in a meaningful way.

DeepSeek V3.2 Pushes Hard into Agent Tooling

DeepSeek has shipped V3.2 and V3.2-Speciale, its new reasoning-heavy models aimed straight at the agent stack.

V3.2 becomes the new daily driver: fast, efficient, and tuned for long-context work.

V3.2-Speciale is the high-compute variant built for hard reasoning, scoring gold-level results across IMO, CMO, IOI, and ICPC benchmarks. It runs API-only, with no tool calls, and is positioned as a direct challenger to top-end frontier models.

The headline upgrade is the way DeepSeek handles agent workflows.

A new agentic data-synthesis pipeline generates large-scale tool-use and environment-interaction data, letting the model “think inside” tool chains rather than bolting reasoning on afterwards.

V3.2 integrates this natively, including a revised chat template and improved long-context performance through DeepSeek Sparse Attention. Both models are open source, with full weights, Olympiad solutions, and technical reports released publicly.
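
To make “thinking inside the tool chain” concrete, here’s a minimal sketch of what one synthesized tool-use trajectory might look like. The schema, tool name, and fields are illustrative assumptions, not DeepSeek’s actual chat template.

```python
# Hypothetical synthesized agent trajectory for training.
# The schema, tool name, and fields are illustrative assumptions,
# not DeepSeek's actual chat template.
trajectory = [
    {"role": "system", "content": "You can call tools. Reason before each call."},
    {"role": "user", "content": "What is 23.4% of $8.1bn?"},
    # Reasoning is interleaved with the tool call, not bolted on afterwards:
    {
        "role": "assistant",
        "reasoning": "This needs an exact product, so I'll call the calculator.",
        "tool_call": {"name": "calculator", "arguments": {"expression": "0.234 * 8.1e9"}},
    },
    {"role": "tool", "name": "calculator", "content": "1895400000.0"},
    {"role": "assistant", "content": "23.4% of $8.1bn is about $1.9bn."},
]

# A synthesis pipeline generates trajectories like this at scale across
# simulated environments, so tool use is learned as part of the reasoning
# loop rather than appended after the fact.
```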

Why this matters:

  • DeepSeek keeps pressing its advantage in open-weight, high-reasoning systems at a time when the market is watching alternatives to the top US models.

  • Agent workloads are shifting toward long-running, tool-heavy pipelines, and V3.2 is tailored for that shift.

  • Open release of high-compute variants adds competitive pressure on both closed models and open frontier efforts, reinforcing the trend toward more capable open systems hitting production-grade use cases.

Micron Exits Consumer Memory to Pursue AI

Micron is pulling the plug on its consumer memory business and shifting all attention to HBM for AI data centres.

The company will wind down sales of its Crucial-branded consumer products through early 2026. The unit wasn’t a significant revenue driver, but cutting it clears the way for Micron to focus entirely on advanced memory, the one segment experiencing real scarcity and real margins.

And scarcity is the word.

HBM supply chains are tight across the board.

Samsung, SK hynix, and Micron are all running flat out. AI demand keeps pulling the market faster than anyone expected. Micron’s own HBM revenue hit nearly $2bn in the August quarter, putting it on an $8bn annualised run rate. With numbers like that, and no sign of the supply squeeze abating anytime soon, it’s hard to argue with the business case behind the shutdown.

Why this matters:

  • HBM is one of the most strategic components in the AI stack, with cost, availability, and packaging now directly shaping who can build frontier clusters.

  • With three suppliers battling for dominance, the move marks Micron’s clearest signal yet that it wants a bigger share of the AI memory pie, and is willing to sacrifice entire business lines to get it.

  • Crucial has long been a household name in the gaming space, a sector already under pressure from spiralling GPU costs, and this shutdown is likely to increase that pressure further. And just in time for Christmas. Spare a moment for anyone hoping for a new gaming rig.

AWS re:Invent Lands a Mountain of AI Upgrades

AWS turned re:Invent into a full-stack AI showcase this year.

Nova 2 Sonic arrives with speech-to-speech interaction across languages. Nova 2 Lite becomes the cheaper reasoning model for everyday workflows. Nova 2 Omni moves into multimodal terrain. And Nova Forge lets enterprises train their own frontier models directly on Amazon’s stack, handing them customisation without the usual cost and complexity.

But that’s not all.

Amazon also pushed deeper into agent automation.

Nova Act now ships at over 90 percent reliability for UI-based task automation. Bedrock’s AgentCore gains stronger policy controls. S3 Vectors hits GA with indexes that scale to two billion vectors at much lower cost. And SageMaker adds checkpointless and elastic training to help labs scale without interruptions.
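
For a sense of what the S3 Vectors surface looks like in practice, here’s a minimal sketch using boto3’s s3vectors client, based on the preview-era API. Parameter names and response shapes are assumptions that may differ slightly at GA, and the bucket and index names are placeholders.

```python
import boto3

# S3 Vectors uses its own service client, separate from the standard "s3" client.
# Request/response shapes below follow the preview-era surface and may differ at GA.
client = boto3.client("s3vectors", region_name="us-east-1")

# Write a small batch of embeddings into a vector index (names are placeholders).
client.put_vectors(
    vectorBucketName="my-vector-bucket",
    indexName="docs-index",
    vectors=[
        {
            "key": "doc-001",
            "data": {"float32": [0.12, -0.48, 0.33, 0.91]},
            "metadata": {"source": "q3-report.pdf"},
        }
    ],
)

# Nearest-neighbour query against the same index.
response = client.query_vectors(
    vectorBucketName="my-vector-bucket",
    indexName="docs-index",
    queryVector={"float32": [0.10, -0.50, 0.30, 0.90]},
    topK=5,
    returnMetadata=True,
)
for match in response["vectors"]:
    print(match["key"], match.get("metadata"))
```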

And below the model layer?

AWS pushed forward with custom silicon and infrastructure.

Graviton5 steps in as the new general-purpose CPU. Trainium3 UltraServers move to 3nm and target cheaper, faster training. And Lambda Managed Instances merge serverless ergonomics with EC2-level control.

The message is simple:

Amazon wants to be the enterprise AI operating system, from chips to models to managed agents.

Why this matters:

  • AWS is leaning into its advantage: a full-stack platform where hardware, models, and orchestration live under one roof.

  • Nova Forge is the clearest sign yet of a push towards custom frontier models becoming mainstream inside large enterprises.

  • Trainium3 and Graviton5 continue Amazon’s gradual push to undercut NVIDIA, especially for cost-sensitive training.

Deutsche Telekom & Schwarz Group’s Sovereign Build

Germany wants a flagship AI data centre on its own soil, and two heavyweights are stepping up.

Brookfield is reportedly backing the effort, adding financial muscle to what could become one of Europe’s largest AI infrastructure projects.

Deutsche Telekom has already signalled its intent to see Germany take a leading position in the EU’s AI Gigafactory programme, and this push follows its recent collaboration with NVIDIA on an “industrial AI cloud” in Munich.

Schwarz is already deep in the game too. Through StackIT, the group is building out hardened cloud and AI capacity for European enterprises, and it recently broke ground on a 200MW facility in Lübbenau. An EU-funded site would be another step toward a sovereign AI footprint that doesn’t rely on US hyperscalers.

Why this matters:

  • Europe wants home-grown AI supercentres, and Germany is making a play to anchor one of the first.

  • Europe also lacks a counterweight to US neocloud dominance, especially after the recent sale of Northern Data to Rumble.

  • This announcement, viewed in the context of Brookfield’s recent launch of Radiant, could therefore be part of the global investment firm’s broader European strategy in a market now ripe for the taking.

The Rundown

Control over the stack is becoming the defining advantage in AI, and this week made that impossible to ignore.

OpenAI hit “code red” because it doesn’t control its stack. Google surged ahead because it does. AWS doubled down with a barrage of full-stack releases to keep its lead. Micron walked away from consumer memory to focus on HBM, the layer that now decides who can even build frontier clusters. Carbon3.ai launched a sovereign lab that gives enterprises controlled compute on day one. DeepSeek kept showing that open-weight doesn’t mean powerless; if you own your training pipeline and release cycle, you still shape the direction of travel. And Germany’s gigafactory bid is a straight attempt to pull strategic infrastructure back onto home soil.

All of it points one way:

Stack control is shifting from advantage to moat.

And the split between those who have it and those still renting it is starting to show.

See you next week.
