Latency kills experiences.

A single spike can turn gamers into rage-quitters. Finance feeds freeze. Remote surgery stalls.

In 2014, a founder obsessed with zero lag wired together his own backbone to fix this problem.

He strung it from Tokyo to Buenos Aires, Frankfurt to Johannesburg, because incumbent CDNs couldn’t keep up.

The team successfully killed gaming lag.

And today, they’re already several years into the next battle.

Vector search. Vision pipelines. Frontier model development. LLM inference. Real-time AI.

These workloads demand sub-30 ms tail latency and robust cloud infrastructure.

And this company is in prime position to meet that demand.

Who are they?


Company Background

A decade ago, most networks were built to push static web pages. 

User demands were low. Latency was an afterthought, barely on anyone’s radar. Network performance was generally sufficient, and sufficient was enough.

But then came the online gaming revolution.

And where standard netizens were forgiving, online gamers were not. 

Static web pages yield consistent network demands. Online gaming is the polar opposite: a few milliseconds of delay is often the difference between victory and defeat. And when latency spikes and players don’t get what they want? They rage-quit, and that’s bad for everyone.

The market need was clear: game developers and publishers needed high-performance, low-latency infrastructure on a global scale.

The founding team understood the needs and desires of this highly demanding new user group on a far deeper level than the incumbents. They knew competitor CDNs wouldn’t be able to keep up. So they began by wiring together their own backbone, London to Mumbai, Seattle to Dubai.

The project was called Gcore: Gaming at the Core.

Fast-forward.

The same fabric now stretches to 210 PoPs on six continents. Shifts 200 Tbps of traffic. Peers with 14,000 networks. Averages a 30 ms round-trip time. All delivered by 550 people across eight offices, working on one mission: kill latency before it kills the experience. It turns out that such a singular focus goes far beyond gaming.

Gcore uses the same network to deliver low-latency solutions to finance, education, retail, healthcare, telecom, large enterprises, governments, and SMEs worldwide.

From Gaming to AI

Foundation models. Fraud detection. Real-time vision systems. Every new AI workload is an online gaming problem in disguise: enormous east-west traffic, ruthless tail-latency targets, and users who abandon the service the moment the experience drags.
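To make “ruthless tail-latency targets” concrete: averages hide the pain, because the requests that drive users away live in the top percentile. A minimal sketch, with illustrative numbers rather than Gcore measurements:

```python
import random

def p99(samples: list[float]) -> float:
    """Return the 99th-percentile (nearest-rank) of latency samples."""
    ordered = sorted(samples)
    return ordered[int(len(ordered) * 0.99) - 1]

# Simulate 10,000 request latencies: most are fast, one in a hundred spikes.
latencies_ms = (
    [random.gauss(12, 3) for _ in range(9_900)]   # the happy majority
    + [random.gauss(80, 20) for _ in range(100)]  # the spiky 1%
)

print(f"mean: {sum(latencies_ms) / len(latencies_ms):.1f} ms")  # looks healthy
print(f"p99:  {p99(latencies_ms):.1f} ms")                      # tells the truth
```

The mean stays comfortably low even while one request in a hundred blows past any real-time budget, and that one request is the one the user remembers.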

So Gcore leaned in.

Instead of bolting expensive GPUs onto a legacy CDN, they folded compute into the fabric itself.

Edge AI was born: bleeding-edge NVIDIA GPU clusters sitting inside the existing PoPs. InfiniBand-linked where the densities justify it. Ethernet-optimised everywhere else.

And to tie it all together, three-click AI deployment with Everywhere Inference, which lets a developer drop a model into production, see it deploy within 10 seconds, and have it execute within 30 ms for any user on the planet.
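The piece doesn’t spell out the calls behind those three clicks, but the workflow has a familiar shape: register a model, wait for it to go live, hit the endpoint. A hypothetical sketch, where the base URL, paths, and payload fields are all assumptions rather than Gcore’s documented API:

```python
import time
import requests

BASE = "https://api.inference.example.com/v1"  # hypothetical endpoint
HEADERS = {"Authorization": "Bearer <API_KEY>"}

# 1. Register a model for edge-wide deployment (hypothetical schema).
deploy = requests.post(f"{BASE}/deployments", headers=HEADERS, json={
    "model": "my-org/sentiment-classifier",
    "regions": "auto",  # let the platform choose the nearest PoPs
}).json()

# 2. Poll until it is live (the article cites roughly 10 seconds).
while deploy.get("status") != "ready":
    time.sleep(1)
    deploy = requests.get(f"{BASE}/deployments/{deploy['id']}",
                          headers=HEADERS).json()

# 3. Call the model; routing to the nearest PoP is the platform's job.
t0 = time.perf_counter()
result = requests.post(f"{BASE}/deployments/{deploy['id']}/infer",
                       headers=HEADERS,
                       json={"input": "low latency is a feature"}).json()
print(result, f"({(time.perf_counter() - t0) * 1e3:.0f} ms round trip)")
```

The design point: steps 1 and 2 happen once, while step 3 runs millions of times from wherever the users are, which is why the 30 ms figure is a property of the network rather than of the code.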

Training, fine-tuning, and deployment happen in the same network, in the same UI. 

Need burst capacity for a week-long run? Spin up nodes in Incheon and Helsinki. How about sovereign boundaries for sensitive healthcare data? Pin the workload to Chester or Sines, and let the policy engine enforce locality. Security? The DDoS shield that protects MMO launch days now also shields AI endpoints, scrubbing attacks before they reach the model.
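What might “let the policy engine enforce locality” look like? One plausible shape, with every name and region invented for illustration, is a placement check that runs before any scheduling decision:

```python
# Hypothetical sovereignty policies; not Gcore's actual schema or region names.
SOVEREIGN_POLICIES = {
    "uk-health-data": {"allowed_regions": {"chester"}},            # UK-only
    "eu-health-data": {"allowed_regions": {"sines", "helsinki"}},  # EU-only
}

def placement_ok(policy_name: str, target_region: str) -> bool:
    """Reject any placement that would move a workload out of bounds."""
    policy = SOVEREIGN_POLICIES[policy_name]
    return target_region in policy["allowed_regions"]

assert placement_ok("uk-health-data", "chester")
assert not placement_ok("uk-health-data", "incheon")  # burst capacity, but out of bounds
```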

It’s a proven playbook, written for gamers, rewritten for AI.

The “G” in Gcore, therefore, no longer stands just for gaming.

It stands for Global, GPU-ready, and Guaranteed speed. Because real-time AI is the killer app. And it demands a network built to win latency wars from day one.

What’s Next?

Gcore’s buildout for the remainder of 2025 and beyond is accelerating, and it’s moving north, south, and straight to the edge. 

The Chester cluster, with 2,000 H200 GPUs and NVIDIA’s latest BlueField-3 DPUs, has just gone live. This brings much-needed sovereign UK capacity online just as demand for local GenAI fine-tuning begins to spike. Coupled with the recently deployed Sines-2 and Sines-3 clusters in Portugal, Gcore is well-positioned to serve European customers at a time when the EU AI Act will alter market dynamics.

Perfect for teams that need compliant infrastructure today. 

But those sites are only the front door to a broader platform shift. 

Gcore will soon offer customers a hyperscaler-grade AI experience, both on-premises and in the cloud, with three-click deployment for both training and inference. Still serverless, still simple, but now bundling JupyterLab workspaces, Slurm schedulers, and Kubeflow pipelines alongside the effortless deployment of Everywhere Inference. It’s the same UI and the same PoP footprint, now covering training as well as inference.
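On the training side, a bundled Slurm scheduler implies the standard batch-job workflow. A generic sketch of what submitting a fine-tuning run could look like, where the partition name, GPU count, and train.py script are assumptions about the cluster rather than Gcore specifics:

```python
import subprocess
import textwrap

# A generic Slurm batch script; partition and resource names are
# assumptions about the cluster, not Gcore-specific values.
job_script = textwrap.dedent("""\
    #!/bin/bash
    #SBATCH --job-name=finetune
    #SBATCH --partition=gpu       # hypothetical partition name
    #SBATCH --gres=gpu:8          # request 8 GPUs on one node
    #SBATCH --time=02:00:00
    srun python train.py --config finetune.yaml
""")

with open("finetune.sbatch", "w") as f:
    f.write(job_script)

# sbatch is Slurm's standard submission command; it queues the job
# and prints the assigned job ID.
subprocess.run(["sbatch", "finetune.sbatch"], check=True)
```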

Underpinning it all is the vertically integrated stack born of the Gcore-Northern Data partnership:

The Intelligence Delivery Network (IDN).

The project, announced at GTC San Jose 2025, launches imminently. The idea is simple: tokenised (pre-optimised, ready-to-serve) models, deployed via Everywhere Inference’s three-click serverless workflow and running on thousands of Northern Data GPUs spread across multiple locations, to ensure the best possible latency for real-time AI. It will be the first service of its kind delivered by a European player. Couple that with Northern Data’s option to acquire a majority stake in Gcore, and there’s potential for much deeper integration, both up and down the stack.

Data centres. Hardware. Network. Software. Diversified revenue streams. A broader client base. Massive scale. 

All working together to protect margins while speeding future PoP roll-outs, engineering a turnkey backbone for enterprises that demand secure, high-performance AI from Europe to the Americas, Africa to APAC. 

And with Prof. Dr. Feiyu Xu and Dr. Philipp Roesler joining the board to supercharge R&D, and thousands of GPUs already online, the strategy is clear:

Lock down GPU supply, bolster Europe’s sovereign compute capacity, and double down on AI innovation.

Then?

Turn Everywhere Inference into the compliant intelligence layer for Europe’s sovereign AI boom.

And if the next wave of AI really does come down to whoever ships a token in ≤ 30 ms, Gcore already has its hand on the latency dial.

The only real question is whether the rest of Europe’s AI stack can keep pace before users feel the lag.
