Issue #12: The Lambda Labs Profile
Delivering the Lowest-Cost AI Inference, Anywhere

Right now, the industry optimises for training.
Neoclouds fight over access to GPUs for massive training clusters while targeting a small group of customers: AI labs and startups. A frontier lab’s edge is access to the latest and greatest hardware, and they’re willing to pay top dollar to purpose-built AI clouds with deep in-house expertise and a niche focus.
That means these contracts are incredibly lucrative.
But they’re also a one-time thing.
Once those models get built and the contract ends, the economics shift. The model is trained, and it’s time to use it. This is inference.
Then it’s no longer about $/GPU/h charges but about the cost per million tokens.
Think of the Xerox copier model, where customers paid per copy rather than for the machine, updated for the AI era’s public endpoints.
The inference market is small now, but it’s starting to grow. And as adoption accelerates, inference will be where the real money is made. Enterprises embedding AI into their workflows and fine-tuning models on their data, startups rolling out AI-driven products, and platforms serving real-time LLMs all need inference at scale at the lowest possible cost.
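To see why the unit economics flip, here’s a rough back-of-the-envelope conversion from a GPU-hour rental price to an effective per-token price. Every number in it is an illustrative assumption, not a Lambda figure:

```python
# Back-of-the-envelope: convert a $/GPU-hour rental price into an
# effective $/1M-token serving cost. All numbers are illustrative.

gpu_hour_cost = 2.50        # assumed on-demand $/GPU-hour
tokens_per_second = 2_000   # assumed sustained throughput per GPU
utilisation = 0.60          # assumed fraction of each hour doing useful work

tokens_per_hour = tokens_per_second * 3600 * utilisation
cost_per_million_tokens = gpu_hour_cost / (tokens_per_hour / 1_000_000)

print(f"Effective cost: ${cost_per_million_tokens:.2f} per 1M tokens")
# ~$0.58 per 1M tokens with these assumptions. Squeezing more throughput
# or utilisation out of the same GPU is how a provider undercuts on price.
```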
One company is positioning itself for this coming wave.
Their bet? Make inference cheap and fast. Cheaper and faster than anyone else.
If they’re right, it won’t be model training that defines the next phase of AI cloud competition.
It’ll be who delivers the lowest cost per query.
Welcome to Lambda Labs.
The GPU Audio Companion Issue #12
Want the GPU breakdown without the reading? The Audio Companion does it for you—but only if you’re subscribed. Fix that here.
Company Background
Lambda Labs didn’t jump on the AI bandwagon. They built it.
Founded in 2012, Lambda started with a simple idea: make AI infrastructure cheaper. After experiencing sky-high AWS bills running AI models, they realised there had to be a better way.
Instead of fighting hyperscalers on their terms, Lambda focused exclusively on GPUs.
They started out doing deep learning work themselves and, realising they needed more compute to work effectively, moved on to building workstations. The deep learning community caught wind of this, and it wasn’t long before Lambda was selling those same workstations or making them accessible through an API. From there it was expansion into dedicated AI servers, private cloud deployments, and, eventually, their own AI cloud platform.
Fast-forward to 2024, and Lambda Labs has raised hundreds of millions of dollars to expand its cloud footprint. Its infrastructure spans colocation data centres in Utah, San Francisco and Texas, plus select countries like Australia and Japan, and it plans to expand rapidly in 2025. While many GPU cloud providers chase AI training workloads, Lambda sees the bigger picture.
They’re investing in inference, betting that cost-efficient AI deployment will become the real business model for the next decade.
True believers: Lambda Labs’ AI cloud dreams dlvr.it/TFCG9s
— DCD (@dcdnews)
3:46 PM • Oct 10, 2024
Executive Team
Lambda’s leadership is a mix of AI researchers, cloud veterans, and product experts:
Stephen Balaban (CEO & Co-founder): Built Lambda from a niche AI hardware company into a full-scale cloud provider.
Michael Balaban (Co-founder & CTO): Manages Lambda’s cloud infrastructure and platform engineering.
Peter Seibold (CFO): Oversees finance and critical process improvements.
Robert Brookes IV (VP of Revenue): Oversees sales and marketing for Lambda.
Thomas Bordes (Head of Marketing): Gets Lambda in front of the right people in the right way at the right time.
The Edge
Lambda’s competitive advantage comes down to cost, flexibility, and full-stack AI solutions.
One-Click Clusters: Developers can launch fully optimised, multi-node (16-512 GPUs) AI clusters with InfiniBand networking in a single click. That means no dealing with custom instance types or complex networking setups. Just GPUs, optimised for AI, running immediately, and rentable for terms as short as one week.
On-Demand & Private Cloud: Lambda offers on-demand AI cloud for startups and enterprises that need flexible compute. For companies needing dedicated resources, Lambda’s private cloud solution delivers fully managed AI infrastructure - either in Lambda’s data centres or on-prem.
Lowest-Cost AI Inference: Lambda’s inference API undercuts the market, delivering 5x cost savings over AWS and other hyperscalers. They do this through custom-built inference hardware stacks optimised for cost and latency, direct API access (see the sketch after this list), and pre-tuned AI models to reduce deployment complexity.
AI Workstations & Professional Services: Lambda sells the entire AI stack. From high-performance workstations for local model development to fully managed enterprise deployments, Lambda handles everything from R&D to production AI deployment. For AI-first enterprises, Lambda’s professional services team delivers custom AI infrastructure design, enterprise-scale AI cloud migration, and model training and optimisation consulting.
Tired of infrastructure drama when deploying AI? 🙄 Say hello to the Lambda Inference API! Effortless scaling, wallet-friendly pricing, no hidden fees and no rate limits!
Built for devs who want results, not headaches. What will you build with it?
— Lambda (@LambdaAPI)
12:15 AM • Dec 13, 2024
Recent Moves
$500M Funding Round: Lambda raised $500M in April 2024 to expand its AI cloud, and was reportedly hunting for “another $800m” in July.
Inference-as-a-Service API Launch: Lambda introduced its low-cost inference platform, targeting AI-native businesses looking to reduce inference costs dramatically.
Pegatron Partnership for GB200 NVL72 Racks: Lambda partnered with Pegatron to deploy NVIDIA GB200 NVL72 rack systems, expanding its AI training and inference capabilities with top-tier GPU clusters.
AI Research Grants Program Expansion: Lambda increased funding for its AI research grants program, offering free cloud credits to researchers and startups to accelerate machine learning innovation.
New Invoice & Billing Portal: Lambda rolled out a self-serve invoice system to simplify enterprise billing and procurement processes.
COO Departure: Lambda’s COO left to head Positron, a startup developing alternative AI hardware to compete with NVIDIA.
Industries will be disrupted. You will be the disruptor. Pegatron just delivered their first NVIDIA NVL72 GB200 rack to Lambda 🔥
— Lambda (@LambdaAPI)
2:15 PM • Jan 16, 2025
Check out the Lambda Deep Learning Blog for the latest updates.
What’s Next?
Lambda’s betting that AI inference will become bigger than AI training.
Training models costs billions, but running them at scale will generate trillions. The market needs cheaper, faster inference infrastructure, and Lambda wants to own that space before hyperscalers react. With $500M in fresh funding, next-gen GH200 & GB200 hardware, and an aggressive push into inference APIs, Lambda is positioning itself as the most cost-effective AI cloud provider for startups and enterprises alike.
But the real question?
Will optimised pricing alone be enough to compete with the hyperscalers? Or will the market expect more?
One thing’s certain: inference is the future of AI infrastructure.
And Lambda plans to be the company powering it.
Keep The GPU Sharp and Independent
Good analysis isn’t free. And bad analysis? That’s usually paid for.
I want The GPU to stay sharp, independent, and free from corporate fluff. That means digging deeper, asking harder questions, and breaking down the world of GPU compute without a filter.
If you’ve found value in The GPU, consider upgrading to a paid subscription or supporting it below:
☕ Buy Me A Coffee → https://buymeacoffee.com/bbaldieri
It helps keep this newsletter unfiltered and worth reading.