Issue #29: The Open-Source AI Acceleration Cloud with Together AI
Open, transparent AI that you can take with you wherever you go.

The story of cloud is typically a story of vendor lock-in.
Enticing credit programs. “Transparent” pricing. Complex deployments.
Once you’re in, good luck getting out.
This company doesn’t play that game.
They’re going all-in on open source. No walled gardens, no sneaky pricing. Just open, transparent AI that you can take with you wherever you go. And they’re building the ultimate developer-friendly, purpose-built AI cloud.
One that’s research-driven, massively scalable, and remarkably cost-efficient.
And it’s not just marketing fluff.
Their inference engine outperforms the equivalent offerings on AWS, Azure, and GCP, comes in 11x cheaper than GPT-4, and runs 4x faster than vLLM.
That’s not just good. It's a challenge.
Who are they?
Welcome to Together AI.
The GPU Audio Companion Issue #29
Want the GPU breakdown without the reading? The Audio Companion does it for you—but only if you’re subscribed. Fix that here.
Company Background
Together AI isn’t your typical cloud platform.
It’s “The AI Acceleration Cloud”, built by a team of Stanford AI researchers and the group that scaled Apple’s Siri from the ground up.
The result? Full optimisation of every layer of the modern AI stack: hardware, software, and services.
They’re not just building tools either. They’re building the underlying infrastructure that will power the next wave of open-source innovation.
Their mission?
Make AI open, transparent, and community-driven.
Models. Data. Control. All in the hands of the user.
This is a serious deviation from the usual story of vendor lock-in, FinOps, and opaque end-of-month bills.
One that makes sense when you consider Together AI’s core principles:
Open Source First: No lock-in, full transparency, and community collaboration.
Developer Empowerment: Your models are yours. Train, fine-tune, and deploy without worrying about vendor lock-in.
Cost Efficiency: Together’s inference is 11x cheaper than GPT-4 while delivering the same level of performance.
Performance Matters: Optimised inference with FlashAttention-3, custom-built optimised kernels, and advanced speculative decoding (a toy sketch of speculative decoding follows this list).
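To make that last point concrete, here is a toy sketch of speculative decoding. This is not Together AI's implementation: the vocabulary size, probability tables, and function names are made up for illustration. The idea is that a cheap draft model proposes tokens and the expensive target model only verifies them, with an accept/resample rule that keeps the output distributed exactly like the target model.

```python
# Toy sketch of speculative decoding (illustrative only, not Together AI's code).
import numpy as np

VOCAB = 8                          # toy vocabulary size
rng = np.random.default_rng(0)

def toy_model(seed: int) -> np.ndarray:
    """Return a fake next-token distribution (stand-in for a real LLM)."""
    p = np.abs(np.random.default_rng(seed).normal(size=VOCAB))
    return p / p.sum()

target_p = toy_model(1)            # expensive, high-quality model
draft_p = toy_model(2)             # cheap, approximate draft model

def speculative_step() -> int:
    """Propose a token with the draft model, verify it with the target model."""
    x = rng.choice(VOCAB, p=draft_p)                    # draft proposal
    accept_prob = min(1.0, target_p[x] / draft_p[x])    # acceptance test
    if rng.random() < accept_prob:
        return int(x)                                   # accepted: draft token kept
    # Rejected: resample from the "leftover" distribution max(0, p - q),
    # which preserves the target model's overall output distribution.
    residual = np.maximum(target_p - draft_p, 0.0)
    residual /= residual.sum()
    return int(rng.choice(VOCAB, p=residual))

print([speculative_step() for _ in range(5)])
```

Accepted draft tokens skip most of the heavy lifting of the large model, which is where the latency savings come from; FlashAttention-3 and custom kernels then attack the cost of the work that remains.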
Executive Team
Together AI is led by a team of AI visionaries and engineering experts:
Vipul Ved Prakash (Founder & CEO): Serial entrepreneur with a knack for building transformative tech.
Ce Zhang (Founder & CTO): AI researcher and core systems expert, leading product and engineering.
Chris Ré (Founder): Stanford professor, pioneer in machine learning and data processing.
Percy Liang (Founder): LLM specialist, pushing the boundaries of generative AI.
Tri Dao (Founding Chief Scientist): Developer of cutting-edge model architectures and training methods.
Kai Mak (Chief Revenue Officer): Commercial strategy and growth mastermind.
Rajan Sheth (Chief Marketing Officer, Interim)
Ryan Pollock (Director of Product Marketing): Communicating Together AI’s value proposition to the market.
Jamie de Guerre (Founding SVP Product): Product vision and strategy lead.
Nicolette Lea (Director People & Ops): Building the team and culture.
Arielle Fidel (VP Sales and BD): Scaling partnerships and enterprise deals.
Sarung Tripathi (VP, Customer Experience): Keeping users happy and engaged.
Charles Srisuwananukorn (Founding VP Engineering): Leading the technical execution.
Leon Song (VP of Research): Pushing research boundaries in AI and ML.
Albert Meixner (SVP of Engineering): Overseeing cloud and infrastructure development.
The Edge
Together AI’s edge comes from one simple fact: they do it all themselves.
End-to-End Control: From custom-built inference engines to fully managed AI clusters.
Open-Source Dominance: 200+ models at your fingertips. No walled gardens, no lock-in.
Flexible Deployment: Serverless endpoints, dedicated instances, and enterprise setups (see the serverless sketch after this list).
Transparent Pricing: No hidden fees, no surprises. Pay for what you use, scale as needed.
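As a concrete example of that flexibility, here is a minimal sketch of calling a serverless endpoint. It assumes Together AI's OpenAI-compatible API at api.together.xyz/v1; the model name is illustrative, so check the current model catalogue before copying it.

```python
# Minimal sketch: calling a Together AI serverless endpoint through the
# standard OpenAI Python client. Base URL, key placeholder, and model name
# are illustrative assumptions; check Together AI's docs for current values.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.together.xyz/v1",   # OpenAI-compatible endpoint
    api_key="YOUR_TOGETHER_API_KEY",
)

response = client.chat.completions.create(
    model="meta-llama/Llama-3.3-70B-Instruct-Turbo",  # one of the 200+ open models
    messages=[{"role": "user", "content": "Why does open AI infrastructure matter?"}],
)

print(response.choices[0].message.content)
```

Because the interface is the de facto standard, the same snippet works against a dedicated instance, or against another provider entirely, just by swapping the base URL. That is what "no lock-in" means in practice.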
Recent Moves
Together GPU Clusters accelerated by NVIDIA Blackwell platform: Unveiled the deployment of NVIDIA Blackwell GPUs, launching their new Instant GPU Clusters that deliver up to 64 NVIDIA GPUs per deployment, entirely self-service.
NVIDIA Cloud Partner Status: Together AI joined the NVIDIA Cloud Partner Network, unlocking early access to Blackwell, and enabling the deployment of a 36,000 GPU cluster featuring the GB200 NVL72s, backed by 200MW+ of data centre capacity.
$305 Million Series B Funding: Together AI raised $305 million in a Series B round led by General Catalyst and Prosperity7, with participation from Salesforce Ventures, NVIDIA, DAMAC Capital, Kleiner Perkins, and more.
DeepSeek-R1 and Reasoning Clusters: Ranked amongst the fastest serverless API providers for the DeepSeek-R1 model, as measured by Artificial Analysis, and recently launched Reasoning Clusters, which deliver dedicated infrastructure for reasoning-model inference at scale.
Read more about Together AI’s recent moves on their blog.
What’s Next?
Together AI is a statement against closed, proprietary ecosystems.
And with plans to massively scale NVIDIA GB200 NVL72 and HGX B200 capacity, and to serve upcoming reasoning models, including the anticipated Llama 4, atop the NVIDIA Blackwell platform, they’re doubling down on their commitment to accelerating AI with open-source innovation.
Why?
Together AI wants to provide every business and developer with a fully optimised platform for building and running their own AI with complete, end-to-end control.
All with no strings attached, no lock-in.
Just pure, unadulterated AI training and inference power, at a fraction of the cost of the hyperscalers.
Because why settle for walled gardens when you can own the whole field?
Keep The GPU Sharp and Independent
Good analysis isn’t free. And bad analysis? That’s usually paid for.
I want The GPU to stay sharp, independent, and free from corporate fluff. That means digging deeper, asking harder questions, and breaking down the world of GPU compute without a filter.
If you’ve found value in The GPU, consider upgrading to a paid subscription or supporting it below:
☕ Buy Me A Coffee → https://buymeacoffee.com/bbaldieri
It helps keep this newsletter unfiltered and worth reading.