Discussion about this post

John Allard

The most fun part about working at a place like OpenAI is seeing how wrong random Substackers are when they make claims about training and serving frontier models. The worst part about working at OpenAI is not being able to correct people without divulging inside knowledge.

I’ll say this: frontier LLM providers are running fantastic businesses. Your math is very wrong.

Christopher Toth

This analysis doesn't pass the smell test. They claim 90% subsidization based on a "true cost" of $6.37 per million tokens, but inference providers like Together AI and Fireworks profitably serve 70B models at $0.90 to $2.00 per million.

The math assumes pathetically low throughput (1,848 tokens/sec on 8x H200s) and uses cloud pricing instead of actual hardware costs. Modern serving stacks with continuous batching and quantization achieve much higher utilization.
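
To make the dispute concrete, here is a rough back-of-envelope sketch of serving cost per million tokens. The $3.50/GPU-hour rental rate and the 20,000 tokens/sec batched-throughput figure are illustrative assumptions, not numbers from the post or from any provider; the point is only that the throughput assumption dominates the result.

```python
# Back-of-envelope: cost per million output tokens for an 8-GPU node.
# All inputs below are illustrative assumptions, not measured numbers.

def cost_per_million_tokens(gpu_hourly_usd: float, num_gpus: int,
                            tokens_per_sec: float) -> float:
    """Node cost per hour divided by millions of tokens produced per hour."""
    node_cost_per_hour = gpu_hourly_usd * num_gpus
    tokens_per_hour = tokens_per_sec * 3600
    return node_cost_per_hour / (tokens_per_hour / 1e6)

# With the post's assumed throughput on 8x H200s:
print(cost_per_million_tokens(3.50, 8, 1_848))    # ~$4.21 per million tokens
# With a batched serving stack (continuous batching + quantization), assumed:
print(cost_per_million_tokens(3.50, 8, 20_000))   # ~$0.39 per million tokens
```

A roughly 10x difference in assumed throughput swings the "true cost" by 10x, which is the crux of the commenter's objection.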

The timing is suspicious: OpenAI just dropped o3 prices 80% last week. If they were already subsidizing 90% (selling at 10% of cost), the cut would put them at 2% of cost, a 98% subsidy, which is absurd.
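
The arithmetic compounds exactly as stated; a minimal check, normalizing true cost to 1.0:

```python
# If price covers only 10% of cost (a 90% subsidy), an 80% price cut
# leaves the price at 0.2 * 0.10 = 2% of cost, i.e. a 98% subsidy.
cost = 1.0
old_price = 0.10 * cost               # 90% subsidized
new_price = old_price * (1 - 0.80)    # after the 80% cut
print(f"{1 - new_price / cost:.0%}")  # prints: 98%
```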

Note the author is selling a "Token Optimization" workshop at AgentCon. Classic FUD marketing: create panic about future price hikes, position yourself as the expert, sell the solution.

Unlike Uber or AWS, LLM APIs are basically interchangeable; there's no lock-in. Why would providers heavily subsidize a commodity service where customers can switch with a one-line code change? The margins are probably thin but positive, especially at scale.
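
The "one-line code change" is nearly literal: most serving providers expose OpenAI-compatible endpoints, so switching is a matter of repointing the client. The base URL and model name below are illustrative examples; check each provider's docs for the real values.

```python
# Switching providers with an OpenAI-compatible client: only base_url,
# api_key, and the model name change. URL and model are illustrative.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.together.xyz/v1",  # swap the provider here
    api_key="YOUR_PROVIDER_KEY",
)
resp = client.chat.completions.create(
    model="meta-llama/Llama-3-70b-chat-hf",
    messages=[{"role": "user", "content": "Hello"}],
)
print(resp.choices[0].message.content)
```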

If the economics were truly this dire, we'd see inference providers shutting down, not OpenAI aggressively cutting prices further.
