May 2026 gave the AI cost debate its defining anecdotes. One company reportedly spent half a billion dollars on AI in a single month after failing to set usage limits. Uber burned through its annual AI budget in four months. Microsoft canceled most of its internal Claude Code licenses, partly over cost. And in a recent survey, nearly 70% of executives said they're prepared to cut AI spend this year if results don't show.
The correction is real, and parts of it are healthy. But "AI is too expensive" is a symptom presenting as a diagnosis. Treating the symptom - cut the budget, freeze the pilots - feels decisive and fixes nothing, because the sentence doesn't say what's actually broken.
Two cost problems, one label
Look closely at the sticker-shock reporting and you find two different cost problems wearing the same headline.
The first is assistant sprawl. Per-seat AI tools, agent experiments, internal leaderboards that rewarded token burn. One CTO told reporters his employees were using frontier models to check the weather. This is a governance problem: nobody set limits, nobody routed cheap tasks to cheap models, nobody asked what the spend was for. It produces spectacular bills and spectacular headlines. It's also the easier problem - usage caps, model routing, and a clear policy claw most of it back within a quarter.
The second is product inference economics. You shipped an AI-powered feature, and now every customer interaction carries marginal model cost. This problem doesn't show up in pilots, because pilots run at volumes where everything is affordable. It shows up at scale, where the cost curve turns out to be linear and the revenue model quietly assumed it would flatten.
Different mechanisms, different owners, different fixes. Averaging them into one sentence produces panic instead of decisions. When a board reads a half-billion-dollar anecdote about ungoverned employee usage and responds by freezing a product team's inference budget, the wrong problem just got the treatment.
Why product AI costs actually blow up
In the AI products I've seen fail on economics - and in the ones I've watched survive - the cost problem traced back to one of three decisions made long before the bill arrived.
The unit of value was never defined. Spend was approved against a capability ("we'll add an AI assistant to the platform") rather than a unit ("we'll resolve a support ticket for $0.40 instead of $4.00"). If nobody can say what one unit of value costs, nobody can say whether the spend is high. Too expensive relative to what?
Pilot economics were extrapolated, not modeled. A pilot with 50 users and curated data tells you almost nothing about cost at 5,000 users with production data. Context grows: retrieval pipelines inject thousands of tokens per query, and agentic workflows multiply model calls per task. And falling token prices won't save you. Per-token prices dropped by orders of magnitude between 2023 and 2025; total enterprise inference spend rose anyway, several times over. Cheaper tokens mean more tokens, not lower bills. The only protection is modeling cost per unit at production volume before committing - and that's a product exercise, not a procurement one.
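To make "modeling cost per unit at production volume" concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the Workload fields, the token counts, the retry rates, and the per-million-token prices are hypothetical placeholders, not figures from any vendor or from the reporting above. The point is only the shape of the calculation: cost per task is driven by calls per task, context size, and retries, all of which tend to grow between pilot and production.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical usage profile; every value is an illustrative assumption."""
    tasks_per_month: int
    calls_per_task: float        # model calls needed to finish one task
    input_tokens_per_call: int   # prompt plus retrieved context
    output_tokens_per_call: int
    retry_rate: float            # fraction of calls that are repeated once


def cost_per_task(w: Workload, usd_per_m_input: float, usd_per_m_output: float) -> float:
    """Model cost in USD for one completed task."""
    per_call = (
        w.input_tokens_per_call * usd_per_m_input
        + w.output_tokens_per_call * usd_per_m_output
    ) / 1_000_000
    return per_call * w.calls_per_task * (1 + w.retry_rate)


def monthly_cost(w: Workload, usd_per_m_input: float, usd_per_m_output: float) -> float:
    """Total monthly model cost in USD."""
    return cost_per_task(w, usd_per_m_input, usd_per_m_output) * w.tasks_per_month


# Assumed prices per million tokens (placeholders, not a real price list).
PRICE_IN, PRICE_OUT = 3.0, 15.0

# Pilot: few tasks, short curated context, few calls per task.
pilot = Workload(2_000, 2, 3_000, 500, 0.05)
# Production: 100x the tasks, larger context, more agentic steps, more retries.
production = Workload(200_000, 6, 12_000, 500, 0.15)

for name, w in (("pilot", pilot), ("production", production)):
    print(
        f"{name}: {cost_per_task(w, PRICE_IN, PRICE_OUT):.4f} USD per task, "
        f"{monthly_cost(w, PRICE_IN, PRICE_OUT):,.0f} USD per month"
    )
```

Under these assumed inputs the cost per task itself rises between pilot and production, so the monthly bill grows faster than volume does. Extrapolating the pilot's per-task cost would miss that entirely.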
Nobody owned the number. The bill lands in IT or cloud spend. The value lands in a business unit's metrics. No single person sees both. When cost per outcome has no owner, it gets discovered instead of managed - usually by a CFO, usually late, usually as a crisis.
Notice what's not on the list: the model. Model choice is the variable teams obsess over and almost never the one that decides the economics. Integration depth, retrieval design, evaluation, fallback behavior - that's where the money goes, and that's where it's saved.
The counterweight
None of this means the spend doesn't pay. Where AI is pointed at concentrated, measurable workflows - fraud detection, supply chain exceptions, support resolution - the returns are documented and, in the best cases, boring in their reliability. The price curve genuinely helps too: inference prices for a fixed level of capability have been falling at rates with few precedents in computing history. A feature that's marginally uneconomic today may clear comfortably in 18 months - if its consumption doesn't grow faster than prices fall, which is exactly the variable you have to model rather than hope about.
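The race between falling prices and growing consumption can be sketched in a few lines. This is a simplified model under stated assumptions: both rates compound annually and stay constant, and the 50% yearly price decline and the growth rates are hypothetical inputs, not measured figures. The function names are illustrative.

```python
def projected_unit_cost(
    cost_now: float,
    annual_price_decline: float,
    annual_token_growth: float,
    years: float,
) -> float:
    """Cost per unit of value after `years`, assuming constant compounding rates.

    annual_price_decline: 0.5 means the price per token halves each year.
    annual_token_growth: 0.5 means tokens consumed per unit grow 50% each year.
    """
    if not 0 <= annual_price_decline < 1:
        raise ValueError("annual_price_decline must be in [0, 1)")
    price_factor = (1 - annual_price_decline) ** years
    usage_factor = (1 + annual_token_growth) ** years
    return cost_now * price_factor * usage_factor


def breakeven_token_growth(annual_price_decline: float) -> float:
    """Annual consumption growth at which cost per unit stays exactly flat."""
    if not 0 <= annual_price_decline < 1:
        raise ValueError("annual_price_decline must be in [0, 1)")
    return 1 / (1 - annual_price_decline) - 1


# Hypothetical scenario: prices halve every year, horizon of 18 months.
decline = 0.5
print("break-even consumption growth per year:", breakeven_token_growth(decline))
print("moderate growth:", round(projected_unit_cost(1.00, decline, 0.5, 1.5), 3))
print("aggressive growth:", round(projected_unit_cost(1.00, decline, 2.0, 1.5), 3))
```

In this model the unit cost falls only while consumption per unit grows more slowly than the break-even rate. Past that point, cheaper tokens coexist with a rising cost per unit.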
The ROI critique itself also deserves one skeptical look. A meaningful share of "no measurable return" is in fact "nobody measured." Surveys keep finding that most organizations can't quantify returns on technology investments in general, not just on AI. That doesn't rescue bad spend, but it does mean "ROI isn't materializing" and "ROI isn't visible" are different findings with different fixes.
So the defensible positions are narrower than the discourse suggests. "Cut AI spend" is sometimes right. "Keep investing through the noise" is sometimes right. Undifferentiated spend - approved without a unit, extrapolated without a volume model, owned by no one - is the only position that is always wrong.
Five questions before you approve, renew, or cut
This is the frame I use when someone asks whether an AI line item is too expensive. It works equally well for defending a budget and for killing one.
- Is this a seat cost or a product cost? Tools for your employees and inference inside your product have different economics, different owners, and different fixes. Sprawl needs caps and routing. Product cost needs design work. Decide which conversation you're in before having it.
- What's the unit? Cost per what - per resolved ticket, per processed document, per generated quote? If nobody in the room can name the unit, the spend isn't governed. It's a hope with a budget attached.
- What happens at 10x volume? Model it: context growth, calls per task, retries, the long tail of expensive edge cases. A linear cost curve meeting a flattening revenue curve is the most common quiet killer of AI features.
- Who owns the number? One named person who sees the bill and the value in the same view, with the authority to change the design when the ratio drifts. Not a committee, not a dashboard nobody opens.
- What does being wrong cost? Models fail. The fallback - human review, escalation, rework - is part of the unit economics, not an asterisk. If the answer is "it will be right most of the time," the economics aren't done.
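The last question lends itself to a small worked calculation. The sketch below reuses the 0.40 and 4.00 per-ticket figures from the support-ticket example earlier in the piece; the 15% failure rate is a hypothetical input, and the model assumes a failed unit still pays the model cost before being handed to a human at the full manual cost.

```python
def effective_cost_per_unit(
    model_cost: float, failure_rate: float, fallback_cost: float
) -> float:
    """Expected cost of one unit when failures are escalated to a fallback.

    Every unit pays the model cost; the failing fraction also pays the fallback.
    """
    if not 0 <= failure_rate <= 1:
        raise ValueError("failure_rate must be between 0 and 1")
    return model_cost + failure_rate * fallback_cost


def breakeven_failure_rate(
    model_cost: float, fallback_cost: float, baseline_cost: float
) -> float:
    """Failure rate at which the AI path costs the same as the baseline process."""
    return (baseline_cost - model_cost) / fallback_cost


MODEL_COST = 0.40      # per ticket, from the article's example
HUMAN_COST = 4.00      # per ticket, from the article's example
FAILURE_RATE = 0.15    # assumed share of tickets escalated to a human

effective = effective_cost_per_unit(MODEL_COST, FAILURE_RATE, HUMAN_COST)
limit = breakeven_failure_rate(MODEL_COST, HUMAN_COST, HUMAN_COST)
print(f"effective cost per ticket: {effective:.2f}")
print(f"failure rate at which savings disappear: {limit:.0%}")
```

With these inputs the headline 0.40 per ticket becomes roughly 1.00 once escalations are priced in. That is still well below the manual cost, but it is a different number from the one in the business case, which is why the fallback belongs inside the unit economics.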
Five questions, an afternoon of work, and most of the sticker shock becomes either a fixable governance gap or a product decision you can actually make.
The line
The companies in the sticker-shock headlines didn't have an AI cost problem. They had unpriced AI: capacity deployed without units, pilots scaled without volume math, bills accumulating without owners. Cost in an AI product is a design decision. Treat it as a procurement line item, and you will keep getting procurement-sized surprises.
Dmitry Borodin co-founded B Productive, a boutique AI product advisory helping B2B companies turn AI pilots and product bets into shipped products.
Sources
- Axios, "AI sticker shock hits corporate America," 28 May 2026 - axios.com
- Fortune, on the end of "tokenmaxxing," 28 May 2026 - fortune.com
- G-P, 2026 AI at Work Report
- Epoch AI, "LLM inference prices have fallen rapidly but unequally across tasks" - epoch.ai
- MIT NANDA, "The GenAI Divide: State of AI in Business"