AI isn't too expensive. Your AI is unpriced.

May 2026 gave the AI cost debate its defining anecdotes. One company reportedly spent half a billion dollars on AI in a single month after failing to set usage limits. Uber burned through its annual AI budget in four months. Microsoft canceled most of its internal Claude Code licenses, partly over cost. And in a recent survey, nearly 70% of executives said they're prepared to cut AI spend this year if results don't show.

The correction is real, and parts of it are healthy. But "AI is too expensive" is a symptom presenting as a diagnosis. Treating the symptom - cut the budget, freeze the pilots - feels decisive and fixes nothing, because the sentence doesn't say what's actually broken.

Two cost problems, one label

Look closely at the sticker-shock reporting and you find two different cost problems wearing the same headline.

The first is assistant sprawl. Per-seat AI tools, agent experiments, internal leaderboards that rewarded token burn. One CTO told reporters his employees were using frontier models to check the weather. This is a governance problem: nobody set limits, nobody routed cheap tasks to cheap models, nobody asked what the spend was for. It produces spectacular bills and spectacular headlines. It is also the easier problem - usage caps, model routing, and a clear policy claw most of it back within a quarter.
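For readers who want to see how small the sprawl fix is, here is a minimal sketch of a usage cap plus model routing. Everything in it is a hypothetical placeholder rather than any vendor's API: the two model names, the list of "simple" task types, and the per-user monthly token cap are assumptions chosen for illustration.

```python
from dataclasses import dataclass

# Hypothetical tiers and limits; real values depend on your vendors and policy.
CHEAP_MODEL = "small-model"
FRONTIER_MODEL = "frontier-model"
SIMPLE_TASKS = {"lookup", "classify", "summarize_short"}
MONTHLY_TOKEN_CAP = 2_000_000  # per user, assumed


@dataclass
class Request:
    user: str
    task_type: str
    estimated_tokens: int


def route(request: Request, used_tokens: dict[str, int]) -> str:
    """Return the model to use, or raise if the user's cap would be exceeded."""
    used = used_tokens.get(request.user, 0)
    if used + request.estimated_tokens > MONTHLY_TOKEN_CAP:
        raise PermissionError(f"{request.user} is over the monthly token cap")
    used_tokens[request.user] = used + request.estimated_tokens
    # Cheap tasks go to the cheap model; everything else gets the frontier model.
    return CHEAP_MODEL if request.task_type in SIMPLE_TASKS else FRONTIER_MODEL


usage: dict[str, int] = {}
print(route(Request("ana", "lookup", 500), usage))          # small-model
print(route(Request("ana", "code_review", 40_000), usage))  # frontier-model
```

A weather lookup lands on the cheap tier, and a user who hits the cap gets a refusal instead of a bill.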

The second is product inference economics. You shipped an AI-powered feature, and now every customer interaction carries marginal model cost. This problem doesn't show up in pilots, because pilots run at volumes where everything is affordable. It shows up at scale, where the cost curve turns out to be linear and the revenue model quietly assumed it would flatten.

Different mechanisms, different owners, different fixes. Averaging them into one sentence produces panic instead of decisions. When a board reads a half-billion-dollar anecdote about ungoverned employee usage and responds by freezing a product team's inference budget, the wrong problem just got the treatment.

The rest of this piece is about the second problem - and notice that nearly every headline above is the first one. Sprawl produces a spectacular bill that lands in the press. Product inference economics produces a feature that quietly stops earning its keep at scale, which never makes the news. The loud problem is the easy one; the expensive one is quiet. Put caps and routing on the sprawl and move on.

The spend often pays. So the question isn't cut or keep.

Before the playbook, the honest counterweight: the spend frequently does pay. Where AI is pointed at concentrated, measurable workflows - fraud detection, supply chain exceptions, support resolution - the returns are documented and, in the best cases, boring in their reliability. The price curve helps too. Inference prices for a fixed level of capability have been falling at rates with few precedents in computing history, so a feature that is marginally uneconomic today may clear comfortably in 18 months.

It may also not, and that is the whole point. Falling token prices don't save a feature whose consumption grows faster than prices fall. Per-token prices dropped by orders of magnitude between 2023 and 2025; total enterprise inference spend rose anyway, several times over. Cheaper tokens mean more tokens, not lower bills.
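The arithmetic behind that is just two compounding rates multiplied together. A minimal sketch, with made-up rates: the halving of price and the tripling of volume below are illustrative assumptions, not figures from the sources.

```python
def annual_spend(years: int, start_spend: float,
                 price_change: float, volume_growth: float) -> list[float]:
    """Spend per year when unit price and token volume both compound.

    price_change:  multiplier on price per token each year (0.5 = halves).
    volume_growth: multiplier on tokens consumed each year (3.0 = triples).
    """
    spend = [start_spend]
    for _ in range(years):
        spend.append(spend[-1] * price_change * volume_growth)
    return spend


# Illustrative assumption: prices halve every year, consumption triples.
print(annual_spend(3, 100.0, price_change=0.5, volume_growth=3.0))
# [100.0, 150.0, 225.0, 337.5]
```

The bill only falls when the product of the two multipliers is below 1, that is, when consumption grows more slowly than prices fall.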

And much of the reported "no return" is really "nobody measured" - organizations have always struggled to quantify returns on technology, not just on AI - so part of the panic is a measurement gap, not a spending one.

So the defensible positions are narrower than the discourse suggests. "Cut AI spend" is sometimes right. "Keep investing through the noise" is sometimes right. The only position that is always wrong is undifferentiated spend - approved without a unit, extrapolated without a volume model, owned by no one. The question is never cut or keep. It is whether the thing is priced. Pricing it is a design act, not a procurement one, and here is how it's done.

Five questions that price a feature

This is the frame I run when someone asks whether an AI line item is too expensive. Each question has a priced answer - what good looks like - and an unpriced signal that tells you you've failed it. It works equally well for defending a budget and for killing one.

1. Is this a seat cost or a product cost?

Priced: you can say which one you're in, because the fixes are different. Sprawl needs caps and routing; product cost needs design work.

Unpriced: the reflex is a budget freeze - the right tool for sprawl, the wrong tool for a product line - and the same meeting tries to apply it to both.

2. What's the unit?

Priced: cost per resolved ticket, per processed document, per generated quote. One unit of value, named, that the spend actually buys.

Unpriced: nobody in the room can name it. Then the spend isn't governed. It's a hope with a budget attached.

3. What happens at 10x volume?

Priced: a model, not a hope. Context growth, calls per task, retries, and the long tail of expensive edge cases, all projected at production volume against the value per unit - revenue earned or cost avoided.

Unpriced: the pilot number is the plan. A linear cost curve meeting a flattening revenue curve is the most common quiet killer of AI features, and it is invisible until the volume arrives.

4. Who owns the number?

Priced: one named person who sees the bill and the value in the same view, with the authority to change the design when the ratio drifts.

Unpriced: the bill lands in cloud spend, the value lands in a business unit's metrics, nobody sees both, and the number gets discovered by a CFO - late, as a crisis.

5. What does being wrong cost?

Priced: the fallback - human review, escalation, rework - is a line in the unit economics, not an asterisk.

Unpriced: "it'll be right most of the time." That sentence is where the economics stop, not where they're done.

One feature, priced

Take a support-deflection feature: an assistant that resolves customer tickets a human would otherwise handle. This is the friendly case for AI economics, because there's an expensive human baseline to beat. Run it through the questions anyway and watch where the verdict actually sits.

It is a product cost (Q1). The unit is one resolved ticket (Q2). A fully loaded human-handled ticket costs 4.00. In the pilot, against a curated FAQ at low volume, the model resolves a ticket for 0.40. A ten-to-one win. Approve it.

Then production arrives (Q3). Queries are messier than the pilot's, retrieval injects more context per query, and a real share of tickets need multi-turn agentic handling with retries. Cost per ticket rises from 0.40 to 1.80.
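The drivers in that paragraph (calls per ticket, context per call, retries) can be put into a small volume model. This is a sketch, not the breakdown behind the 0.40 and 1.80: the token price, call counts, token counts, and retry rate are invented inputs, picked only so the outputs land on those two figures.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    calls_per_ticket: float     # model calls needed to attempt one ticket
    tokens_per_call: float      # prompt + retrieved context + output
    retry_rate: float           # fraction of calls that get repeated
    price_per_1k_tokens: float  # blended token price (assumed)


def cost_per_ticket(w: Workload) -> float:
    """Model cost to attempt one ticket, before any human fallback."""
    effective_calls = w.calls_per_ticket * (1 + w.retry_rate)
    return effective_calls * w.tokens_per_call / 1000 * w.price_per_1k_tokens


# Invented inputs, chosen to reproduce 0.40 and 1.80; not measurements.
pilot = Workload(calls_per_ticket=2, tokens_per_call=10_000,
                 retry_rate=0.0, price_per_1k_tokens=0.02)
production = Workload(calls_per_ticket=4, tokens_per_call=18_000,
                      retry_rate=0.25, price_per_1k_tokens=0.02)

print(f"pilot: {cost_per_ticket(pilot):.2f} per ticket")            # pilot: 0.40 per ticket
print(f"production: {cost_per_ticket(production):.2f} per ticket")  # production: 1.80 per ticket
```

Note that the token price is identical in both rows. The 4.5x increase comes entirely from design variables: more calls, more context, and retries.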

And the model is wrong sometimes (Q5). Say it cleanly resolves 65% of tickets and the other 35% escalate to a human who finishes the job. Those escalations cost roughly the human price on top of the model cost you already spent trying. That fallback number is itself a design choice - an AI-assisted handoff can cost less than a cold ticket, or more if the customer is now annoyed and the case is harder - so take the full 4.00 as the conservative version.

Per ticket attempted	Cost
Human baseline (fully loaded)	4.00
Pilot AI cost (curated data, low volume)	0.40
Production AI cost (10x volume, deeper retrieval, agentic calls)	1.80
Human fallback on the 35% the model can't close (0.35 x 4.00)	1.40
Priced cost per ticket (1.80 + 1.40)	3.20

The pilot said 0.40. Priced, it's 3.20. The feature still clears the 4.00 baseline, so the answer is ship it - but the margin is 0.80, not the 3.60 the pilot implied, and it now lives or dies on two numbers nobody modeled: how fast volume and context push the 1.80 up, and the model's real close rate in the long tail, which drives the fallback.
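The table reduces to one formula: every ticket pays the model cost, and the share the model can't close also pays the fallback. A minimal sketch using the figures above; the function and parameter names are mine.

```python
def priced_cost(production_cost: float, close_rate: float,
                fallback_cost: float) -> float:
    """Expected cost per ticket attempted.

    Every ticket pays the model cost; the share the model cannot close
    also pays the human fallback.
    """
    return production_cost + (1 - close_rate) * fallback_cost


HUMAN_BASELINE = 4.00

cost = priced_cost(production_cost=1.80, close_rate=0.65, fallback_cost=4.00)
print(f"priced cost: {cost:.2f}")                       # priced cost: 3.20
print(f"margin vs human: {HUMAN_BASELINE - cost:.2f}")  # margin vs human: 0.80
```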

Now let both drift the way they actually drift. You add document types, so production cost climbs to 2.50. The messy long tail turns out to close at 50%, not 65%, so escalation costs 2.00. Priced cost is 4.50 - more than the human it was meant to replace. Same feature, same demo, same enthusiastic board, and the verdict has flipped from ship to kill on two variables that were never on the page.
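One way to read the flip is to ask what close rate the feature needs just to break even against the human baseline. The break-even framing is my addition, derived from the same formula (production cost plus the unclosed share times the fallback cost); the sketch assumes the conservative 4.00 fallback.

```python
def break_even_close_rate(production_cost: float, fallback_cost: float,
                          human_baseline: float) -> float:
    """Close rate at which priced cost equals the human baseline.

    Solves production_cost + (1 - c) * fallback_cost == human_baseline for c.
    """
    return 1 - (human_baseline - production_cost) / fallback_cost


for production_cost in (1.80, 2.50):
    c = break_even_close_rate(production_cost, fallback_cost=4.00,
                              human_baseline=4.00)
    print(f"production cost {production_cost:.2f}: "
          f"needs a close rate above {c:.1%}")
# production cost 1.80: needs a close rate above 45.0%
# production cost 2.50: needs a close rate above 62.5%
```

At 1.80 the feature has twenty points of slack under its 65% close rate. At 2.50 it needs 62.5% and the long tail delivers 50%, which is the kill verdict stated as a single threshold.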

That is the difference between a priced feature and an unpriced one, and it's the difference diligence makes that a strategy review doesn't. The team that shipped on the pilot's 0.40 didn't make a worse call than the team that priced it. It made the same call blind - and blind is how you ship the 4.50 version and hear about it from the CFO.

What you walk out with

One page. A row per cost component - pilot cost, production cost, fallback cost, priced cost - against the value per unit, with a 10x column and one owner's name at the top. Run the two variables that move (production cost, close rate) to the pessimistic end and see whether the verdict holds. That page is the deliverable, and an afternoon is enough to build the first version.
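A first version of that page fits in a short script. The sketch below uses the support-deflection numbers from the worked example; the layout, the scenario labels, and the `<name>` placeholder for the owner are assumptions, and a spreadsheet does the same job.

```python
HUMAN_BASELINE = 4.00  # value per unit: the human cost avoided
FALLBACK_COST = 4.00   # conservative: full human price on escalation


def verdict(production_cost: float, close_rate: float) -> tuple[float, str]:
    """Priced cost per ticket and the resulting ship/kill call."""
    priced = production_cost + (1 - close_rate) * FALLBACK_COST
    return priced, "ship" if priced < HUMAN_BASELINE else "kill"


# (production cost per ticket, close rate) for each scenario.
scenarios = {
    "expected": (1.80, 0.65),
    "pessimistic": (2.50, 0.50),
}

print("Owner: <name>   Unit: resolved ticket")
print(f"{'scenario':<12}{'production':>11}{'close rate':>12}{'priced':>8}  verdict")
for name, (production_cost, close_rate) in scenarios.items():
    priced, call = verdict(production_cost, close_rate)
    print(f"{name:<12}{production_cost:>11.2f}{close_rate:>12.0%}{priced:>8.2f}  {call}")
```

With these inputs the expected row prices at 3.20 and ships; the pessimistic row prices at 4.50 and is killed.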

Notice what is not on the page: the model. Model choice is the variable teams obsess over and almost never the one that decides the economics. Integration depth, retrieval design, evaluation, and fallback behavior are where the money goes, and where it's saved.

The line

The companies in the sticker-shock headlines didn't have an AI cost problem. They had unpriced AI: capacity deployed without units, pilots scaled without volume math, bills accumulating without owners. Cost in an AI product is a design decision. Price it on one page, or keep getting procurement-sized surprises.

Dmitry Borodin leads AI Solutions at Octave, the Hexagon AB software spin-off. He co-founded B Productive and writes about what makes AI products ship versus die.

Sources

Axios, "AI sticker shock hits corporate America," 28 May 2026 - axios.com
Fortune, on the end of "tokenmaxxing," 28 May 2026 - fortune.com
G-P, 2026 AI at Work Report
Epoch AI, "LLM inference prices have fallen rapidly but unequally across tasks" - epoch.ai
MIT NANDA, "The GenAI Divide"