AI token pricing: the commodity bill lands on the developer

When a model becomes a commodity, the list price collapses. The cost of serving it does not. That gap, which The Register’s piece leaves unspoken, is why AI token pricing never really disappears: it just changes who pays. And it should worry anyone paying a subscription to write code with AI today.

Thomas Claburn at The Register predicts the end of margins for the big labs. Anthropic, OpenAI, Google: all squeezed by commoditization, with value migrating to whoever controls distribution. The forecast is sound on the numbers. It’s wrong about where the bill ends up.

Lab margins are not your problem

Claburn’s thesis rests on a claim few dispute: the models are converging. The Chinese open-weight models GLM-5.1, Kimi K2.6, DeepSeek V4, and Qwen3-Coder-Next are expected to match Claude Opus 4.7 and GPT-5.5 by the end of 2026.

The gap is still there for now. On coding benchmarks, Kimi K2.6 sits around 87 against Opus 4.7’s ~97, a distance measured in points rather than generations. When near-equivalent capability ships free as weights, the price you can charge for API access trends toward zero.

From there Claburn derives the death of margins. For where the value goes next, he borrows from Benedict Evans, whose “AI eats the world” talk argues the model becomes low-margin infrastructure while pricing power moves up the stack, toward whoever owns distribution, data, and workflow.

The logic holds while you talk about the seller. It breaks the moment you ask who’s buying.

AI token pricing doesn’t vanish, it changes address

One number in the article matters more than the entire margin debate: the Claude Code subscriber paying $200 a month who burns through $5,000 worth of tokens at API rates.

That number is not an outlier. Developers consuming thousands of dollars in token value on a fixed $100-$200 plan have been documented, some exceeding $15,000 of API-equivalent value over eight months. Unless API rates carry an enormous markup over cost, that means one thing: the price you pay today sits below the cost of service. Venture capital is funding the discount, waiting for a return.
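To put those figures side by side, here is a rough sketch of the arithmetic. The function name is made up for illustration, and the eight-month case assumes the $200 tier throughout, which the reported figures do not specify.

```python
def subsidy_multiple(api_equivalent_usd: float, subscription_usd: float) -> float:
    """Dollars of API-rate usage bought by each subscription dollar."""
    if subscription_usd <= 0:
        raise ValueError("subscription_usd must be positive")
    return api_equivalent_usd / subscription_usd


# The headline case: $5,000 of tokens on a $200 monthly plan.
headline = subsidy_multiple(5_000, 200)

# The eight-month case: $15,000 of API-equivalent value in total.
# Assumption: the subscriber paid $200 every month (not stated in the source).
monthly_value = 15_000 / 8
long_run = subsidy_multiple(monthly_value, 200)

print(f"headline month: {headline:.1f}x")
print(f"eight-month average: ${monthly_value:,.0f}/month, {long_run:.1f}x")
```

Under that assumption the headline month works out to 25x and the eight-month average to roughly 9x. On the $100 tier the second multiple would double.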

Commoditizing the list price does not zero out the cost of service. A GPU costs money, energy costs money, latency has a price. None of that evaporates because the model is downloadable from Hugging Face. It stays on the books of whoever serves the inference. And that bill, sooner or later, rolls downhill.

The interesting question isn’t whether Anthropic loses its margins. It’s what happens to your workflow the day they can no longer afford to subsidize you.

What changes for people who write code

Today’s AI pricing is a subsidy, not a vested right. Treat it that way and the decisions change.

The “unlimited” plan is the first thing to go. When the unit economics break, unlimited becomes a rate limit, then a daily quota, then pay-per-use at real cost. The pattern is already visible in Claude Code’s usage windows and the limits that tighten in waves. Anyone who has wired their daily work into a single subscription finds themselves with no room to maneuver exactly when the provider needs to claw back margin.

The defense isn’t prompt optimization. It’s avoiding dependency where you don’t need it. The same logic that governs token billing applies here: the unit you pay for is controlled by the one selling it, and consumption grows in ways you don’t see until the invoice arrives. The difference is that this time the lever isn’t model verbosity, it’s the sustainability of the whole business.

Concretely: keep your workflow compatible with more than one provider. Measure how much of your output depends on a single tool. Watch the open-weight models not as a weekend curiosity but as an operational plan B, because the day Kimi runs acceptably on your own hardware or on any provider, the lab’s subsidy stops being the only option.
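A minimal sketch of what that could look like in practice, assuming each provider can be wrapped as a plain function from prompt to text. Everything named here (`complete_with_fallback`, `dependency_share`, the two stand-in providers) is hypothetical and stands in for whatever real clients you use; none of it is a real SDK call.

```python
from collections import Counter
from typing import Callable, Iterable

# A provider is modeled as any callable taking a prompt and returning text.
# Real clients (a hosted API, a self-hosted open-weight model) would be
# wrapped to fit this shape. The two below are stand-ins for illustration.
Provider = Callable[[str], str]


def complete_with_fallback(
    prompt: str, providers: dict[str, Provider]
) -> tuple[str, str]:
    """Try providers in insertion order; return (provider_name, completion)."""
    errors: list[str] = []
    for name, call in providers.items():
        try:
            return name, call(prompt)
        except Exception as exc:  # quota hit, outage, changed terms, ...
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


def dependency_share(tool_per_task: Iterable[str]) -> dict[str, float]:
    """Fraction of logged tasks handled by each tool."""
    counts = Counter(tool_per_task)
    total = sum(counts.values())
    return {tool: n / total for tool, n in counts.items()} if total else {}


def hosted_lab(prompt: str) -> str:
    # Simulates a subscription whose limits have just tightened.
    raise RuntimeError("usage limit reached")


def open_weight(prompt: str) -> str:
    return f"[open-weight] {prompt}"


used, answer = complete_with_fallback(
    "rename this function",
    {"hosted_lab": hosted_lab, "open_weight": open_weight},
)
shares = dependency_share(["hosted_lab"] * 9 + ["open_weight"])
print(used, shares)
```

The point of the second function is the uncomfortable number: if nine out of ten logged tasks go through one tool, the fallback path exists on paper but has barely been exercised.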

TechMonk’s take

Claburn and Evans are right about three quarters of the problem and wrong about the quarter that counts. Models commoditize: true. Lab margins compress: true. The cost doesn’t disappear: also true, and they don’t say it. The direction power moves: this is where it falls apart.

Evans predicts value rises up the stack, toward Apple, Google, Microsoft, the clouds, whoever controls distribution and workflow. It’s the same story told a thousand times: the big fish always wins. But that prediction assumes the exact thing commoditization destroys, which is lock-in.

Flip the logic. If a Chinese open-weight model gets you 90% of the way to Opus and runs wherever you want, pricing power rises nowhere. It falls all the way down to you. Whoever can self-host can switch providers at close to zero marginal cost. Whoever can switch that cheaply pays protection money to no layer of the stack. The worst case for the big players isn’t losing margin on models, it’s losing the thing that made margin defensible: the fact that you couldn’t leave.

Value migrates “up the stack” only as long as there’s a stack to climb. If intelligence becomes a downloadable weight and inference a fungible service, the stack flattens out. Apple and Google can control phone distribution, not your choice of which model to call from your backend. That piece, for the first time since this market existed, is moving back into the hands of the people writing the code.

There’s a catch, and it has to be said or this turns into cheerleading. Self-hosting a frontier model isn’t free: it takes hardware and ops skill, and most teams will never do it. For them the lock-in stays real and power rises where Evans says. The dispersal of power isn’t automatic. It’s an option that exists only for those who exercise it, and the majority will pick the comfort of the subsidy as long as it lasts. But “as long as it lasts” is the whole point. The lever is real now, which it never was before.

That’s why the debate over Anthropic’s margins is a distraction. The labs’ margin is their problem. Yours is the bill, and it’s already here: a $200 plan that buys $5,000 of tokens is the precise measure of how fake the current price is.

What’s your subscription actually worth?

Next time you open Claude Code or Copilot, ask yourself something uncomfortable: what would you pay if they billed you the real cost of what you consume today?
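One way to put a number on that question, as a sketch: multiply a month of token usage by per-million-token API rates. The rates and token counts below are placeholders chosen for round arithmetic, not real quotes from any provider. Substitute your provider’s published prices and your own usage logs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Per-million-token API prices in USD. Placeholder values, not real quotes."""
    input_per_mtok: float
    output_per_mtok: float


def api_equivalent_cost(input_tokens: int, output_tokens: int, rates: Rates) -> float:
    """Cost of a month's tokens if billed per token at the given rates."""
    return (
        input_tokens / 1_000_000 * rates.input_per_mtok
        + output_tokens / 1_000_000 * rates.output_per_mtok
    )


# Hypothetical month: 400M input tokens and 20M output tokens.
rates = Rates(input_per_mtok=5.0, output_per_mtok=25.0)
cost = api_equivalent_cost(400_000_000, 20_000_000, rates)
plan = 200.0  # the flat subscription price used throughout the article

print(f"API-equivalent: ${cost:,.2f} vs plan: ${plan:,.2f} ({cost / plan:.1f}x)")
```

With these made-up inputs the month comes to $2,500 against a $200 plan. Agentic coding tools tend to read far more tokens than they write, so the input side usually dominates the total.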

If the answer scares you, you’ve found where the bill is. If you can’t calculate it, that’s because the subsidy worked exactly as designed. The commodity was never free. It was on sale, and nobody told you when the sale ends.