Token efficiency defines the new boundary between decision augmentation, automation and autonomy.

In 2020 (pre-LLM), I wrote about the boundaries between decision automation and augmentation in long- and short-term business planning cycles. Talking to a colleague last Friday, it dawned on me that most of the principles I highlighted back then still hold. But there is an impactful new trade-off to be made in the post-LLM world.

We are at the point where we can make almost any decision using an LLM on our internal company data. That can be extended to external supplier and customer data if those supply chain partners allow us to use it, albeit at a cost (retailers already charge manufacturers for access to their POS and other data).

For many decisions, with a selection of good prompts we can assess a situation, the risks and the options, and make a decision on the best way forward. This is decision augmentation. But for which decisions does it make sense? Especially as complex LLM reasoning, or any LLM usage for that matter, consumes tokens. And tokens cost money.

In Uber’s case (a company one would assume is well aware of LLMs, AI and augmented coding), they blew through their IT budget in four months by getting very excited about using Claude for their own development. The use of LLMs, and therefore tokens, is growing at a rapid rate across the globe. This is a warning for any company that allows uncontrolled internal vibe coding, or any LLM usage for that matter. Tokens are becoming a serious line item in the P&L. And that cost might grow faster than some companies can cut headcount on the back of assumed LLM-driven productivity gains!

What the laborer (the human) gets out of the tokens, let’s call it token efficiency, will become an important metric. LLMs will stop being a free-for-all. A human who cannot show token efficiency, that is, clear outputs from token usage, will not last in the future workplace. Vibe coding is fun and makes sense if it has a clear goal. But if after 20 agentic prototypes nothing was scalable or stuck, you just blew your budget on tokens with nothing to show for it. A new LLM dashboard to present to the leadership team every week or month doesn’t make sense either, as each new creative LLM design puts a cognitive load on someone else. Better to agree on some fixed dashboards, so we all know what we’re looking at without deciphering an LLM output. It looks cool and exciting now, but soon the continuous re-allocation of cognitive load (from the LLM creator to the LLM result consumer) will get pushback as well. The free LLM fun and games will soon be over, and a new trade-off between labor costs and token costs will develop.
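To make the metric concrete, here is a minimal sketch of one way to measure token efficiency: value of shipped outputs per dollar of tokens consumed. All names, prices and figures below are hypothetical illustrations, not real data.

```python
from dataclasses import dataclass

@dataclass
class WorkItem:
    """One piece of LLM-assisted work. All fields are illustrative."""
    name: str
    tokens_used: int        # total tokens consumed (prompt + completion)
    shipped: bool           # did the output make it into real use?
    estimated_value: float  # business value attributed to the output, in $

# Assumed blended token price, in $ per 1,000 tokens (hypothetical).
PRICE_PER_1K_TOKENS = 0.01

def token_efficiency(items: list[WorkItem]) -> float:
    """Dollars of shipped value per dollar of tokens spent, across all work.

    Unshipped prototypes still count in the denominator: their tokens
    were burned, but they delivered nothing.
    """
    token_cost = sum(i.tokens_used for i in items) * PRICE_PER_1K_TOKENS / 1000
    shipped_value = sum(i.estimated_value for i in items if i.shipped)
    return shipped_value / token_cost if token_cost else 0.0

items = [
    WorkItem("prototype-1", tokens_used=4_000_000, shipped=False, estimated_value=0.0),
    WorkItem("prototype-2", tokens_used=2_000_000, shipped=True, estimated_value=500.0),
]
print(f"${token_efficiency(items):.2f} of shipped value per token dollar")
```

The design choice that matters is the denominator: counting every token spent, including the 20 prototypes that went nowhere, is exactly what distinguishes token efficiency from raw output.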

So what about making decisions using tokens? In my 2021 commentary on the HUMACHINE article by Nada Sanders, I differentiated four decision types: operational, planning, strategic and cultural. I’m now more inclined to combine this with the Cynefin framework, which distinguishes how to react to simple, complicated, complex and chaotic situations, and to decisions made in disorder.

Operational decisions that are simple or complicated are usually also high-frequency and repetitive. They can be codified and made automated or autonomous. Sure, you can use an LLM for brainstorming on how to solve a repetitive decision. But once you know how to solve the issue, codify and orchestrate it, and automate it in a smart process flow. The new trade-off in operational decisions will be the choice between smart automation and agentic autonomy. An agent that uses advanced (LLM) reasoning consumes tokens. If that advanced reasoning is required, or the speed or value of the decisions, or anything else an LLM can do that discrete coding and automation can’t, justifies it, go for autonomy. If not, don’t spend the money on tokens, especially not for high-frequency, repetitive decisions. Although the LLM providers might try to convince you otherwise, of course!
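The automation-versus-autonomy trade-off above is ultimately break-even arithmetic: codified automation has a one-off build cost and near-zero marginal cost, while an agent pays in tokens for every decision. A rough sketch, where every figure is an illustrative assumption:

```python
# Hypothetical break-even between codified automation (fixed build cost,
# tiny marginal cost) and an LLM agent (token cost per decision).
# All figures below are assumptions for illustration only.

AUTOMATION_BUILD_COST = 20_000.0    # one-off engineering cost, in $
AUTOMATION_MARGINAL = 0.001         # $ of compute per automated decision
AGENT_TOKENS_PER_DECISION = 15_000  # reasoning tokens per agentic decision
PRICE_PER_1K_TOKENS = 0.01          # assumed blended token price, in $

def automation_cost(n_decisions: int) -> float:
    """Total cost of the codified, automated flow after n decisions."""
    return AUTOMATION_BUILD_COST + n_decisions * AUTOMATION_MARGINAL

def agent_cost(n_decisions: int) -> float:
    """Total token cost of the agentic flow after n decisions."""
    return n_decisions * AGENT_TOKENS_PER_DECISION * PRICE_PER_1K_TOKENS / 1000

def break_even() -> int:
    """Smallest decision count at which automation becomes cheaper."""
    per_agent_decision = AGENT_TOKENS_PER_DECISION * PRICE_PER_1K_TOKENS / 1000
    return int(AUTOMATION_BUILD_COST / (per_agent_decision - AUTOMATION_MARGINAL)) + 1

n = break_even()
print(f"Automation is cheaper than the agent from decision {n:,} onwards")
```

For high-frequency, repetitive decisions the break-even point is reached quickly, which is exactly why burning reasoning tokens on them rarely pays off; for rare, high-value decisions the fixed build cost may never be recovered.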

Complex, chaotic, strategic and cultural decisions will usually start with augmentation, from brainstorming to advice based on company and wider industry data. But token efficiency will start to play a role there as well. Over time, efficient prompt streams will be developed that work well for certain types of situational and disruptive decisions. Those prompt streams on your data in your industry, and the token efficiency that comes with them, become valuable IP.

The human who has a good understanding of data, business and industry, and of the evolving relationship with LLMs and the prompting that gets to quality decisions at speed, will become valuable. Those who do so with token efficiency, and hence lower costs, will be the winners in the future of work.
