Bringing AI Workflows into Production without Burning Tokens
Bringing AI (or the abilities of an LLM) into production is a core goal, and often a tracked metric, for most engineering teams today. In this article we look at how to bring AI into production while keeping token costs under control, so that the benefits outweigh the costs and the use case adds value to the business.
Let’s make it Agentic!
The push in the market is to use agentic flows. A flow is agentic when you let a model decide how to process a request and rely on its ability to parse and understand context to produce the best possible outcome for the use case. The idea is that as models mature and become more “intelligent”, the outcomes improve in quality and beat a fixed algorithm coded by a human.
With this in mind, you often see a use case pushed into production that relies entirely on model calls.
For example, an agent may execute a use case by parsing input, validating data, classifying the request, checking policy, routing it to the right person, and drafting a response, all by calling a model.
This is often quite fast to build with the many agentic frameworks available today, and the demo is usually impressive to management.
Launching this in production for a high-volume use case brings a shock, though, when the bill arrives from the model providers. Token costs are increasing rather than decreasing as models evolve. Is that use case adding enough value to justify the cost of running it?
What about questions such as consistency, latency, security and governance? And an even bigger question, do you really know why a decision was made?
I think a shift is happening in the market: teams are starting to weigh token spend against value creation. This shift is mostly among those who have already shipped a reasonable number of use cases leveraging AI. The camp that is still working to deploy its first use cases is not bothered by the spend because it hasn’t hit them yet. But once your budget starts to get eaten up, the question is almost certain to come.
Going to Production with AI
The instinct to use AI for everything was not wrong. But does it always make sense? Teams are starting to ask which steps actually need “intelligence” and which ones just need some rules or logic. The answer affects not just token spend but latency and consistency as well. Consistency here means the same input leads to the same outcome, and you know why the system did what it did.
But if we are not using AI, are we not losing out? Isn’t that what everyone says now? Get on board or be left behind and eaten by more modern competitors.

The solution is to maximize the use of AI, but in a way that yields maximum value rather than applying it blindly to everything. An example is overdue to explain this.
Expense approvals - This is quite common. Every company needs it, and it is typically done in one of two ways:
- A human manually reviews and approves each expense based on some published policy
- Rules in an HR system automatically approve some expenses and route others to approvers
Let’s say the finance team of a company wants to be more dynamic, respond rapidly to changing trends, and build a system that maximizes employee productivity by letting employees manage expenses that are not bound by rules set in stone!
An engineering team asked to build this could keep it simple: ask the finance team to write the policy in a Google Doc or similar, publish it to the internal portal as the official policy, and then build an AI agent that reads this policy and approves or denies every request based on it. Now the finance team can update the policy whenever it wants and, without any developer in the loop, every expense approval request reflects the latest policy - et voilà! Cool, right?
Steps:
- User initiates a chat with an expense agent
- Exchange greetings (of course we humans always do that, agent or not!)
- Upload a receipt, explain the expense
- Agent parses the receipt, validates the amounts and dates
- Evaluates the entire expense policy against the request
- Decides on the request, informs the user
- If approved, make a request to the HR system to note the required expense reimbursement
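The steps above can be sketched roughly as follows. This is a minimal illustration, not a real agent: `StubModel`, `POLICY`, `STEPS`, and `handle_expense` are hypothetical names, the token count is a crude word count, and no model provider is actually called. It only shows how every step of every request becomes a model call that carries the full policy text.

```python
from dataclasses import dataclass


@dataclass
class StubModel:
    """Stand-in for a real LLM client; it only records usage."""
    calls: int = 0
    tokens: int = 0

    def complete(self, prompt: str) -> str:
        self.calls += 1
        # Crude estimate: one token per whitespace-separated word.
        self.tokens += len(prompt.split())
        return "stub response"


# Placeholder policy text, repeated to mimic a multi-page document.
POLICY = "Meals up to 50 USD per day are reimbursable. " * 40

# Assumption: each step of the flow is implemented as its own model call.
STEPS = [
    "Greet the user",
    "Parse the receipt",
    "Validate amounts and dates",
    "Evaluate the request against the policy",
    "Decide and explain the decision",
]


def handle_expense(model: StubModel, request: str) -> str:
    """Run every step as a model call, resending the full policy each time."""
    reply = ""
    for step in STEPS:
        reply = model.complete(f"{step}\nPolicy: {POLICY}\nRequest: {request}")
    return reply


model = StubModel()
for _ in range(1000):
    handle_expense(model, "Lunch with client, 42 USD, receipt attached")
print(model.calls, model.tokens)
```

In this sketch, spend grows with request volume times the number of steps times the length of the policy, regardless of how simple the individual request is.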
This is a pretty cool agentic flow to build, and I think the finance team and the entire company would be thrilled to use it. First, policies actually get applied in practice (assuming the AI is intelligent enough), and the finance team has the flexibility to change them every Monday if they want to. Win-win - and the CTO can present to the board on how they leveraged the intelligence of the models available today to add value to the business.
The big saving here is the time the finance team would otherwise spend on manual approvals. An even bigger one is the developer time required to keep updating rules as policies change, plus the lead time during which expenses may not be processed as per the latest update.
So is this value worth the new token bills that may now start coming from the model providers?

Optimizing the AI
What could we do differently here? The models are certainly worth using. They have proven to be very effective in a lot of scenarios. A change-up for the example above could be this:
On every policy update, steps:
- Read the latest policy
- Create a set of rules for basic cases extracted from the policy
- Create test scenarios for the human to verify
- Send the test scenarios and ruleset to the finance team for approval
- Deploy rules into production
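A minimal sketch of the policy-update steps above, under stated assumptions: `extract_rules` is a placeholder for the single model call that would turn policy text into rules (here it returns hard-coded, made-up limits), and `Rule`, `Scenario`, `build_scenarios`, and `publish` are hypothetical names. The point is the shape of the pipeline: generate rules once, produce reviewable test scenarios, and deploy only after a human signs off.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Rule:
    category: str
    max_amount: float
    decision: str  # "approve" or "deny"


@dataclass(frozen=True)
class Scenario:
    category: str
    amount: float
    expected: Optional[str]  # None means "no rule applies, escalate"


def extract_rules(policy_text: str) -> list[Rule]:
    """Placeholder for the one model call that turns policy text into rules."""
    return [Rule("meal", 50.0, "approve"), Rule("taxi", 80.0, "approve")]


def evaluate(rules: list[Rule], category: str, amount: float) -> Optional[str]:
    """Return the first matching rule's decision, or None if nothing matches."""
    for rule in rules:
        if rule.category == category and amount <= rule.max_amount:
            return rule.decision
    return None


def build_scenarios(rules: list[Rule]) -> list[Scenario]:
    """Boundary cases for finance to review: at the limit and just over it."""
    scenarios = []
    for rule in rules:
        scenarios.append(Scenario(rule.category, rule.max_amount, rule.decision))
        scenarios.append(Scenario(rule.category, rule.max_amount + 1, None))
    return scenarios


def publish(policy_text: str,
            finance_approves: Callable[[list[Rule], list[Scenario]], bool]) -> list[Rule]:
    """Generate rules, self-check them, and deploy only after human sign-off."""
    rules = extract_rules(policy_text)
    scenarios = build_scenarios(rules)
    failed = [s for s in scenarios
              if evaluate(rules, s.category, s.amount) != s.expected]
    if failed:
        raise ValueError(f"{len(failed)} scenario(s) contradict the ruleset")
    if not finance_approves(rules, scenarios):
        raise PermissionError("finance team rejected the ruleset")
    return rules  # the ruleset that would be deployed to production
```

The model is involved once per policy update rather than once per expense, and the finance team reviews concrete scenarios instead of trusting the model's reading of the policy.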
On every expense request, steps:
- Present a data entry form to the user
- If the user chooses to enter the request in unstructured form, run the model to convert it into structured data
- Run the deployed rules
- If a rule matches, approve or deny as per the rule
- If no rule matches:
  - Run the model to decide approval
  - Or route to a human, which is feasible at this lower volume
- Inform the user and invoke the HR system for reimbursement
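The per-request steps above can be sketched like this. It is an illustration under assumptions, not a reference implementation: the `RULES` values are made up, and `Expense`, `match_rule`, `to_expense`, and `decide` are hypothetical names. The model (or a human) is consulted only when the input is free text or no deployed rule matches.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Optional, Union


@dataclass(frozen=True)
class Expense:
    category: str
    amount: float


# Deployed ruleset as (category, max_amount, decision); values are made up.
RULES = [
    ("meal", 50.0, "approve"),
    ("gift", float("inf"), "deny"),
]


def match_rule(expense: Expense) -> Optional[str]:
    """Return the decision of the first matching rule, or None."""
    for category, max_amount, decision in RULES:
        if expense.category == category and expense.amount <= max_amount:
            return decision
    return None


def to_expense(raw: Union[Expense, str],
               parse_with_model: Callable[[str], Expense]) -> Expense:
    """Structured form input skips the model; only free text is parsed by it."""
    return raw if isinstance(raw, Expense) else parse_with_model(raw)


def decide(expense: Expense,
           model_decider: Optional[Callable[[Expense], str]] = None) -> tuple[str, str]:
    """Return (decision, decided_by), trying the deployed rules first."""
    decision = match_rule(expense)
    if decision is not None:
        return decision, "rules"
    if model_decider is not None:
        return model_decider(expense), "model"
    return "pending", "human"


print(decide(Expense("meal", 30.0)))     # ('approve', 'rules')
print(decide(Expense("travel", 900.0)))  # ('pending', 'human')
```

Returning who made the decision alongside the decision itself keeps the audit trail simple: every outcome is traceable to a rule, a model call, or a person.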
We are still using the model! But only where it matters, and depending on how many requests the rules cover, you can cut token costs by something like 80-90%. The more people use structured input, the less the model is needed even to parse inputs. We still end up using the model’s intelligence, but in a more consistent way, since it created static, deterministic rules. The large volume of requests is now automatically decided by the ruleset that gets regenerated on each policy update by the finance team. Same flexibility as the earlier flow, but with much lower token spend. And we get to use AI for the one thing it does really well - code things out! A developer in the loop can also review the code and ensure it is consistent with the engineering standards, just like any other code review. Win-win-win across the board.
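To make the savings concrete, here is a back-of-the-envelope calculation. Every number in it is an illustrative assumption rather than a measurement: 5 model calls per request in the all-model flow, 20% of users submitting free text, and 90% of requests covered by rules. It counts model calls, not tokens, and ignores the occasional rule-generation call on policy updates. `hybrid_model_calls` and `reduction_pct` are hypothetical helpers.

```python
def hybrid_model_calls(requests: int, unstructured_pct: int, rule_coverage_pct: int) -> int:
    """Model calls in the hybrid flow: parse free text, decide unmatched requests."""
    parse_calls = requests * unstructured_pct // 100
    decision_calls = requests * (100 - rule_coverage_pct) // 100
    return parse_calls + decision_calls


def reduction_pct(baseline_calls: int, hybrid_calls: int) -> int:
    """Whole-percent drop in model calls relative to the all-model flow."""
    return 100 * (baseline_calls - hybrid_calls) // baseline_calls


REQUESTS = 1000
CALLS_PER_REQUEST_ALL_MODEL = 5  # assumption: one call per agent step

baseline = REQUESTS * CALLS_PER_REQUEST_ALL_MODEL
hybrid = hybrid_model_calls(REQUESTS, unstructured_pct=20, rule_coverage_pct=90)
print(baseline, hybrid, reduction_pct(baseline, hybrid))  # 5000 300 94
```

The result is very sensitive to the inputs. With lower rule coverage or more free-text submissions the reduction shrinks, so it is worth plugging in your own traffic numbers.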

Judgment vs. logic
It’s quite easy to decide when you need a model and when simple logic will do. For every workflow step, ask this:
Does this step require understanding context, generating language, or making a nuanced decision, or does it just need to follow a rule?
It is a useful question because it removes some of the glamour from the architecture. Most steps, when you ask it honestly, are less mysterious than they first looked.
| Steps that need judgment | Steps that need logic |
|---|---|
| Classifying ambiguous or unstructured input | Routing based on a known field value |
| Summarizing a document or conversation | Validating a number against a threshold |
| Generating a policy, workflow, or config | Running a decision table at scale |
| Explaining a rejection in plain English | Parsing a structured form submission |
| Helping a non-technical user set up a rule | Executing that rule 50,000 times a day |
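One way to apply the question to a workflow is to tag each step with the three judgment criteria and let everything else default to logic. The sketch below is only an illustration: `Step` and `WORKFLOW` are hypothetical names, and the tags are assigned by hand, following the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    needs_context: bool = False       # must understand ambiguous or unstructured input
    generates_language: bool = False  # must write text for a human
    nuanced_decision: bool = False    # cannot be reduced to a rule

    @property
    def needs_model(self) -> bool:
        """A step needs judgment if any of the three criteria applies."""
        return self.needs_context or self.generates_language or self.nuanced_decision


WORKFLOW = [
    Step("Parse a structured form submission"),
    Step("Validate a number against a threshold"),
    Step("Classify unstructured input", needs_context=True),
    Step("Route based on a known field value"),
    Step("Explain a rejection in plain English", generates_language=True),
]

model_steps = [step.name for step in WORKFLOW if step.needs_model]
print(model_steps)
# ['Classify unstructured input', 'Explain a rejection in plain English']
```

Making the tag explicit per step also gives you a simple inventory of where model calls are expected, which is useful when the token bill needs explaining.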
About our platform - Unmeshed
Since you are reading this, I’d love to share a bit about our platform, Unmeshed. Unmeshed helps teams build AI-powered workflows that combine model calls, deterministic rules, API integrations, human approvals, and observability in one place.
In Unmeshed, engineering teams can:
- Model API calls, rules, decision tables, human approvals, and AI steps in one workflow
- See which steps call models and why
- Attribute cost to specific workflows and outcomes
- Put budgets, scopes, and tool allow-lists around agentic execution
- Keep humans in the loop for high-risk or ambiguous decisions
- Move repeatable decisions from model calls into deterministic logic over time
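To illustrate the idea of budgets and tool allow-lists in general terms, here is a small generic sketch. It is not Unmeshed's API: `AgentGuard`, `BudgetExceeded`, and `ToolNotAllowed` are hypothetical names, and a real system would enforce these checks around actual model and tool calls.

```python
from __future__ import annotations

from dataclasses import dataclass


class BudgetExceeded(RuntimeError):
    """Raised when a run would spend more tokens than it was granted."""


class ToolNotAllowed(PermissionError):
    """Raised when the agent asks for a tool outside its allow-list."""


@dataclass
class AgentGuard:
    token_budget: int
    allowed_tools: frozenset[str]
    tokens_used: int = 0

    def charge(self, tokens: int) -> None:
        """Record token usage, refusing the call if it would exceed the budget."""
        if self.tokens_used + tokens > self.token_budget:
            raise BudgetExceeded(
                f"{self.tokens_used + tokens} tokens exceeds budget of {self.token_budget}")
        self.tokens_used += tokens

    def check_tool(self, tool: str) -> None:
        """Allow only tools that were explicitly granted to this run."""
        if tool not in self.allowed_tools:
            raise ToolNotAllowed(tool)


guard = AgentGuard(token_budget=1000, allowed_tools=frozenset({"lookup_policy"}))
guard.check_tool("lookup_policy")  # passes silently
guard.charge(600)                  # 600 of 1000 tokens now used
```

Failing closed, by raising before the spend or tool call happens, is the design choice here: the agent never gets to exceed its limits and explain afterwards.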
The decision table is a rules engine, and you can create complex business rules with AI or manually. This is where you can use models to create your rules, have humans validate them, and then run them for next to nothing compared to making a model call on every use case execution.
If your team is adding AI into business processes and starting to ask where the cost, latency, and outcomes are coming from, Unmeshed gives you one place to design, run, observe, and optimize the workflow.
Unmeshed lets you mix model calls, deterministic rules, decision tables, and human approvals in one workflow - so you use AI where it adds value and logic everywhere else.
Bring your AI workflow and we can help you cut token costs without cutting capability.


