The honest version of the pitch
An AI agent earns its keep in one specific situation: a high-volume task that needs judgement but not deep expertise, where the work is digital and the outcome is checkable. That's it. The closer your use case sits to that description, the better the economics. The further away, the more you're funding a science project.
Where agents pay for themselves
1. Support triage and resolution
The classic, because the shape is perfect: tickets arrive in volume, most are routine, every action is digital, and resolution is verifiable. An agent that reads the ticket, pulls the order or account, and either resolves it or hands a pre-investigated summary to a human changes the economics of a support desk. The win isn't only the tickets it closes — it's that every ticket a human touches arrives already researched.
2. Operations glue work
Every business runs on invisible swivel-chair work: copying data between systems, chasing status, reconciling spreadsheets, preparing the same weekly report. Individually trivial, collectively a full-time role or three. This is the least glamorous and most reliable agent ROI we see — partly because mistakes are cheap and reversible, so agents can run with light supervision from day one.
3. Internal knowledge access
"Who knows how X works?" costs more than anyone measures — senior people interrupted to answer questions the documentation already answers, badly. An agent with retrieval over your real docs, tickets and policies gives accurate, cited answers instantly. The benefit compounds: when the answer machine works, people start writing things down again.
4. Sales and lead operations
Not the dystopian version that spams prospects, but the version that enriches inbound leads, scores them against your actual customer profile, drafts a personalised first-touch message for a human to approve, and keeps the CRM honest. Reps spend their hours talking to qualified humans instead of doing data entry.
How to measure ROI without fooling yourself
The arithmetic is simple; the discipline is rare. Three rules:
- Count fully-loaded cost. Inference is the visible line item. The real costs are the build, the evals, and the ongoing supervision — the human time spent reviewing agent output. If supervision time approaches the time the task took originally, you've built an expensive way to do the same work.
- Measure quality, not just throughput. "Agent handles 60% of tickets" means nothing if reopen rates double. Pair every volume metric with a quality metric: resolution accuracy, error rate, customer satisfaction.
- Baseline first. Measure the human-only process for a few weeks before launch. Without a baseline, every number afterwards is a story, not a result.
A pilot that can't fail isn't a pilot; it's marketing.
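The three rules above can be turned into a few lines of arithmetic. What follows is a minimal sketch, not a standard formula: the field names, the idea of spreading build and eval cost evenly per month, and the 2-point reopen-rate tolerance are all assumptions made for illustration, and every input is a number you would measure yourself.

```python
from dataclasses import dataclass


@dataclass
class MonthlyFigures:
    """One month of a workflow. All fields are hypothetical, measured inputs."""
    tasks: int                      # tasks arriving in the month
    human_minutes_per_task: float   # baseline handling time, measured before launch
    review_minutes_per_task: float  # human supervision time per agent-handled task
    agent_share: float              # fraction of tasks the agent handles (0 to 1)
    hourly_cost: float              # fully-loaded cost of one human hour
    inference_cost: float           # model/API spend for the month
    amortised_build_cost: float     # build + eval cost, spread per month (assumption)


def monthly_net_saving(m: MonthlyFigures) -> float:
    """Labour saved on agent-handled tasks, minus what the agent costs to run."""
    agent_tasks = m.tasks * m.agent_share
    # Supervision time is subtracted from the saving, not ignored.
    saved_minutes = agent_tasks * (m.human_minutes_per_task - m.review_minutes_per_task)
    labour_saving = saved_minutes / 60 * m.hourly_cost
    return labour_saving - m.inference_cost - m.amortised_build_cost


def quality_holds(baseline_reopen_rate: float, agent_reopen_rate: float,
                  tolerance: float = 0.02) -> bool:
    """Pair the volume metric with a quality metric (tolerance is an assumption)."""
    return agent_reopen_rate <= baseline_reopen_rate + tolerance


if __name__ == "__main__":
    # Made-up example figures, for illustration only.
    month = MonthlyFigures(
        tasks=4000, human_minutes_per_task=12, review_minutes_per_task=3,
        agent_share=0.6, hourly_cost=40, inference_cost=800,
        amortised_build_cost=3000,
    )
    print(f"net saving: {monthly_net_saving(month):,.0f}")
    print("quality holds:", quality_holds(0.05, 0.06))
```

With these made-up figures the agent handles 2,400 tasks and saves 9 minutes on each, which is 360 hours, or 14,400 at the assumed hourly cost; after inference and amortised build cost the net is 10,600 per month. Raise the review time to 12 minutes per task and the labour saving drops to zero, leaving a loss of 3,800: the "expensive way to do the same work" case.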
When not to build an agent
We've talked clients out of agents more than once. The recurring reasons:
- The process is broken, not slow. Automating a bad process gets you bad outcomes faster. Fix the process first; it's cheaper.
- The volume isn't there. If the task happens twenty times a month, a checklist beats an agent. Agents amortise over volume.
- Errors are catastrophic and unreviewable. Medical decisions, legal commitments, large irreversible payments — agents can assist and prepare, but if no human can realistically review the output, don't put an agent at the end of the pipe.
- A simple automation already does it. If the task never branches, a script is cheaper, faster and deterministic. Use agents where judgement is genuinely required.
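The four reasons above work as a pre-build screen. A rough sketch of that checklist follows, assuming a hypothetical `Workflow` record you fill in by hand; the `MIN_MONTHLY_VOLUME` threshold is an illustrative placeholder, not a figure from this article, and should be set from your own cost numbers.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical cut-off: below this, assume a checklist beats an agent.
MIN_MONTHLY_VOLUME = 200


@dataclass
class Workflow:
    """A candidate workflow, described by hand. All fields are assumptions."""
    monthly_volume: int      # how many times the task happens per month
    process_is_sound: bool   # False if the process is broken rather than slow
    errors_reviewable: bool  # can a human realistically review the output?
    needs_judgement: bool    # False if the task never branches


def reasons_not_to_build(w: Workflow) -> List[str]:
    """Return every objection that applies; an empty list means worth piloting."""
    reasons = []
    if not w.process_is_sound:
        reasons.append("process is broken: fix it first")
    if w.monthly_volume < MIN_MONTHLY_VOLUME:
        reasons.append("volume too low: a checklist is cheaper")
    if not w.errors_reviewable:
        reasons.append("errors cannot be reviewed: keep a human at the end")
    if not w.needs_judgement:
        reasons.append("no judgement needed: a script is cheaper and deterministic")
    return reasons


if __name__ == "__main__":
    candidate = Workflow(monthly_volume=20, process_is_sound=True,
                         errors_reviewable=True, needs_judgement=True)
    for reason in reasons_not_to_build(candidate):
        print("-", reason)
```

Returning a list of reasons rather than a single yes or no keeps the conversation honest: a workflow that fails on volume alone is a different problem from one that fails on reviewability.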
Start small, measure, widen
The pattern that works: pick one workflow with real volume and checkable outcomes, baseline it, ship a guarded agent, and review the numbers after a month. If they hold, widen the scope. If they don't, you've spent little and learned a lot. The transformation everyone promises does happen — one verified workflow at a time.
Want the numbers run on your workflow?
Bring us the process that eats your team's hours. We'll tell you honestly whether an agent pays — and build it properly if it does.