The fastest way to lose trust in an AI agent is to give it full autonomy on day one. The fastest way to never ship one is to wait until it's perfect. We've run agents across more than 100,000 live customer calls, and the approach that actually works is the oldest one in management: you don't deploy an agent, you onboard one.
A new hire doesn't get the keys to the bank on their first morning. They shadow. They handle the easy cases under supervision. They earn scope by being reliable in a narrower scope first. Agents are exactly the same — and treating them that way is the difference between a system your operations team defends and one they quietly switch off.
01 Start it on probation
Every agent we ship begins in a mode where it can propose but not act. On a voice line, that means it can hold the conversation, understand intent, and draft the action — book the appointment, issue the refund, update the record — but a human approves the action before it commits. The customer experience is fully automated; the consequence is still gated.
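A minimal sketch of that propose-but-not-act gate, assuming hypothetical names (`ProposedAction`, `ApprovalGate`, `propose`, `review`) that are not from any real system. The agent's only entry point creates a held proposal, and nothing commits until a human decision is recorded.

```typescript
// Hypothetical types for illustration; none of these names come from a real library.
type Decision = "approved" | "edited" | "rejected";

interface ProposedAction {
  id: string;
  intent: string; // e.g. "reschedule_appointment"
  payload: Record<string, string>;
  status: "held_for_approval" | "committed" | "rejected";
}

class ApprovalGate {
  private held: Record<string, ProposedAction> = {};
  readonly committed: ProposedAction[] = [];

  // The agent's only entry point: it can propose, never commit.
  propose(id: string, intent: string, payload: Record<string, string>): ProposedAction {
    const action: ProposedAction = { id, intent, payload, status: "held_for_approval" };
    this.held[id] = action;
    return action;
  }

  // A human reviewer resolves a held proposal; only here does anything commit.
  review(id: string, decision: Decision, edits: Record<string, string> = {}): ProposedAction {
    const action = this.held[id];
    if (action === undefined) throw new Error("no held action with id " + id);
    delete this.held[id];
    if (decision === "rejected") {
      action.status = "rejected";
      return action;
    }
    if (decision === "edited") {
      // The reviewer's edits override the agent's proposed values.
      action.payload = { ...action.payload, ...edits };
    }
    action.status = "committed";
    this.committed.push(action);
    return action;
  }
}

// Usage: the agent proposes a date, the reviewer corrects it before it commits.
const gate = new ApprovalGate();
gate.propose("a1", "reschedule_appointment", { date: "2026-06-16" });
const outcome = gate.review("a1", "edited", { date: "2026-06-23" });
// outcome.status → "committed", outcome.payload.date → "2026-06-23"
```

The point of the shape is that there is no code path from `propose` to a committed action that skips `review`.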
This feels like a half-measure to people who wanted "AI that just handles it." It isn't. Probation is where you discover the twenty edge cases your prompt never imagined, with a safety net underneath each one. You are not slowing the agent down — you are buying the evidence that will later let you speed it up safely.
Skipping probation to "move fast" is the single most expensive decision teams make with agents. The failures you'd have caught in week one as harmless suggestions instead become committed actions — refunds issued, appointments double-booked, a customer told something wrong — and now you're rebuilding trust with the business, not just fixing a bug.
02 The transcript is the product
People think the model is the asset. For an operated agent, the asset is the transcript store — every turn, every tool call, every decision, every human override, captured and searchable. It is your training data, your debugger, your audit trail and your trust-building instrument, all at once.
When something goes wrong on call 40,312, "the AI messed up" is useless. "On this turn the agent misread the date because the caller said 'next Tuesday' on a Monday, and our date resolver assumed the current week" is a fix. You only get the second sentence if you logged the whole reasoning chain, not just the final output.
    {
      call_id: "c_40312",
      turn: 14,
      intent: "reschedule_appointment",
      heard: "can we do next tuesday",
      resolved: { date: "2026-06-16", confidence: 0.71 }, // low → gate
      proposed: "move booking to Tue 16 Jun, 10:00",
      action: "held_for_approval", // not committed
      human: { decision: "edited", to: "2026-06-23" } // learns from this
    }
That low-confidence date is the whole game. The agent didn't pretend to be sure. It flagged itself, the action was held, a human corrected it, and that correction becomes a labelled example for the next iteration. The transcript turned a near-miss into an improvement.
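To make the "next Tuesday" failure concrete, here is a hedged sketch of a resolver that returns every plausible reading instead of silently picking one. The function name `resolveNextWeekday`, the `WeekdayResolution` shape, and the rule for what counts as ambiguous are all assumptions for illustration, not the resolver described above.

```typescript
// Hypothetical resolver: surfaces both readings of "next <weekday>"
// so the caller can hold the action for approval when they differ.
const DAY_MS = 24 * 60 * 60 * 1000;

interface WeekdayResolution {
  candidates: string[]; // ISO dates (YYYY-MM-DD), soonest first
  ambiguous: boolean;   // true → hold the action for human approval
}

// weekday: 0 = Sunday ... 6 = Saturday, matching Date.prototype.getUTCDay().
function resolveNextWeekday(todayIso: string, weekday: number): WeekdayResolution {
  const today = new Date(todayIso + "T00:00:00Z");
  // Days until the soonest strictly-future occurrence of the weekday (1..7).
  const daysAhead = ((weekday - today.getUTCDay() + 7) % 7) || 7;
  const soonest = new Date(today.getTime() + daysAhead * DAY_MS);
  const weekLater = new Date(soonest.getTime() + 7 * DAY_MS);
  const iso = (d: Date): string => d.toISOString().slice(0, 10);
  // Assumption: if the soonest occurrence is less than a week away, "next"
  // could mean either that one or the one in the following week.
  const ambiguous = daysAhead < 7;
  return {
    candidates: ambiguous ? [iso(soonest), iso(weekLater)] : [iso(soonest)],
    ambiguous,
  };
}

// The caller says "next Tuesday" on Monday 15 June 2026.
const resolution = resolveNextWeekday("2026-06-15", 2);
// resolution.candidates → ["2026-06-16", "2026-06-23"], resolution.ambiguous → true
```

In the logged turn above, the agent proposed the first candidate and the human chose the second. A resolver shaped like this would let the agent ask the caller which one they meant, or hold the action, rather than guess.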
An agent without a transcript isn't an employee. It's a stranger making decisions you can't review.
— On observability as trust
03 The approval gate you never remove
Here is the rule we will not bend: any action that is hard to reverse keeps a human in the loop, permanently. Not until the agent is "good enough" — permanently. Moving money, deleting data, making a legal commitment, anything a customer can't easily undo: those stay gated regardless of how many calls the agent has aced.
The gate isn't a sign of an immature system. It's a design choice about consequence. Reversible actions — answering a question, drafting a message, looking something up — graduate to full autonomy fast. Irreversible ones don't graduate, because the cost of being wrong is asymmetric and no accuracy number makes "issued a refund to the wrong account" acceptable.
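One way to encode that rule is a policy check in which promotion can only ever lift the gate on reversible actions. This is a sketch under assumptions: the `ACTION_POLICY` table, the action names (taken from the examples in the text), and the choice to treat unknown actions as irreversible are all hypothetical.

```typescript
// Hypothetical policy table; the action names are examples drawn from the text.
type Reversibility = "reversible" | "irreversible";

const ACTION_POLICY: Record<string, Reversibility> = {
  answer_question: "reversible",
  draft_message: "reversible",
  lookup_record: "reversible",
  issue_refund: "irreversible",
  delete_data: "irreversible",
  make_legal_commitment: "irreversible",
};

// promoted: whether this action class has earned autonomy on the evidence.
function requiresHumanApproval(action: string, promoted: boolean): boolean {
  const kind = ACTION_POLICY[action];
  // Unknown actions are treated as irreversible: fail closed (an assumption of this sketch).
  if (kind === undefined || kind === "irreversible") return true;
  // Reversible actions stay gated only until they are promoted.
  return !promoted;
}

// requiresHumanApproval("issue_refund", true)   → true  (promotion never lifts this gate)
// requiresHumanApproval("draft_message", true)  → false
// requiresHumanApproval("draft_message", false) → true
```

Because the irreversible branch returns before `promoted` is even read, no accuracy number can switch the gate off by accident.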
04 Promote on evidence, not vibes
An agent moves from probation to autonomy on a specific class of action when the evidence says it's earned it — not when someone feels good about the demo. We define promotion criteria up front, per intent:
- Volume. It has handled enough cases of this type that the sample means something — not three lucky calls.
- Agreement rate. Humans approved its proposed actions without edits at a rate above a threshold we set according to the client's risk appetite.
- Failure profile. When it was wrong, it was wrong safely — it flagged low confidence rather than committing confidently to a mistake.
Hit all three for a given intent and that intent graduates: for reversible actions the gate comes off and confidence thresholds relax, while irreversible actions stay gated regardless. Miss them and it stays on probation, with the transcripts showing you exactly which cases to fix. Promotion is a data decision with a paper trail, which is also what lets a cautious client say yes.
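The three criteria can be sketched as a single check per intent. Everything here is an assumption for illustration: the field names, the way "wrong safely" is measured (the share of wrong proposals the agent had flagged as low confidence), and especially the threshold numbers, which are placeholders rather than recommendations.

```typescript
// Hypothetical per-intent stats and thresholds; the numbers below are placeholders.
interface IntentStats {
  handled: number;          // proposals reviewed by a human
  approvedUnedited: number; // approved with no edits
  wrong: number;            // proposals the human edited or rejected
  wrongButFlagged: number;  // wrong proposals the agent had flagged as low confidence
}

interface PromotionCriteria {
  minVolume: number;
  minAgreementRate: number;   // approvedUnedited / handled
  minSafeFailureRate: number; // wrongButFlagged / wrong
}

interface PromotionResult {
  promote: boolean;
  failed: string[]; // which criteria were missed, for the paper trail
}

function evaluatePromotion(s: IntentStats, c: PromotionCriteria): PromotionResult {
  const failed: string[] = [];
  if (s.handled < c.minVolume) failed.push("volume");
  const agreement = s.handled > 0 ? s.approvedUnedited / s.handled : 0;
  if (agreement < c.minAgreementRate) failed.push("agreement_rate");
  // With no wrong proposals there is nothing unsafe to count.
  const safeFailure = s.wrong > 0 ? s.wrongButFlagged / s.wrong : 1;
  if (safeFailure < c.minSafeFailureRate) failed.push("failure_profile");
  return { promote: failed.length === 0, failed };
}

const criteria: PromotionCriteria = { minVolume: 500, minAgreementRate: 0.95, minSafeFailureRate: 0.9 };
const reschedule: IntentStats = { handled: 800, approvedUnedited: 776, wrong: 24, wrongButFlagged: 23 };
const verdict = evaluatePromotion(reschedule, criteria);
// verdict.promote → true (agreement 0.97, safe-failure share 23/24)
```

Returning the list of missed criteria, not just a boolean, is what makes the decision reviewable later.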
05 Design the bad call before the good one
The demo is the agent handling a clean, cooperative caller. Production is a caller on a bad connection, talking over the agent, asking three things at once, in an accent the speech model wasn't tuned for. Your system is defined by what it does in that call, not the demo.
So we design the failure path first. The agent must always know how to do three things: recognise it's out of its depth, hand off to a human cleanly with full context, and never leave the caller stranded in a loop. A graceful "let me get a colleague who can help with that" is a successful outcome. A confident wrong answer is the only real failure.
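Those three obligations can be sketched as one decision function that runs on every turn. The `CallState` fields, the reason labels, and both thresholds are hypothetical placeholders chosen for the example.

```typescript
// Hypothetical call state and thresholds, for illustration only.
interface CallState {
  callId: string;
  intent: string | null;        // null when the agent could not identify one
  confidence: number;           // 0..1, the agent's own estimate for this turn
  consecutiveReprompts: number; // times in a row it asked the caller to repeat
  summary: string;              // running summary of the call so far
}

interface HandoffContext {
  callId: string;
  intent: string | null;
  summary: string;
}

type NextStep =
  | { kind: "continue" }
  | { kind: "handoff"; reason: string; context: HandoffContext };

const MIN_CONFIDENCE = 0.6; // placeholder threshold
const MAX_REPROMPTS = 2;    // placeholder loop limit

function decideNextStep(state: CallState): NextStep {
  let reason: string | null = null;
  if (state.intent === null) reason = "unrecognised_intent";          // out of its depth
  else if (state.confidence < MIN_CONFIDENCE) reason = "low_confidence";
  else if (state.consecutiveReprompts >= MAX_REPROMPTS) reason = "stuck_in_loop";
  if (reason === null) return { kind: "continue" };
  // The human picks up with the context attached, so the caller does not start over.
  return {
    kind: "handoff",
    reason,
    context: { callId: state.callId, intent: state.intent, summary: state.summary },
  };
}
```

The handoff branch always carries the call summary with it, which is the difference between a clean transfer and a cold one.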
We don't optimise for "calls fully automated." We optimise for "calls resolved well" — which includes the ones the agent correctly handed to a person. An agent that knows its limits and routes accordingly outperforms an over-confident one on every measure a customer actually cares about.
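The difference between the two metrics is easy to see in a small worked example. The outcome labels below are hypothetical, and how a call gets judged "resolved" is left to whoever reviews it.

```typescript
// Hypothetical outcome labels for a reviewed call.
type CallOutcome =
  | "automated_resolved"
  | "automated_unresolved"
  | "handoff_resolved"
  | "handoff_unresolved";

// Fraction of calls whose outcome is one of the wanted labels.
function rate(outcomes: CallOutcome[], wanted: CallOutcome[]): number {
  if (outcomes.length === 0) return 0;
  return outcomes.filter((o) => wanted.indexOf(o) !== -1).length / outcomes.length;
}

// Share of calls the agent finished alone, whether or not the caller was helped.
const automationRate = (o: CallOutcome[]): number =>
  rate(o, ["automated_resolved", "automated_unresolved"]);

// Share of calls that ended well, including clean handoffs to a person.
const resolutionRate = (o: CallOutcome[]): number =>
  rate(o, ["automated_resolved", "handoff_resolved"]);

const sample: CallOutcome[] = [
  "automated_resolved",
  "automated_unresolved",
  "handoff_resolved",
  "handoff_resolved",
];
// automationRate(sample) → 0.5, resolutionRate(sample) → 0.75
```

In this sample, pushing the two handed-off calls through the agent would raise the automation rate, but it would only raise the resolution rate if the agent actually resolved them.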
06 The onboarding playbook
If you're putting an agent in front of real customers, this is the order we'd run it in:
- Log everything from call one. Structured turn-level transcripts before you tune a single prompt. You cannot improve what you didn't capture.
- Ship in shadow or gated mode. Real traffic, real intents, zero irreversible autonomy. Let reality write your edge-case list.
- Set promotion criteria per intent, in writing. Volume, agreement rate, failure profile. Calibrate them to the client's risk appetite, not your enthusiasm.
- Keep the irreversible gate forever. Money, deletion, commitments — human-in-the-loop is a permanent design choice, not a phase.
- Treat handoff as success. Measure resolution, not automation rate. Reward the agent for knowing when to step back.
Done this way, autonomy stops being a leap of faith and becomes a ledger. Every scope the agent holds, it earned, with transcripts to prove it. That's the version of "AI that just handles it" you can actually put your name on — because you onboarded it like you'd onboard anyone you were going to trust with customers.

