
The Quiet Failure of Agentic AI

When a traditional software project falls short, it announces itself: outputs are visibly wrong, processes stall, and users can point to exactly what broke. The failure is loud enough to trigger a fix.

When an agentic AI project fails, the signs are harder to spot. Outputs look plausible. Stakeholders do not reject the results outright; they just start second-guessing them. "That doesn't seem right" becomes a recurring theme. Confidence erodes gradually, not through a single catastrophic failure but through an accumulation of small doubts. By the time the problem is visible enough to act on, trust is gone, and the project gets quietly shelved.

The scale of the problem

This is not a niche issue. Gartner predicts that over 40% of agentic AI projects will be cancelled by the end of 2027, citing escalating costs, unclear business value, and inadequate risk controls. A March 2026 survey of 650 enterprise technology leaders found that while 78% have at least one AI agent pilot running, only 14% have successfully scaled one to production. The pilot-to-production gap is now the largest deployment backlog in enterprise technology history.

The Deloitte State of AI in the Enterprise 2026 report paints a similar picture: while 42% of companies believe their strategy is prepared for AI adoption, they feel less prepared when it comes to infrastructure, data, risk, and talent. The confidence is in the strategy slide deck. The gap is in the foundations underneath it.

These are not problems that a better model or different agentic platform solves.

The gap between a promising prototype and a production system that delivers genuine business value has almost nothing to do with AI capability. It has everything to do with three foundational questions that most organisations skip, rush, or get wrong.

“After last year's hype, executives are impatient to see returns on GenAI investments, yet organizations are struggling to prove and realize value.”

Rita Sallam, Gartner

1. Where does AI actually fit?

The first question is the most fundamental: is AI the right tool for this problem?

AI's defining characteristic is non-determinism. The same input can produce different outputs. That is a strength when the task requires interpretation, judgement, or making sense of ambiguous, unstructured information. It is a liability when the task demands consistency, auditability, and predictable outputs.

We have seen organisations deploy AI agents to process structured data that a well-designed ETL pipeline handles faster, cheaper, and more accurately. The agent was slower, less reliable, and harder to debug.

This "technology blindness" problem plays out across industries: organisations get captivated by AI capabilities and deploy them where simpler systems would be the better choice.

The practical reality is that most business processes involve both types of work. An AI agent might interpret a customer request in natural language (non-deterministic, and genuinely valuable), but the fulfilment of that request should follow a defined workflow (deterministic, and should stay that way). Identifying that boundary, where the non-deterministic part ends and the deterministic part begins, is critical design work. Skip it, and you build an expensive, unreliable system that does things a rules engine could handle.
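The boundary can be made concrete in code. The sketch below is a minimal illustration under assumptions of our own (the intent names, the order ID, the stubbed interpreter), not a reference implementation: `interpret_request` stands in for an LLM call on the non-deterministic side of the boundary, while fulfilment is a plain deterministic dispatch over a closed set of intents.

```python
from dataclasses import dataclass

@dataclass
class Intent:
    """Structured output of the non-deterministic step."""
    name: str       # e.g. "cancel_order"
    order_id: str

def interpret_request(text: str) -> Intent:
    # Non-deterministic side: in a real system this would be an LLM
    # call mapping free text to a structured Intent. Stubbed here
    # with a trivial keyword match for illustration.
    if "cancel" in text.lower():
        return Intent(name="cancel_order", order_id="A-1001")
    return Intent(name="unknown", order_id="")

def fulfil(intent: Intent) -> str:
    # Deterministic side: from here on, behaviour is a fixed
    # workflow. Unknown intents are escalated, never improvised.
    handlers = {
        "cancel_order": lambda i: f"Order {i.order_id} cancelled",
    }
    handler = handlers.get(intent.name)
    if handler is None:
        return "Escalated to a human operator"
    return handler(intent)

print(fulfil(interpret_request("Please cancel my order")))
```

The design point is that the structured `Intent` object is the hand-off: everything before it is allowed to vary, everything after it is not.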

2. Can your agents reach what they need?

Once you have established that AI genuinely fits, the next question is whether the agent can actually access the data and systems it needs to be useful.

AI agents are only as good as what they can reach. Most enterprise data sits behind interfaces designed for human workflows: CRMs, ERPs, document repositories, legacy platforms. These systems were not built to expose their data or their operations for agentic consumption.

We have seen agent quality degrade significantly because of unstable APIs providing key context. The agent's reasoning was sound, but the data feeding it was intermittent, inconsistent, or stale. Separately, a lack of intrinsic access control in data sources meant that what started as a straightforward agent deployment turned into a complex application build, with custom access layers that had to be engineered between the agent and the systems it needed to interact with.

This is where many organisations underestimate the work. Data that is tolerable for human users, who can compensate for missing fields, inconsistent formatting, and outdated records, is not tolerable for AI agents. Agents amplify data quality problems. What is "good enough" for a dashboard is nowhere near good enough for an agent making decisions or generating responses.
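One practical defence is to validate records at the boundary rather than letting the agent compensate. A hedged sketch, with hypothetical field names and a staleness threshold chosen purely for illustration:

```python
from datetime import datetime, timedelta, timezone

REQUIRED_FIELDS = {"customer_id", "status", "updated_at"}
MAX_AGE = timedelta(days=30)  # staleness threshold; tune per use case

def is_agent_ready(record: dict) -> bool:
    """Reject records a human might tolerate but an agent should not see."""
    if not REQUIRED_FIELDS.issubset(record):
        return False                      # missing fields
    if record["status"] not in {"active", "closed"}:
        return False                      # inconsistent formatting
    age = datetime.now(timezone.utc) - record["updated_at"]
    return age <= MAX_AGE                 # reject outdated records

fresh = {"customer_id": "C-1", "status": "active",
         "updated_at": datetime.now(timezone.utc)}
print(is_agent_ready(fresh))  # True
```

Records that fail the gate go to a remediation queue or a human, not to the agent; the agent only ever reasons over data that meets the bar.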

And beyond data access, there is the question of what the agent can actually do. An agent that only reads and responds has a fundamentally different risk profile from one that can update records, send communications, or execute transactions. Defining that action surface, what the agent can read, write, and execute, is essential design work that determines both the value and the risk of the deployment.
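That action surface can be written down as an explicit, deny-by-default allowlist. The sketch below is one possible shape, with hypothetical tool names; the point is that every capability the agent has is enumerated, and read-only is the default posture.

```python
from enum import Flag, auto

class Action(Flag):
    READ = auto()
    WRITE = auto()
    EXECUTE = auto()

# Hypothetical action surface: every tool the agent may touch is
# listed explicitly, with the least privilege that still delivers value.
ACTION_SURFACE = {
    "crm.contacts":   Action.READ,
    "crm.notes":      Action.READ | Action.WRITE,
    "email.outbound": Action.EXECUTE,   # e.g. send, behind human review
}

def permitted(tool: str, action: Action) -> bool:
    """Deny by default: anything not on the surface is off-limits."""
    return action in ACTION_SURFACE.get(tool, Action(0))

print(permitted("crm.contacts", Action.WRITE))  # False
print(permitted("crm.notes", Action.WRITE))     # True
```

Keeping the surface declarative also makes it reviewable: risk and security teams can audit a table like this without reading the agent's code.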

3. How do you manage what you cannot predict?

The third question is the one that consistently catches organisations off guard: how do you operate a system that does not produce the same output every time?

The most common mistake we see is business cases built on the assumption of human replacement. The logic sounds compelling: the agent handles the workload, we reduce headcount, the savings fund the project. But this dramatically underestimates the amount of human oversight needed to ensure quality and provide fallbacks. The humans do not disappear. Their role changes, to supervision, quality assurance, and exception handling, and that role is often more skilled and more expensive than what it replaces.

Deloitte's 2026 report found that only one in five companies has a mature model for governing autonomous AI agents. That means 80% of organisations deploying agents are doing so without adequate governance structures. A KPMG survey reinforces this: 65% of leaders cite agentic system complexity as their top barrier, and 60% restrict agent access to sensitive data without human oversight.

The second pattern we see repeatedly is observability treated as an afterthought. Organisations deploy agents without instrumentation for monitoring response quality, tracing reasoning chains, or scoring outputs. When something goes wrong, and it will, there is no mechanism to diagnose the issue. Worse, there is no baseline to measure improvement against. Without quality scoring and benchmarks in place, rolling out changes to agent configuration is guesswork. You cannot manage what you cannot see, and you cannot improve what you do not measure.
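Even minimal instrumentation beats none. The sketch below assumes a stand-in scorer (a real deployment would use rubric-based evaluation, human grading, or an LLM judge); what matters is that every response is scored and accumulated into a baseline that configuration changes can be measured against.

```python
import statistics

def score_response(response: str) -> float:
    # Stand-in scorer for illustration only: non-empty responses
    # under a length cap score 1.0, long ones 0.5, empty ones 0.0.
    if not response.strip():
        return 0.0
    return 1.0 if len(response) < 500 else 0.5

class QualityLog:
    """Accumulates scores so a configuration change can be compared
    against a measured baseline instead of guesswork."""
    def __init__(self):
        self.scores = []

    def record(self, response: str) -> float:
        s = score_response(response)
        self.scores.append(s)
        return s

    def baseline(self) -> float:
        return statistics.mean(self.scores) if self.scores else 0.0

log = QualityLog()
log.record("Your order has been cancelled.")
log.record("")  # a silent failure, now visible in the baseline
print(log.baseline())  # 0.5
```

In production this would also capture traces of the agent's reasoning chain alongside each score, so that a drop in the baseline can be diagnosed rather than guessed at.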

The useful analogy is managing people. People are also non-deterministic. They interpret, they use judgement, and they sometimes get things wrong. Organisations have centuries of experience managing that through training, supervision, escalation procedures, and performance reviews. Agentic AI needs equivalent mechanisms. The organisations that treat agent deployment as a technology rollout, rather than an operating model shift, are the ones that end up with expensive systems that nobody trusts.

The foundations have not changed

The irony is that none of this is new. Fit assessment, data readiness, and operational governance are challenges that have existed since the first enterprise technology implementations. What has changed is the cost of getting them wrong.

With deterministic systems, poor foundations produce visible failures that get fixed. With non-deterministic systems, they produce the quiet failures we described at the start: plausible outputs that slowly erode trust until the project is abandoned. The 40% cancellation rate Gartner predicts is not a wave of dramatic crashes. It is a tide of quiet shelving.

The organisations getting real value from agentic AI are not the ones with the most advanced models or the largest budgets. They are the ones that did the unglamorous foundational work first: understood where AI genuinely fits and where it does not, ensured their data and systems could actually support agentic consumption, and built the operational infrastructure to manage non-deterministic systems in production.

That is not a technology problem. It is a transformation problem. And transformation problems have been solvable for decades, if you are willing to do the work.

Getting started

If you are navigating any of these challenges and want a structured conversation about where your organisation stands, we would welcome that discussion.