Applied AI

Why most AI pilots never ship — and what the survivors do differently

Matthew Rogers, Founder & CEO, Preux · 22 May 2026 · 6 min read

By 2026 the numbers are common knowledge among buyers: the large majority of enterprise AI pilots never reach production, and Gartner expects over forty per cent of agentic AI projects to be cancelled by the end of 2027. The interesting part is why — because the cause is almost never the thing teams spend their time arguing about.

The graveyard is operational, not technical

When the failures are analysed, they cluster around operating-model problems, not model quality: unclear success criteria, insufficient access to the data and tools the agent actually needs, and evaluation drift — the system quietly getting worse with no one watching the right number. None of those are solved by a better model. They are solved by deciding, up front, what "working" means and how you will know.
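Watching the right number can be as simple as comparing recent scores against an agreed baseline. The sketch below is illustrative only: the function name `has_drifted`, the 0.05 tolerance and the weekly success rates are all assumptions made for the example, not figures from any real system.

```python
from statistics import mean


def has_drifted(baseline_scores, recent_scores, tolerance=0.05):
    """Return True when the recent mean falls more than `tolerance`
    below the baseline mean. The tolerance is an assumed value that
    would be agreed with the business in practice."""
    if not baseline_scores or not recent_scores:
        raise ValueError("both score lists must be non-empty")
    return mean(baseline_scores) - mean(recent_scores) > tolerance


# Hypothetical weekly task-success rates (fraction of cases handled correctly).
baseline = [0.92, 0.91, 0.93, 0.92]
recent = [0.88, 0.85, 0.84, 0.83]

if has_drifted(baseline, recent):
    print("ALERT: success rate has dropped below the agreed baseline")
```

The point is less the arithmetic than the habit: the check only exists if someone decided in advance which metric counts and what drop is unacceptable.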

88%: the share of enterprise AI proofs-of-concept that never reach wide-scale deployment, according to IDC's research. Of every thirty-three pilots a company launches, four graduate to production.

This is good news, oddly. A model-quality ceiling would be out of your control. An operating-model failure is entirely within it.

The pilots that survive are not the ones with the cleverest demos; they are the ones built like production software from the first week.

What the survivors do differently

They define success criteria before building. A specific, measurable target — handling time, error rate, deflection — agreed with the business, not discovered afterwards.
They build the evals first. Golden datasets and scorers that run in CI, so a regression is caught by a gate, not by a customer.
They solve data and tool access early. Most agents fail because they cannot reach the system that holds the answer — and giving them raw credentials is not the fix. A gateway is.
They keep humans on the decisions that matter. The agent moves the work; ownership of the consequential call stays legible and human.
They stay model-agnostic. The model is a swappable part. Coupling a business process to one vendor's model is a risk, not a strategy.
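Two of the habits above, evals that gate a build and a model that stays swappable, can be sketched together. This is a minimal illustration under stated assumptions: the `Model` interface, `StubModel`, the golden examples, the exact-match scorer and the 0.75 threshold are all hypothetical, and a real suite would use task-specific scorers and a real vendor client behind the same interface.

```python
from typing import Callable, Protocol


class Model(Protocol):
    """Hypothetical interface: anything with complete() can be plugged in."""

    def complete(self, prompt: str) -> str: ...


class StubModel:
    """Stand-in for a vendor client; returns canned answers (assumption)."""

    def __init__(self, answers: dict[str, str]) -> None:
        self._answers = answers

    def complete(self, prompt: str) -> str:
        return self._answers.get(prompt, "")


# Hypothetical golden dataset: (prompt, expected answer) pairs.
GOLDEN: list[tuple[str, str]] = [
    ("What is the refund window?", "30 days"),
    ("Which plan includes SSO?", "Enterprise"),
    ("Is there a free tier?", "Yes"),
    ("What currency are invoices in?", "GBP"),
]


def exact_match(expected: str, actual: str) -> float:
    """Score 1.0 for a case-insensitive exact match, else 0.0."""
    return 1.0 if expected.strip().lower() == actual.strip().lower() else 0.0


def run_evals(
    model: Model,
    dataset: list[tuple[str, str]],
    scorer: Callable[[str, str], float] = exact_match,
) -> float:
    """Return the mean score of the model over the golden dataset."""
    scores = [scorer(expected, model.complete(prompt)) for prompt, expected in dataset]
    return sum(scores) / len(scores)


def gate(model: Model, dataset: list[tuple[str, str]], threshold: float = 0.75) -> int:
    """Return a process exit code: 0 if the score meets the threshold, else 1."""
    return 0 if run_evals(model, dataset) >= threshold else 1


# A model that answers everything correctly, and a "regressed" one that does not.
good = StubModel(dict(GOLDEN))
regressed = StubModel({"What is the refund window?": "30 days", "Is there a free tier?": "yes"})

print("good model exit code:", gate(good, GOLDEN))
print("regressed model exit code:", gate(regressed, GOLDEN))
```

In a CI job the exit code would be passed to `sys.exit`, so a regression fails the build. Because `run_evals` depends only on the `complete` method, a different vendor's model can be swapped in and held to the same gate without touching the dataset or scorer.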

Image: a monitor showing a line chart, glowing in a dark room. Caption: Evaluation drift is invisible in a demo: the number that matters is the one nobody put on a screen.

When you get a demo and something works 90% of the time, that's just the first nine.
Andrej Karpathy, on the gap between demo and product

Our stance

We only take AI work we believe can reach production. That means the unglamorous parts — success criteria, evaluations, governance, observability, a twelve-month view of what good looks like — are the start of the engagement, not an afterthought. If a problem cannot be framed that way, the honest answer is that it is not ready, and we will say so. In 2026, that is a more credible position than enthusiasm.
