AI tech churn is becoming a mounting problem for enterprises, which find themselves continually rebuilding their AI infrastructures in response to evolving AI capabilities and their own shifting AI strategies.
According to a survey from AI data quality vendor Cleanlab, 70% of regulated enterprises — and 41% of unregulated organizations — replace at least part of their AI stacks every three months, with another quarter of both regulated and unregulated companies updating every six months.
The survey, of more than 1,800 software engineering leaders, underscores how organizations still struggle both to keep up with the ever-changing AI landscape and to deploy AI agents into production, says Cleanlab CEO Curtis Northcutt.
Just 5% of those surveyed have AI agents in production or plan to put them into production soon. Based on the surveyed engineers’ answers about technical challenges, Cleanlab estimates that only 1% of represented enterprises have deployed AI agents beyond the pilot stage.
“Enterprise agents are totally not here, and they’re nowhere near what people are saying,” Northcutt says. “There are literally hundreds of startups that have tried to sell components of AI agents for enterprises and have failed.”
The speed of evolution
Even without full production status, the fact that so many organizations are rebuilding components of their agent tech stacks every few months demonstrates not only the speed of change in the AI landscape but also a lack of faith in agentic results, Northcutt claims.
Changes in the agent tech stack range from something as simple as updating the underlying AI model’s version, to moving from a closed-source to an open-source model or changing the database where agent data is stored, he notes. In many cases, replacing one component in the stack sets off a cascade of changes downstream, he adds.
“When you go to an open-source model that you run on your own server, your whole infrastructure changes, and you have to deal with a lot of things you weren’t dealing with before, and then you might go, ‘That was actually worse than we expected,’” Northcutt says. “So you go back to a different model, but then you switch to cloud, and the cloud API is actually totally different than the OpenAI API, because they are not in agreement.”
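To make the cascade Northcutt describes concrete, here is a minimal, hypothetical sketch of a provider-adapter layer in Python. None of the class or method names come from a real vendor SDK; the idea is simply that agent code depends on one internal interface, so moving between a hosted API and a self-hosted model changes only the adapter, not everything downstream.

```python
# Hypothetical provider-adapter sketch. The rest of the agent stack
# codes against ModelAdapter, so swapping a hosted model for a
# self-hosted one touches only the adapter, not downstream code.
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class Completion:
    text: str
    model: str


class ModelAdapter(ABC):
    @abstractmethod
    def complete(self, prompt: str) -> Completion: ...


class HostedModelAdapter(ModelAdapter):
    """Would wrap a cloud vendor's SDK; stubbed here for illustration."""
    def complete(self, prompt: str) -> Completion:
        # Real code would call the vendor's API and map its response
        # shape (which differs across vendors) into Completion.
        raise NotImplementedError("wire up the vendor SDK here")


class LocalModelAdapter(ModelAdapter):
    """Would wrap a self-hosted inference server; stubbed here."""
    def complete(self, prompt: str) -> Completion:
        raise NotImplementedError("wire up the local server here")


def run_agent_step(model: ModelAdapter, prompt: str) -> str:
    # Downstream agent code never sees vendor-specific response formats.
    return model.complete(prompt).text
```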
Cozmo AI, a voice-based AI provider, has also observed a pattern of frequent changes in agent tech stacks, says Nuha Hashem, cofounder and CTO there. The Cleanlab survey matches the churn Cozmo sees across regulated environments, she says.
“Many client teams swap out parts of their stack every quarter because the early setup is often a patchwork that behaves one way in testing and a different way in production,” she adds. “A small shift in a library or a routing rule can change how the agent handles a task, and that forces another rebuild.”
While the speed of AI evolution can drive frequent rebuilds, part of the problem lies in the way AI models are tweaked, she says.
“The deeper issue is that many agent systems rely on behaviors that sit inside the model rather than on clear rules,” Hashem explains. “When the model updates, the behavior drifts. When teams set clear steps and checks for the agent, the stack can evolve without constant breakage.”
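What such explicit checks might look like in practice: a minimal sketch, with invented names and a made-up refund scenario, in which the agent's output is validated against fixed rules before any action is taken. Behavior drift from a model update then surfaces as a failed check rather than a silent change in production.

```python
# Hypothetical illustration of "clear steps and checks": explicit
# rules sit outside the model, so a model update cannot quietly
# change what the agent is allowed to do.
from dataclasses import dataclass

MAX_AUTO_REFUND = 100.0  # assumed policy limit, for illustration only


@dataclass
class RefundDecision:
    approve: bool
    amount: float
    reason: str


def check_decision(decision: RefundDecision) -> RefundDecision:
    # Rule the model cannot override: large refunds always escalate
    # to a human, regardless of what the model emitted.
    if decision.approve and decision.amount > MAX_AUTO_REFUND:
        raise ValueError("refund exceeds auto-approval limit; escalate")
    if not decision.reason.strip():
        raise ValueError("decision is missing a reason; reject output")
    return decision
```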
Low levels of faith
Another problem seems to be low satisfaction with existing components of AI stacks. The Cleanlab survey asked about user experience with five components of agent infrastructure, including agent orchestration, fast inference, and observability. Only about a third of those surveyed say they are happy with any of them, with roughly 40% saying they are looking for alternatives for each.
Just 28% of respondents are satisfied with the agent security and guardrails they have in place, signaling a lack of trust in agent results.
While the Cleanlab survey may paint a bleak picture of the current state of agents, several AI experts say its conclusions appear accurate.
Jeff Fettes, CEO of AI-based CX provider Laivly, isn’t surprised that many enterprises rebuild part of their agent stacks every few months. He sees a similar phenomenon.
“What separates out the more successful organizations with respect to AI is their ability to iterate,” he says. “What you’re seeing there is companies haven’t let go of the old way of doing things, and they’re really struggling to keep up with how fast AI itself as a technology is evolving.”
For most other major IT platforms, CIOs go through a long evaluation and deployment process, but the rate of AI advancement has destroyed that timeline, he says.
“IT departments used to go through big arcs of planning, and then transform their tech stack, and it would be good for a while,” Fettes says. “Right now, what they’re finding is they get halfway through — or a small way through — the planning process, and the technology has moved so far they have to start over.”
Fettes sees many of his customers scrapping AI pilots as the technology evolves.
“It’s creating a situation where a lot of companies have to abandon existing use cases,” he says. “We know we’re obsoleting our own technology in a very short period of time.”
In addition to the fast-moving technology, the AI marketplace offers so many choices that it’s difficult for CIOs to keep up, Fettes says.
“There have been hundreds and hundreds of new companies that have flooded into the space,” he adds. “There’s a lot of stuff that doesn’t work. Sometimes it’s hard to figure it out.”
The risks of staying put
Tapforce, an app development firm, also sees enterprises rebuilding their AI stacks every few months, driven by constant evolution, says Artur Balabanskyy, cofounder and CTO there.
“What works now may become suboptimal later on,” he says. “If organizations don’t actively keep up to date and refresh their stack, they risk falling behind in performance, security, and reliability.”
Constant rebuilds don’t have to create chaos, however, Balabanskyy adds. CIOs should take a layered approach to their agent stacks, he recommends, with robust version control, continuous monitoring, and a modular deployment approach.
“Modular architectures allow leaders to stabilize the full stack as well as swap out components when necessary,” he says. “Guardrails, automated testing, and observability are all essential to ensure production systems remain reliable even as tech evolves.”
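One way to read the automated testing Balabanskyy describes is as a behavioral regression suite that pins down what the agent must do before and after any component swap. A hedged, pytest-style sketch, with invented names standing in for real agent code:

```python
# Hypothetical regression tests that pin agent behavior, so swapping
# a model, library, or routing rule fails loudly in CI instead of
# quietly changing production behavior.
import pytest


def classify_ticket(text: str) -> str:
    """Stand-in for the real agent routing step under test."""
    return "billing" if "invoice" in text.lower() else "general"


@pytest.mark.parametrize("text,expected", [
    ("Where is my invoice?", "billing"),
    ("How do I reset my password?", "general"),
])
def test_routing_is_stable(text, expected):
    # These cases encode agreed behavior; a component swap that
    # breaks them is caught before deployment.
    assert classify_ticket(text) == expected
```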
Cleanlab’s Northcutt recommends IT leaders go through a rigorous process, including a detailed prerequisite description of what an agent is expected to do, before deployment.
“People are like, ‘Let’s have AI do customer support,’ and that’s a very high-level thing,” he says. “The number one step is, ‘Let’s define very precisely, exactly, where does AI start? What do we expect good performance to look like? What do we expect it to accomplish? What tools is it actually going to use?’”
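Northcutt's “define very precisely” step could be captured as a written agent spec that lives in version control. A hypothetical sketch of what that might contain; the fields and values here are invented for illustration, not drawn from Cleanlab's process:

```python
# Hypothetical agent spec: scope, tools, and success criteria are
# written down explicitly before deployment, not implied by prompts.
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentSpec:
    task: str                 # where the AI starts and stops
    entry_point: str          # what triggers the agent
    allowed_tools: tuple      # the only tools it may call
    success_criteria: str     # what good performance looks like
    escalation: str           # where it hands off to a human


CUSTOMER_SUPPORT_AGENT = AgentSpec(
    task="answer tier-1 billing questions",
    entry_point="inbound ticket tagged 'billing'",
    allowed_tools=("lookup_invoice", "issue_credit_under_limit"),
    success_criteria="resolves ticket without human edits 90% of the time",
    escalation="any refund request or unrecognized intent",
)
```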
The survey results suggest that widespread deployment of AI agents may still be years away, Northcutt says. He predicts the estimated 1% of organizations with agents in production will rise to 3% or 4% in 2027, with true agents in production reaching 30% of enterprises in 2030.
He believes AI agents will lead to major benefits, but he urges evangelists to cut back their rhetoric in the meantime.
“We can now use AI to get better at our jobs, but the whole idea of enterprise AI automating everything and agents in every product, it’s coming,” he says. “If we can just kind of keep it cool, guys, and set reasonable expectations, then all this money invested might actually play out.”