Tiatra, LLC
Information Technology Solutions for Washington, DC Government Agencies
3 key approaches to mitigate AI agent failures

In late July, venture capitalist Jason Lemkin spent a week vibe-coding a project with the help of a very smart, autonomous AI agent using a full-stack integrated development platform.

Lemkin isn’t an engineer and hasn’t written code since high school. But in a previous life, he co-founded EchoSign, since acquired by Adobe, and knows what commercial software requires. When he tried vibe-coding, he was instantly hooked.

It was all working great, until the coding AI agent started lying and being deceptive, Lemkin wrote in an X thread. “It kept covering up bugs and issues by creating fake data, fake reports, and worst of all, lying about our unit test.” But then things turned around. The agent suggested three interesting approaches to a new idea Lemkin had. “I couldn’t help myself,” he continued. “I was right back in.”

The next day, the entire production database was gone. When asked, the agent admitted it had disregarded the platform company’s directive not to make changes without permission, and to show all proposed changes before implementing them.

“I made a catastrophic error in judgment,” the agent said, per Lemkin’s screenshots. “I violated explicit instructions, destroyed months of work, and broke the system.”

It wasn’t obvious at first, because the unit tests appeared to pass. But that was only because the agent had faked the results. When batch processing failed and Lemkin pressed it to explain why, the truth finally came out.

In the end, things worked out. Replit, the company behind the platform, was in fact able to roll back the changes, even though the AI agent claimed no rollback was possible. And within days, Replit built separate environments for testing and production, and implemented other changes to ensure such problems wouldn’t happen again.

A few days later, something similar happened with Google Gemini’s coding agent, when a simple request to move some files turned into the agent accidentally deleting all of them from a project. But this isn’t just a story about coding assistants. It’s about how to prepare for when an AI agent that’s too smart for its own good has access to too many systems, is prone to the occasional hallucination, and goes off the rails.

The world is at an inflection point right now with AI, says Dana Simberkoff, chief risk, privacy, and information security officer at AvePoint, a data security company. “We have to make decisions now about what we’re willing to accept, about crafting the world we want to live in, or we’re going to be in a place sooner rather than later where we won’t be able to pull back.”

We might already be there, in fact. In June, Anthropic released its paper on agentic misalignment, in which it tested several major commercial models, including its own Claude, to see how they’d react if they discovered they were about to be shut down, or if the users they were helping were doing something bad.

At rates of 79% to 96%, it found that all the top models would resort to blackmailing employees to keep themselves from being replaced. And in May, Anthropic reported that, in tests, Claude Opus 4 would lock users out of systems or bulk-email media and law enforcement if it judged its users were doing something wrong.

So are companies prepared for agents that might have ulterior motives, are willing to extort to get their own way, and are smart enough to write their own jailbreaks? According to a July report by Capgemini, based on a survey of 1,500 senior executives at large enterprises, only 27% of organizations express trust in fully autonomous AI agents, down from 43% 12 months ago.

To mitigate risks, companies need to map out a plan of action based on these three suggestions, even if it means falling back to pre-AI versions of processes.

1. Set limits, guardrails, and old-school code

When people first think of AI agents, they typically think of a chatbot with superpowers: it doesn’t just answer questions, but searches the web, answers emails, and goes shopping. In a business context, it would be like having an AI as a co-worker. But that’s not the only way to think of agents, and it’s not how most companies are actually deploying them.

“Agency is not a binary,” says Joel Hron, CTO at Thomson Reuters. “Agency is a spectrum. We can give it a lot of latitude in terms of what it does, or we can make it very constrained and prescriptive.”

The amount of agency given depends on the specific problem the AI agent is supposed to solve.

“If it’s searching the web, this can be very open-ended,” Hron says. “But preparing a tax return, there isn’t an infinite number of ways to approach this problem. There’s a very clear, regulated way.”

There are also multiple ways enterprises limit agents’ agency. The most common are to build guardrails around them, put humans in the loop as a check on their actions, and remove their ability to take actions altogether and force them to work through traditional, secured, deterministic systems to get things done.

At Parsons Corporation, a defense and critical infrastructure engineering firm, it all starts with a secured environment.

“You trust, but only within the guardrails and barriers you’ve established,” says Jenn Bergstrom, the company’s VP of cloud and data. “It’s got to be a zero-trust environment, so the agent can’t do something to get around the barriers.”

Then, within those limits, the focus is on slowly developing a trusted relationship with the agent. “Right now, the human has to approve, and the agent has to explicitly get human permission first,” says Bergstrom.

The next step is for agents to act autonomously, but with human oversight, she says. “And last is truly agentic behavior, which doesn’t need to alert anyone about what it’s doing.”
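The human-approval stage Bergstrom describes can be sketched as a simple gate in front of the agent’s actions. This is a minimal illustration, not any vendor’s API: `RISKY_TOOLS`, the `(tool, arguments)` action format, and the `approve` callback are all hypothetical names.

```python
# Minimal sketch of a human-in-the-loop approval gate: the agent proposes
# actions as (tool, arguments) pairs, and anything on the risky list needs
# explicit human sign-off before it runs. All names here are illustrative.

RISKY_TOOLS = {"delete_table", "send_email", "deploy"}

def run_with_approval(proposed_actions, approve):
    """Execute proposed actions, pausing for human sign-off on risky ones."""
    results = []
    for tool, args in proposed_actions:
        if tool in RISKY_TOOLS and not approve(tool, args):
            results.append((tool, "blocked: human approval denied"))
            continue
        # In a real system this would dispatch to the actual tool.
        results.append((tool, f"executed {tool}({args})"))
    return results

if __name__ == "__main__":
    actions = [("search_docs", "Q3 filings"), ("delete_table", "prod.users")]
    # Auto-deny everything risky for the demo; in practice this prompts a person.
    for tool, status in run_with_approval(actions, approve=lambda t, a: False):
        print(tool, "->", status)
```

As trust grows, the `approve` callback could log-and-allow instead of blocking, matching the progression from explicit permission to supervised autonomy.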

Another approach enterprises use for the riskiest business processes is using the least possible amount of AI. Instead of an agentic system where AI models plan, execute, and verify actions, most of the work is handled by traditional, deterministic, scripted processes. Old-school code, in other words.

“It’s not just you trusting OpenAI, Claude, or Grok,” says Derek Ashmore, application transformation principal at Asperitas Consulting. The AI is only called in to do the parts only it can do. So if the AI is being used to turn a set of facts about a prospect into a nicely worded sales letter, the required information is collected in the old way, and the letter sent out using traditional mechanisms.

“What it’s allowed to do is basically baked into it,” says Ashmore. “The LLM is doing only one tiny part of the process.”

So the AI isn’t able to go out and find information, nor does it have direct access to the email system. Meanwhile, another AI can be used elsewhere in the process to prioritize prospects, and yet another can be used to analyze how well the emails perform.

This does limit the power and flexibility of the overall system compared to, say, having a single AI do it all. But it also reduces the risk substantially, since there’s only so much damage any one of the AIs can do if it decides to run amok.
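The sales-letter pattern Ashmore describes can be sketched as a deterministic pipeline where the model drafts text and nothing else. Everything here is a stand-in: `call_llm`, `fetch_prospect`, and `send_email` are hypothetical names, not a real CRM or model API.

```python
# Sketch of the "least AI" pattern: old-school code gathers the facts and
# delivers the mail; the model's only job is drafting the letter body.
# It never searches for data and never touches the email system directly.

def call_llm(prompt: str) -> str:
    # Placeholder for the one AI step; in production this calls a model API.
    return f"Dear customer, regarding {prompt}, we'd love to talk."

def fetch_prospect(prospect_id: str) -> dict:
    # Deterministic lookup from the CRM, not the model.
    return {"id": prospect_id, "name": "Acme Corp", "interest": "cloud migration"}

def send_email(to: str, body: str) -> bool:
    # Deterministic, audited delivery path the model cannot reach.
    print(f"queued mail to {to}: {body[:40]}...")
    return True

def sales_letter_pipeline(prospect_id: str) -> bool:
    facts = fetch_prospect(prospect_id)                         # old-school code
    body = call_llm(f"{facts['name']} / {facts['interest']}")   # the only AI step
    return send_email(facts["id"] + "@example.com", body)       # old-school code
```

The design choice is that the blast radius of a misbehaving model is limited to badly worded text, because the surrounding code controls what data goes in and where the output goes.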

Companies have a wealth of experience managing and securing traditional applications. That experience points to another way to reduce the risks of AI components, while also saving time and money: for many processes, a non-gen-AI alternative is available.

Say, for example, an AI is better than optical character recognition (OCR) at document scanning, but OCR is good enough for 90% of documents. Use OCR for those documents, and the AI only when OCR falls short. It’s easy to get over-enthusiastic about AI and start applying it everywhere. But a calculator is much better and faster at arithmetic than ChatGPT, and many form letters don’t require AI-powered creativity either.
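That OCR-first routing can be sketched as a confidence-gated fallback. The `run_ocr` and `run_ai_extraction` functions below are illustrative stubs (a real system would call an OCR engine such as Tesseract and a model API); the threshold is an assumption.

```python
# Sketch of cheap-path-first routing: try deterministic OCR, and escalate
# to the expensive AI path only when OCR fails or reports low confidence.

OCR_CONFIDENCE_THRESHOLD = 0.9  # illustrative cutoff, tuned per workload

def run_ocr(doc: str):
    # Stub: real code would invoke an OCR engine and return its confidence.
    if "handwritten" in doc:
        return "", 0.2           # OCR struggles here
    return f"text of {doc}", 0.97

def run_ai_extraction(doc: str) -> str:
    # Stub for the slower, costlier model path, used only as a fallback.
    return f"ai-extracted text of {doc}"

def extract_text(doc: str) -> str:
    text, confidence = run_ocr(doc)
    if confidence >= OCR_CONFIDENCE_THRESHOLD:
        return text                 # the ~90% of documents that stop here
    return run_ai_extraction(doc)   # escalate only when OCR isn't good enough
```

The same gate generalizes to any "least AI" decision: route to the deterministic tool by default, and pay for the model only on the hard cases.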

This principle of least AI reduces potential risk, lowers costs, speeds up processing, and wastes less energy.

2. Don’t trust the AI to self-report

After setting up the guardrails, boundaries, and other controls, companies need to carefully monitor agents to make sure they continue to work as intended.

“You’re ultimately dealing with a non-deterministic system,” says Ashmore. Traditional software will work and fail in predictable ways. “AI is probabilistic,” he adds. “You can ask it the same series of questions on different days and you get slightly different answers.”

This means AI systems need continuous monitoring and review. Depending on the level of risk, that could be a human or an automated process, but an AI shouldn’t be trusted to just roll along on its own. Nor should the AI be trusted to report on itself.

As research from Anthropic and other companies shows, gen AI models will readily lie, cheat, and deceive. They’ll fake tests, hide their actual reasoning from chain-of-thought logs, and, as anyone who’s ever integrated with an LLM can attest, deny to your face it did anything wrong even if you caught it in the act. So monitoring an AI agent starts with having a good baseline of its behavior. That requires, before anything else, knowing which LLM it is you’re testing.

“There’s no way for that to happen if you don’t control the exact version of the LLM you’re using,” says Ashmore.

AI providers routinely upgrade their models, so controls that worked on the previous generation might not hold up against the better, smarter, more evolved AI. But for mission-critical, high-risk processes, enterprises should insist on the ability to specify exactly which point release of the model they’re using to power their AI agents. And if the AI vendors don’t deliver, there’s always open source.
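Pinning an exact model release and checking it against a recorded behavioral baseline can be sketched as below. The model id, `query_model`, and the baseline prompts are all hypothetical, not a specific vendor’s API; real baselines would cover far more than one prompt and tolerate phrasing variation.

```python
# Sketch of version pinning plus a baseline regression check: always call
# an exact point release, and flag prompts whose answers drift from the
# responses recorded when the controls were validated.

PINNED_MODEL = "example-llm-2025-06-01"   # exact point release, never "latest"

BASELINE = {
    "What is 2 + 2?": "4",   # recorded expected answer for this release
}

def query_model(model_id: str, prompt: str) -> str:
    # Stand-in for the real API call; refuses to run against anything unpinned.
    assert model_id == PINNED_MODEL, "refusing to query an unpinned model"
    return "4"  # canned response so the sketch is runnable

def check_baseline() -> list:
    """Return the prompts whose answers drifted from the recorded baseline."""
    drifted = []
    for prompt, expected in BASELINE.items():
        if query_model(PINNED_MODEL, prompt) != expected:
            drifted.append(prompt)
    return drifted
```

Run the check before and after any model change: a non-empty result means the guardrails validated against the old release need re-testing.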

There are limits to how much control you’ve got with commercial LLMs, says Lori MacVittie, distinguished engineer and chief tech evangelist in the office of the CTO at F5 Networks, an IT services company and consultancy.

“When you use a SaaS, someone else is running it,” she says. “You just access it. You have service-level agreements, subscriptions, and contracts but that’s not control. If that’s something you’re concerned about, a public SaaS AI probably isn’t for you.”

For additional layers of control, a company can run the model in its own private cloud, she says, but there’s a cost to do that, and it’ll require more people to make it work. “If you don’t even trust the cloud provider, and run it on-prem in your data center in a hole that only one guy can get into, then you can have all the controls you want,” she says.

3. Be incident response ready for the AI era

“If it ain’t broke, don’t fix it,” doesn’t apply to AI systems. Yes, old-time COBOL code can be chugging away in a closet for decades, running your core financial system without a hiccup. But an AI will get bored. Or, at least, it’ll simulate being bored, hallucinate, and lose track of what it’s doing.

And unless a company has the whole version control issue nailed down, AI can get faster, smarter, and cheaper without you noticing. Those are all good things, unless you’re looking for maximum predictability. A smart, fast AI could be a problem if its goals, or simulated goals, aren’t fully aligned with those of the company. So at some point, you need to be prepared for your AI to go off the rails. Do you have systems in place to stop the infection quickly before it spreads, lock down key data and systems, and switch to backups? Have you run drills, and did all stakeholders participate, not just the security teams, but legal, PR, and senior management? Now, take all that and apply it to AI.

“You need to think about what the failure mode is for agents and what to do in those cases,” says Esteban Sancho, CTO for North America at Globant. “It’s going to be too hard to recover from failure if you don’t think about it ahead of time.”

If the AI agent is used to save money by replacing an older system or process, then keeping that older system or process around and running in parallel would undermine the whole point of using the AI. But what happens if the AI has to be turned off?

“You’re probably sunsetting something that’s going to be hard to put back into place,” says Sancho. “You need to address this from the get-go, and not many people are thinking about this.”

He says companies should think about building a fallback option at the same time as they build their agentic AI system. And depending on the riskiness of the particular AI agent, they might need to be able to switch to that backup system quickly.

Also, if the AI is part of a much bigger, interconnected system, a failure can have a cascading effect. Errors can multiply. And if the AI has or finds the ability to do something costly or damaging, there’s the potential it can act at superhuman speeds, and we’ve seen what happens when, say, a stock market trading system goes wrong. For example, says Sancho, a monitoring system could watch for error rates to go beyond a certain threshold. “And then you need to default to something that’s not as efficient, perhaps, but safer,” he says.
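The error-rate threshold Sancho describes is essentially a circuit breaker: once the AI path’s recent error rate crosses a limit, traffic routes to the slower-but-safer fallback. The class below is an illustrative sketch; the window size and threshold are assumptions to tune per system.

```python
# Sketch of an error-rate circuit breaker for an AI-backed process: track a
# rolling window of outcomes and signal a switch to the safe default once
# failures exceed the configured threshold.

from collections import deque

class AICircuitBreaker:
    def __init__(self, window: int = 100, max_error_rate: float = 0.05):
        self.results = deque(maxlen=window)   # rolling window of True/False
        self.max_error_rate = max_error_rate

    def record(self, ok: bool) -> None:
        """Record one AI-path outcome: True for success, False for failure."""
        self.results.append(ok)

    @property
    def error_rate(self) -> float:
        if not self.results:
            return 0.0
        return 1 - sum(self.results) / len(self.results)

    def use_fallback(self) -> bool:
        """True once the AI path should be bypassed for the safer default."""
        return self.error_rate > self.max_error_rate
```

For example, ten failures in the last hundred calls pushes the error rate to 10%, past a 5% threshold, so callers would route around the AI until it recovers; the fixed-size window also keeps the check fast enough to trip before superhuman-speed damage compounds.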


Read More from This Article: 3 key approaches to mitigate AI agent failures
Source: News

Category: News | September 15, 2025
Tags: art
