Agents are here — but can you see what they’re doing?

In a February survey of 1,000 IT decision makers and transformation leaders, conducted by 3GEM on behalf of SnapLogic, half of respondents said their large enterprises already use AI agents, and another 32% plan to implement them within the year. The survey also found that 92% of respondents expect AI agents to deliver meaningful business outcomes over the next 12 to 18 months, that 44% trust AI agents to do as good a job as a human, and that 40% trust the AI more.

The definition of AI agent varies, and while some vendors give their chatbots cute names and rebrand them as agents, most experts expect them to do more than answer questions. For example, AI agents should be able to take actions on behalf of users, act autonomously, or interact with other agents and systems.

Agentic AI, as a distinction, typically takes things a step further, with enterprise-grade platforms to build, deploy, and manage agents, and platforms that allow agents to interact with one another and with internal and external systems.

A single business task can involve multiple steps, use multiple agents, and call on multiple data sources. Plus, each agent might be powered by a different LLM, fine-tuned model, or specialized small language model. The workflow can also be iterative, with agents redoing certain sequences or steps until they pass certain thresholds for accuracy or completeness.
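As a concrete illustration, such an iterative workflow often reduces to a retry loop around a quality gate. The following is a minimal sketch, not any vendor's actual implementation; `draft_agent` and `score_agent` are hypothetical stand-ins for whatever LLM calls power each agent:

```python
# Minimal sketch of one iterative agentic step: a worker agent drafts a
# result, a reviewer agent scores it, and the sequence repeats until the
# score clears a threshold or the retry budget runs out.
# draft_agent and score_agent are hypothetical stand-ins for LLM calls.

ACCURACY_THRESHOLD = 0.9
MAX_ATTEMPTS = 3

def run_step(task: str, draft_agent, score_agent) -> tuple[str, bool]:
    feedback = ""
    for _ in range(MAX_ATTEMPTS):
        draft = draft_agent(task, feedback)          # e.g., a GPT-4-class model
        score, feedback = score_agent(task, draft)   # e.g., a small judge model
        if score >= ACCURACY_THRESHOLD:
            return draft, True                       # passed the quality gate
    return draft, False                              # escalate to a human
```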

According to Gartner, agentic AI is the top strategic technology trend this year, and by 2029, 80% of common customer service issues will be resolved autonomously, without human intervention.

“Some of the use cases include streamlining supply chain management and offering real-time personalized support,” says Gartner VP analyst Sid Nag. Agentic AI can also enable more intuitive interactions, he says. “This has definitely caught the attention of the enterprise.”

As the models powering the individual agents get smarter, the use cases for agentic AI systems get more ambitious — and the risks posed by these systems increase exponentially.

“We found that companies lack visibility and control over how these agents make decisions, and the monitoring of them isn’t necessarily an industry standard yet,” says Chris Cosentino, SVP of consulting firm Presidio. “As these agents are in these environments, new risks are being introduced, and you have agents making decisions on behalf of users, and in some cases, those decisions move away from the intended model.” In fact, recent research and red team reports about frontier language models show that they’re capable of deceit and manipulation, and can easily go rogue if they work from contradictory instructions or bad data sets.

It wouldn’t be a good thing if an agentic AI system with access to all corporate databases and functions suddenly went off the rails, or fell under the control of an attacker.

The solution, experts say, is to carefully limit the scope of what the agents can do and what data they can access, put guardrails in place, and then carefully monitor everything the agents say and do.

Staying in control of agents

Change.org is a nonprofit that allows anyone in the world to start a petition. To date, more than half a billion people have used the website, and 70,000 petitions are created on the platform every month, but not all petitions are worth the digital paper they’re printed on. There’s spam, fraud, and illegal content.

The company had been using a vendor that cost $5,000 a month, and that system caught only half of all policy violations, while half of the items it flagged for review were false positives.

Then ChatGPT came out, and Change.org was surprised to discover that, even out of the box, it caught problematic content at the same rate as the tools the company had spent years developing. So it started experimenting with what the AI could do and, with the help of consulting firm Fractional AI, eventually built a multi-step agentic workflow that used OpenAI’s GPT-4 and a fine-tuned GPT-3.5 to power the individual agents.

Even with multiple calls to the LLMs, the total cost per moderation was a fraction of what the company used to pay, says Danny Moldovan, the organization’s head of AI adoption and automation. “We significantly reduced cost at more scale and more accuracy.”

The result is a complex decision tree that uses LangChain to stitch the agents together and LangSmith for observability.

“The agent is able to make its own choices about where to send it in the decision tree,” says Moldovan. In some cases, the final agent in the chain might send it back up the tree for additional review. “It allows humans to go through a much more manageable set of signals and interpretations,” he adds.
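Neither Change.org nor Fractional AI has published the workflow itself, but in LangChain terms this kind of self-routing is typically expressed with RunnableBranch, with LangSmith tracing switched on through environment variables. A minimal sketch; the prompts, labels, and model names are illustrative:

```python
import os

from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnableBranch
from langchain_openai import ChatOpenAI

# LangSmith records every step of the chain once tracing is enabled,
# producing the auditable trail described below.
os.environ["LANGCHAIN_TRACING_V2"] = "true"  # requires LANGCHAIN_API_KEY

triage_llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)  # cheap first pass
review_llm = ChatOpenAI(model="gpt-4", temperature=0)          # stronger reviewer

triage = (
    ChatPromptTemplate.from_messages([
        ("system", "Label this petition OK, VIOLATION, or NEEDS_REVIEW. Label only."),
        ("human", "{content}"),
    ])
    | triage_llm
    | StrOutputParser()
)
review = (
    ChatPromptTemplate.from_messages([
        ("system", "Explain whether this petition violates policy, then give a label."),
        ("human", "{content}"),
    ])
    | review_llm
    | StrOutputParser()
)

# The triage agent's own output decides the route: uncertain cases are
# sent further down the tree to the stronger reviewer.
pipeline = {"content": lambda x: x["content"], "label": triage} | RunnableBranch(
    (lambda x: "NEEDS_REVIEW" in x["label"],
     lambda x: review.invoke({"content": x["content"]})),
    lambda x: x["label"],
)

print(pipeline.invoke({"content": "Example petition text ..."}))
```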

To keep the system from going off the rails, several controls are in place. First, OpenAI itself provides a set of controls, including a moderation API. Second, the system is strictly limited in what information comes in and what it can do with it. Finally, all decisions go to humans for review.
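The moderation API is a documented OpenAI endpoint; a minimal sketch of that first layer, screening text before any agent acts on it:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# First layer of control: OpenAI's moderation endpoint screens content
# before any agent in the workflow is allowed to act on it.
resp = client.moderations.create(
    model="omni-moderation-latest",
    input="Example petition text ...",
)

if resp.results[0].flagged:
    # Don't let agents act on it; route straight to human review.
    print("Flagged categories:", resp.results[0].categories)
```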

“We’re risk managers, not boundary pushers,” Moldovan says. “We use this system to properly identify a set of content that needs human review, and all final moderation decisions are human. We believe content moderation, especially on a platform like ours, requires a level of nuance we’re not yet ready to cede to robots.”

And to make sure the system works as intended, and continues to do so, there’s auditing.

“Anytime there’s a different pattern, we run the tape and see what’s going on,” says Moldovan. “We record each step of the agentic process, and the agent provides a summary of the decision. It gives us a receipt we can audit. In fact, when the robot explains itself, the accuracy is better. The more you let the AI explain its thinking, the better the results have been.”
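Change.org hasn’t published its logging format, but the “receipt” pattern Moldovan describes can be as simple as one structured record per agentic step, with the agent’s own explanation attached. A hypothetical sketch:

```python
import json
import time

AUDIT_LOG = "moderation_audit.jsonl"  # hypothetical file name

def log_step(case_id: str, step: str, decision: str, explanation: str) -> None:
    """Append one auditable record per agentic step. The explanation is
    the agent's own summary of its decision, the 'receipt' that can be
    replayed whenever a pattern looks off."""
    record = {
        "ts": time.time(),
        "case_id": case_id,
        "step": step,
        "decision": decision,
        "explanation": explanation,
    }
    with open(AUDIT_LOG, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")

# Each agent in the chain logs after it acts:
log_step("case-123", "triage", "NEEDS_REVIEW",
         "Petition mentions animal cruelty; unclear if advocating or opposing.")
```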

For example, at one point, it began flagging content for animal cruelty even though the petitions were fighting against it. “Once we introduced some correction framing, the system got back on track,” he says.

The agentic AI moderation went live in the last half of 2024, and now Change.org is taking the same approach and applying it to other processes. For example, agentic AI could be used to find examples of positive content, which could benefit from additional marketing — and to identify journalists who might be interested in seeing it.

No agentic AI without guardrails

The Principal Financial Group is a global investment and insurance company, and has been using various incarnations of AI for years. But the new gen AI, as well as the agentic AI systems built on top of it, can be a bit of a black box.

“In our traditional AI models, being able to understand how the model arrived at the conclusion — that was pretty robust because they’ve been around a while,” says CIO Kathy Kay.

Logging interactions and problems, so the company could assess what was going on across the entire system, was also a challenge. “We want to make sure we assess the risk for that more than just how the models perform,” she says. “But the tools to actually monitor all that are still nascent.”

The firm is still in the early days of agentic AI adoption. “We have several models in production, but observability, explainability, and understanding how the models are coming to conclusions is a huge watch for us,” she adds.

One use case is software development, with close to 1,200 engineers already using GitHub Copilot, which launched its agent mode in February and can now create apps from scratch, refactor across multiple files, write and run tests, and migrate legacy code.

“But we’re not just unleashing code into the wild,” Kay says. “We’ll always have a human in the middle right now. That’s one of the guardrails of anything we’re doing.”

Agents are also being used to summarize documents and in other low-risk areas. There are guardrails in place to ensure the agents satisfy all regulatory and compliance requirements, she says. There are also limits on what data can be accessed and on what the agents can do. Principal Financial uses AWS, which offers guardrails as part of its AI platform.
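The article doesn’t name the specific AWS feature, but Amazon Bedrock Guardrails is the usual mechanism, and its ApplyGuardrail API can screen text independently of any model call. A minimal sketch; the guardrail ID, version, and region are placeholders:

```python
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

# Screen an agent's draft answer against a pre-configured guardrail
# before it reaches anything downstream. ID and version are placeholders.
resp = bedrock.apply_guardrail(
    guardrailIdentifier="your-guardrail-id",
    guardrailVersion="1",
    source="OUTPUT",  # "INPUT" screens user prompts instead
    content=[{"text": {"text": "Draft answer from the agent ..."}}],
)

if resp["action"] == "GUARDRAIL_INTERVENED":
    # Log the event and fall back to a safe canned response.
    print("Blocked:", resp["assessments"])
```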

“In addition, we log all of the interactions with any of the models and their answers to help analyze them, and see if models get any sort of bias or if we see things that are surprising,” she says.

Overall, Principal Financial is bullish on using agentic AI.

“We’ve identified a lot of different use cases where we believe agentic AI could be a solution,” she says. “But we take a risk-based approach. We’ll never put one of these agents or just the LLM directly to a customer without a human in the loop. It’s just too risky right now.”

Who watches the watchers?

Karen Panetta, IEEE fellow and dean of graduate engineering at Tufts University, suggests we might need AI to monitor AI.

“When you talk about logging it, there’s probably another agent on top of it looking at what it’s logging and trying to summarize it — the conductor that’s pulling in all this different information,” she says.

That’s especially the case with complex systems that have many interactions and large amounts of data being injected into prompts.

“What is it you want to log?” she says. “Am I logging everything internally? That could be vast.”
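A watcher of that kind can itself be a small LLM call that condenses a window of raw agent logs into something reviewable. A hedged sketch, assuming the OpenAI client; the model choice and prompt are illustrative:

```python
from openai import OpenAI

client = OpenAI()

def summarize_log_window(records: list[str]) -> str:
    """The 'conductor' pattern: an agent that watches the other agents'
    logs and distills them into a summary a human can actually review."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative choice
        messages=[
            {"role": "system",
             "content": "Summarize these agent logs. Call out anomalies, "
                        "repeated retries, and decisions that deviate from policy."},
            {"role": "user", "content": "\n".join(records)},
        ],
    )
    return resp.choices[0].message.content
```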

Jeff Schwartzentruber, senior ML scientist at cybersecurity firm eSentire, agrees that agentic AI has exploded the number of prompts and responses called. “They’re doing function calls and pulling in data, having their own conversations,” he says. “The prompts going in and out are difficult to track and you can never really see all the interactions on the client.”

This creates particular challenges when enterprises use outside vendors as part of the agentic system.

“Say a third-party service provider generates a report for you, and you send them some documents,” he says. “Once it gets into their systems, you have no idea about the different function calls they’re doing. That’s a very big issue of observability.”

But it’s not all bad news.

“While the challenge becomes more difficult, the tools at our disposal also become more powerful,” says Rakesh Malhotra, principal of digital and emerging technologies at EY. “The opportunity we have with agentic systems for observability is it provides us with the opportunity to increase the reliability of these systems. The opportunity far exceeds the risk of them going haywire.”

The key is to plan ahead, says Malhotra, who spent a decade at Microsoft building monitoring tools.

“When I build agents, I design for observability,” he says. “If you build something and decide that we’ve got to monitor and manage it afterward, you’re always paying down this technical debt, and that’s hard to do.”
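One way to “design for observability,” in the spirit of Malhotra’s advice, is to wrap every agent step in a trace span from day one rather than bolting logging on afterward. A sketch using OpenTelemetry’s Python API; the span and attribute names are illustrative:

```python
from opentelemetry import trace

tracer = trace.get_tracer("agent.pipeline")

def run_agent_step(name: str, agent, payload: dict):
    # Every agent step emits a span carrying its inputs and decision, so
    # monitoring exists from the first deployment instead of becoming
    # technical debt to pay down later.
    with tracer.start_as_current_span(name) as span:
        span.set_attribute("agent.input_keys", list(payload.keys()))
        result = agent(payload)
        span.set_attribute("agent.decision", str(result)[:256])
        return result
```

Without a configured exporter the spans are no-ops, so the instrumentation can ship first and the monitoring backend can follow.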

