According to Gartner, an agent doesn’t have to be an AI model. It can also be a software program or another computational entity — or a robot. When multiple independent but interactive agents are combined, each capable of perceiving the environment and taking actions, you get a multiagent system.
And, yes, enterprises are already deploying them. NASA’s Jet Propulsion Laboratory, for example, uses multiagent systems to ensure its clean rooms stay clean so nothing contaminates flight hardware bound for other planets.
Starting in 2018, the agency used agents, in the form of Raspberry Pi computers running biologically inspired neural networks and time series models, as the foundation of a cooperative network of sensors.
“It wasn’t just a single measurement of particulates,” says Chris Mattmann, NASA JPL’s former chief technology and innovation officer. “It was many measurements the agents collectively decided was either too many contaminants or not.”
The previous state-of-the-art sensors cost tens of thousands of dollars, adds Mattmann, who’s now the chief data and AI officer at UCLA, and they were extremely sensitive. The new multiagent system’s sensors could be built for just hundreds of dollars each, though they weren’t as sensitive as the more expensive ones.
“The way to make up for that sensitivity was they had to work together, and share data and knowledge the way an agent would,” he says. It’s a system still being used today. But the evolution of AI means that agentic systems can now be used for a wider variety of problems. “The flashpoint moment is that rather than being based on rules, statistics, and thresholds, now these systems are being imbued with the power of deep learning and deep reinforcement learning brought about by neural networks,” Mattmann says. “The systems are fed the data, and trained, and then improve over time on their own.”
Adding smarter AI also adds risk, of course. “At least with things like ChatGPT, DALL-E 3 and Midjourney, there’s constant interaction with humans,” he says, adding that with agentic AI, there’s potential for autonomous decision making. “The big risk is you take the humans out of the loop when you let these into the wild.”
NASA isn’t alone in deploying these early kinds of multiagent systems; companies that deal with operations and logistics have used such technologies for years.
“The notion of agents is actually very old,” confirms Anand Rao, AI professor at Carnegie Mellon University. “I used to work on multi-agent systems in the 1980s. We were building systems for the Space Shuttle, which was too complex for one system.” Over time, the agents have become more independent, he says, acting based on goals and objectives.
Then there’s Hughes Network Systems, a satellite communications and managed services provider that has been using agentic AI for many years, says Dan Rasmussen, the company’s SVP and GM for the North America Enterprise Division. He and his team use it to address service degradation issues. “We continuously feed network and customer equipment stats into our algorithms, allowing them to adapt to changing conditions and identify anomalies,” he says.
More recently, Hughes has begun building software to automate application deployment to Google Cloud Platform and create CI/CD pipelines, using agents to generate code.
“Our goal is to analyze logs and metrics, connecting them with the source code to gain insights into code fixes, vulnerabilities, performance issues, and security concerns,” he says.
Hughes has already completed a successful proof of concept (PoC) for these use cases and is now developing them into a product.
The company has also developed its own internal agentic AI tools, uses different agentic frameworks for different projects, including Microsoft AutoGen, and is exploring crewAI and LlamaIndex.
“Hughes uses these tools to support the operation of networks servicing over one million remote nodes,” he says. At first, the tools just made recommendations for humans to review and act on. “After observing this system for a few months,” he continues, “Hughes allowed the process to run automatically and report on the implemented changes. We use the same review process for any new enhancements.”
But alarms kick in and stop the automated behaviors if the number of anomalies detected or remediations implemented goes beyond statistical norms, so the automation doesn’t go rogue and impact the network.
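Rasmussen didn’t describe the mechanism behind those alarms, but a minimal sketch of such a statistical circuit breaker, with hypothetical remediation counts and a conventional three-standard-deviation threshold, might look like this:

```python
import statistics

def within_norms(history, current, k=3.0):
    """True if the current count stays within k standard deviations
    of the historical mean (a simple statistical norm)."""
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)
    return abs(current - mean) <= k * stdev

# Hypothetical hourly remediation counts from the past week
history = [4, 6, 5, 7, 5, 6, 4, 5]
current = 23

if not within_norms(history, current):
    print("Alarm: remediation rate outside norms; pausing automation")
```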
Proliferation of agentic AI
According to a Capgemini survey of 1,100 executives at large enterprises, 10% of organizations already use AI agents, more than half plan to use them in the next year, and 82% plan to integrate them within the next three years. Plus, 71% of respondents said AI agents will increase automation in their workflows, 64% said they’ll improve customer service and satisfaction, and 57% said the potential productivity improvements outweighed the risks.
Furthermore, among the companies that plan to use AI agents, the biggest use case is software development: 75% said they plan to use AI agents to generate, evaluate, and rewrite code. It makes sense that development is the top agentic AI use case, says Babak Hodjat, CTO of AI at Cognizant.
“Most of us in AI are software engineers,” he says. “Also, software engineering is easier to verify, so you can have semi-supervised systems that can check each other’s work. That’s the first one that’s being tackled.”
More power, more responsibility
Blockbuster film and television studio Legendary Entertainment has a lot of intellectual property to protect, and it’s using AI agents to do so, says Dan Meacham, the company’s CISO. “We leverage agentic AI across various verticals in our security programs,” he says. For example, AI agents use open source intelligence to hunt for movie leaks and piracy across social media and the dark web. He declined to say which specific frameworks were used to build the systems, but says the company leverages an enterprise OpenAI-like solution that enables some business process automations.
When it comes to security, though, agentic AI is a double-edged sword with too many risks to count. “We do lose sleep on this,” he says. Many of the risks are the same as those of gen AI in general, since gen AI is what powers agentic systems. That means Meacham worries about creative content and assets being leaked through AI applications, and about AI generating infringing content.
Then there’s the risk of malicious code injections, where the code is hidden inside documents read by an AI agent, and the AI then executes the code.
“This attack vector isn’t new, as this is a classic SQL injection or database stored procedure attack,” says Meacham. “There are emerging mitigation techniques that leverage data loss prevention-type patterns to limit or exclude data types from being learned. Plus, some emerging solutions claim to inspect instruction sets as modules learn and grow to help prevent injections, hallucinations, and malicious code. By Q1 of 2025, we should have some real contenders in protecting AI that are more than just existing DLP and code review enhancements layered in front of or on top of LLMs.”
Sinclair Schuller, partner at EY, says there are a few main strategies to secure multi-agent AI, on top of guardrails already set up for underlying gen AI models. For example, an agent can have a particular personality, he says: “Is the agent long-lived or short-lived? Is it allowed to collaborate with other agents?”
Multiagent systems can also use consensus, asking peer agents to evaluate the work of others, or use adversarial agents to gut-check the original response, create a different one, and compare the two results. Enterprises also need to think about how they’ll test these systems to ensure they’re performing as intended. “That’s the most difficult thing,” he says.
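Schuller didn’t share an implementation, but the consensus pattern he describes can be sketched with two worker agents and a judge. This toy version uses OpenAI’s Python SDK; the model choice and prompts are assumptions:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def ask(prompt, temperature=0.7):
    resp = client.chat.completions.create(
        model="gpt-4o",  # any chat model would do
        messages=[{"role": "user", "content": prompt}],
        temperature=temperature,
    )
    return resp.choices[0].message.content

question = "Summarize the refund policy for cancelled flights."

# Two independent worker agents produce candidate answers
answer_a = ask(question)
answer_b = ask(question)

# A judge agent gut-checks one answer against the other
verdict = ask(
    "You are a reviewer. Two agents answered the same question.\n"
    f"Question: {question}\nAnswer A: {answer_a}\nAnswer B: {answer_b}\n"
    "If they agree, return the better answer. If they conflict, reply "
    "ESCALATE so a human can review.",
    temperature=0,
)
print(verdict)
```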
Agentic systems can also be set up so that the scope of what the agents can do is limited, he says, and you have to have humans in the loop. Insurance company Aflac is making sure that’s the case, maintaining human oversight over the AI instead of letting it act completely autonomously. That means keeping humans engaged, says CIO Sheila Anderson, even as the company adopts an accelerated prototyping plan for agentic AI projects, including ones that simplify customer service and optimize employee workflows. “We’re taking the same approach to agentic AI as we have with gen AI and other emerging technologies,” she says.
That means projects are evaluated for the amount of risk they involve. Low-risk use cases involve back-office applications or process enablement, and affect the way people do their jobs. Medium-risk projects involve internal data and internal uses. And high-risk initiatives involve external users or protected data.
The company is still early in the process, she says, with Aflac’s innovation team currently evaluating use cases it will then explore in PoCs in the near term. That involves evaluating several models and platforms for agentic AI, including home-grown options.
“Our higher-level AI strategy is positioning us for more purpose-built AI, which could include different models and platforms depending on how we intend to apply the technology to the value chain,” Anderson says.
Another risk of agentic AI is its potential impact on the human workers in the loop, since it can handle more complex business processes.
“The consideration of worker impact and higher worker productivity is both an opportunity and a risk,” she adds. The plan is to shift knowledge workers to higher-value tasks, or to use the AI to help them make better decisions and improve the customer experience. So while Aflac is excited about the benefits AI can provide, she says, the company remains focused on supporting customers with the personal touch they expect and often need.
Agentic AI in the early stages
Aflac isn’t the only company just beginning its agentic AI journey. Centric Consulting, for instance, works with a midsized regional property and casualty insurance company that uses two different vendors to collect customer emails related to insurance claims and process those documents.
The company was paying half a million a year in license fees, says Joseph Ours, Centric’s director of AI solutions, and there was still a lot of manual work involved. So replacing the process with an agentic system could save the company around $1 million a year, making it worthwhile to invest in development costs instead of waiting for either one of the vendors to improve their products.
“And these two vendors were big and unlikely to buy each other, so you’re not going to get synergy,” says Ours.
Centric then built a custom agentic framework, with an LLM-agnostic back end powering the agents. Today, the platform can support OpenAI, OpenAI on Azure, Google’s Gemini, or Anthropic’s Claude. For example, OpenAI’s GPT-4o, which is multimodal, is used to handle scanned documents or images such as photographs of damage. And when a gen AI model would be overkill, such as disassembling an email into constituent pieces or looking up policy numbers, the platform uses software or function calls to handle the tasks instead. The system has passed the PoC stage and is now in pilot.
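Centric hasn’t published its framework, but the routing idea is straightforward: deterministic code handles deterministic tasks, and a model is called only where judgment is needed. A sketch, with a hypothetical policy-number format and a placeholder model wrapper:

```python
import re

def extract_policy_number(email_body: str):
    """Deterministic lookup: a regex, not an LLM, pulls the policy number.
    The three-letters-plus-seven-digits format is hypothetical."""
    match = re.search(r"\b[A-Z]{3}\d{7}\b", email_body)
    return match.group(0) if match else None

def call_llm(prompt: str) -> str:
    """Placeholder for an LLM-agnostic wrapper that could dispatch to
    OpenAI, Azure OpenAI, Gemini, or Claude."""
    raise NotImplementedError("wire up the configured model provider here")

def route_task(task_type: str, payload: str):
    """Send only the work that actually needs a model to a model."""
    if task_type == "extract_policy_number":
        return extract_policy_number(payload)  # plain code, no LLM
    if task_type == "summarize_claim":
        return call_llm(f"Summarize this claim email:\n{payload}")
    raise ValueError(f"unknown task: {task_type}")
```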
“The proof of concept didn’t have the security guardrails and the customer experience niceties,” Ours says. “We’re working on adding that in. The goal would be to do a phased pilot. We want to make sure everything holds up like we expect.”
When the system fully understands what’s coming in, what needs to be done with it, and where it needs to go, it’ll function autonomously, Ours says. “If at any point it doesn’t understand something or can’t find the right record, it gets kicked to the manual review.”
There will also be tools in place to capture precision and accuracy metrics and to help guard against drift. The system is expected to be in production in Q4 this year, along with job training and other change management.
“AI is as disruptive as the industrial revolution was to agricultural societies,” says Ours. “We shouldn’t implement it without putting in some tools, or people are going to be resistant and you’ll get suboptimal results.”
Atlantic Health System, one of the largest non-profit health care networks in New Jersey, has also begun building a framework for leveraging agentic AI as part of its automation strategy.
“For now, we’re building workflows using retrieval augmented generation,” says Sunil Dadlani, the company’s EVP and chief information and digital transformation officer. This allows for LLM queries to be enriched with relevant context.
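Dadlani didn’t share implementation details, but the basic RAG pattern is compact. In this sketch, retriever and llm are placeholder callables standing in for whatever search index and model a team uses:

```python
def answer_with_rag(question, retriever, llm):
    """Minimal RAG: fetch relevant passages, then have the model
    answer using only that retrieved context."""
    passages = retriever(question)  # e.g. a vector-store similarity search
    context = "\n\n".join(passages)
    prompt = (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    return llm(prompt)
```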
“We’re PoCing agentic AI as a way to expand out the type of workflows that can be supported in a more flexible, yet task-focused manner,” he adds.
The company is also exploring the possibility of using agentic AI in IT service management (ITSM).
“We see tremendous opportunity in both leveraging the technology to improve our internal IT processes, and acting as a proving ground to build confidence in the technology for both us and the business,” he says.
To make this happen, Atlantic Health uses its own internal, digital enablement platform, and is exploring LangChain to orchestrate the flow of data between the LLMs in conjunction with Amazon Bedrock. This could be expanded into the basis of the company’s agentic AI framework.
“We’re also leveraging Dialogflow with the Google Cloud platform, and starting conversations on how we could leverage the Microsoft Bot Framework,” he says.
The cost of progress
Agentic AI offers many potential benefits in healthcare, but also introduces significant risks that must be carefully managed. Atlantic Health already has a framework in place to keep its gen AI safe, including data practices, robust security, human oversight, and transparent governance, combined with continuous monitoring, testing, compliance with legal frameworks, and accountability structures. But agentic AI systems are designed to operate with a certain level of autonomy, he says.
“Patient safety is also at risk if AI struggles with complex cases, or delays critical interventions,” he says.
Even large technology companies are, for the most part, still far from significant adoption. “We believe that multimodal agentic AI is the future,” says Caiming Xiong, VP of AI research and applied AI at Salesforce.
Multimodal AI means the AI powering the agents can handle more than just text. There are already gen AI platforms that can handle images, audio, and even video. “There’s a lot of information that you can’t just describe in text,” he says.
But multiagent AI systems are still in the experimental stages, or used in very limited ways. One internal use case of agentic AI at Salesforce is for software development. “We use our own model, and the agentic framework we built ourselves,” Xiong says. “I don’t think we’re going to replace our developers, though. You can’t completely trust the code developed by AI, but we’re going to improve productivity and quality, and agentic coding assistants can help our junior developers become more professional.”
There’s also an agent framework that can pull together information from different sources to answer questions, solve customer problems, or suggest next steps, he says. Here, the back-end models are OpenAI’s GPT-4 and GPT-3.5. To keep the agentic AI on the straight and narrow, Salesforce is using all the guardrails it already has in place for gen AI.
“We analyze every question and answer,” says Xiong. “We analyze the toxicity, the bias, and look for prompt injection.”
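Salesforce hasn’t published how those checks are built. A toy version of input screening, using OpenAI’s moderation endpoint for toxicity and a naive keyword heuristic for injection (real systems use trained classifiers), might look like this:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set

INJECTION_MARKERS = (
    "ignore previous instructions",
    "disregard your system prompt",
)

def screen(text: str) -> bool:
    """Return True if the text passes both checks."""
    # Toxicity screening via a hosted moderation endpoint
    flagged = client.moderations.create(input=text).results[0].flagged
    # Crude prompt-injection heuristic; illustrative only
    injected = any(m in text.lower() for m in INJECTION_MARKERS)
    return not (flagged or injected)
```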
On top of that, there are security and access controls to make sure the agents aren’t trying to get to information they shouldn’t.
“We also have guardrails in place to make sure its action is something it’s allowed to execute,” he says. “We have those components built into the agent framework, to make sure it’s not doing something wrong.”
That doesn’t mean it’s going to be perfect, however, so there’s human oversight built in. A human may be asked to confirm that a particular action should be executed, for example.
“And on the development side, we have teams to do evaluations before anything goes into production,” he says.
Any new system will have to be compliant with the guardrails that Salesforce has built, confirms Juan Perez, the company’s CIO.
“We built an entire trust layer in our generative AI solutions,” Perez says. “And we have an umbrella security practice and privacy practice that guides everything we do.” There’s also an AI council, he adds, composed of people from across the company from legal, privacy, the ethical AI use group, data people, technologists, and business users.
Meanwhile, there’s also the possibility that multi-agent systems might actually be safer than single monolithic gen AI models, says Xiong.
“If you only have a single system and it gets hacked, it’d be a huge disaster for a company,” he says. “But if you have a hundred or a thousand agents and one agent is hacked, it’s fine.”
Plus, each agent can be optimized for its specific tasks. If a single LLM is optimized for one purpose, performance might suffer in other areas, but with multiple agents, each task can be isolated and improved.
Agents in production
Most companies deploying AI agents don’t do it as part of a complete end-to-end agentic AI process, says Forrester analyst Craig Le Clair.
“I just talked to 30 banks and investment companies and they all said the same thing: ‘We’re not ready to give an entire process to gen AI.’” Instead, he says, enterprises are adding AI agents to existing core processes where the whole process is under the control of a traditional process agent. For example, a business process might require generating an email based on some information, and gen AI can be used to create a more customized message, with other AI agents picking up other small pieces.
The most advanced end-to-end, closed autonomous systems — the business equivalent of a self-driving car — aren’t there yet, Le Clair says. But some companies say they’re moving closer to that point. One company with agentic AI systems already in production is Indicium, a global data consultancy with headquarters in New York and Brazil. These AI agents serve both internal users and clients, says Daniel Avancini, the company’s chief data officer.
The agents are used to query and cross-reference data from a variety of sources, including Moodle, GitHub, Bitbucket, internal wikis, and the company’s Snowflake data warehouse. They use gen AI to interpret complex questions and identify the most relevant data sources.
“For example, one of our agents can pull information from our internal wiki, cross-reference it with data from our code repositories, and then validate it against our analytics data to provide comprehensive answers to business queries,” he says. “Some of our more advanced agents can actually construct solutions based on existing processes.”
For example, one agent can create a directed acyclic graph in Airflow based on a description of data pipeline needs, which involves complex, multi-step tasks.
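Indicium didn’t share generated output, but the sort of Airflow DAG such an agent might emit from a description like “extract daily orders, transform them, load to the warehouse” could look like this; the task bodies are placeholders, and the schedule argument assumes Airflow 2.4 or later:

```python
from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator

def extract(): ...    # pull the day's orders
def transform(): ...  # clean and reshape them
def load(): ...       # write to the warehouse

with DAG("daily_orders", start_date=datetime(2024, 1, 1),
         schedule="@daily", catchup=False) as dag:
    t1 = PythonOperator(task_id="extract", python_callable=extract)
    t2 = PythonOperator(task_id="transform", python_callable=transform)
    t3 = PythonOperator(task_id="load", python_callable=load)
    t1 >> t2 >> t3  # directed acyclic graph: extract -> transform -> load
```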
Other agents are still in pilot phases, including one that can analyze code repositories and suggest optimizations based on best practices and historical performance data.
The primary framework for building these agents is LangChain, says Avancini, and Indicium uses its LangGraph component, which offers granular control over agents’ decision-making processes.
“We might create a graph where the agent first analyzes the user’s query, then decides which data sources to consult, executes the necessary queries, and finally synthesizes the information into a coherent response,” he says. “At each step, we can implement decision points and fallback options.”
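That flow maps naturally onto LangGraph’s state-machine API. A minimal sketch, with hypothetical node logic and a trivial routing rule:

```python
from typing import TypedDict
from langgraph.graph import StateGraph, END

class AgentState(TypedDict):
    query: str
    source: str
    answer: str

def analyze(state):           # first, inspect the user's query
    return {"query": state["query"].strip()}

def choose_source(state):     # then decide which data source to consult
    return {"source": "wiki" if "policy" in state["query"] else "warehouse"}

def fetch_and_answer(state):  # finally, query it and synthesize a response
    return {"answer": f"(answer assembled from the {state['source']})"}

graph = StateGraph(AgentState)
graph.add_node("analyze", analyze)
graph.add_node("choose_source", choose_source)
graph.add_node("fetch_and_answer", fetch_and_answer)
graph.set_entry_point("analyze")
graph.add_edge("analyze", "choose_source")
graph.add_edge("choose_source", "fetch_and_answer")
graph.add_edge("fetch_and_answer", END)

app = graph.compile()
print(app.invoke({"query": "What is our data retention policy?"}))
```

In a production graph, choose_source would likely be a conditional edge with fallback branches, which is where the decision points Avancini mentions would live.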
For powering these agents, Avancini says, OpenAI and Anthropic models are preferred, but the deployment strategy is cloud-agnostic. “We can deploy our agents on AWS, Azure, or GCP, depending on specific project requirements,” he says. “We access these models either directly through their APIs or via cloud services like AWS Bedrock or Azure OpenAI.”
For monitoring and observability, the company uses LangChain’s LangSmith, which allows Indicium to track performance, identify bottlenecks, and quickly iterate.
“In some cases, particularly for rapid prototyping or when working with less technical stakeholders, we employ visual development tools,” says Avancini. “Azure AI Studio, for instance, allows us to assemble agents visually and then export the results as code. This can be particularly useful when we’re exploring new agent architectures or demonstrating concepts to clients.”
To handle the memory requirements for the agent systems, Indicium uses vector databases, such as Pinecone.
“These databases allow us to efficiently store and query large amounts of unstructured data, which is essential for many of our AI applications,” he says. “For handling unstructured documents, we use tools like LlamaParse to help us extract meaningful information from a variety of document formats.”
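As a sketch of that memory pattern, storing and querying embeddings with Pinecone’s Python SDK might look like this; the index name is hypothetical and embed() stands in for a real embedding model:

```python
from pinecone import Pinecone

def embed(text: str) -> list[float]:
    """Placeholder: in practice an embedding model produces these vectors."""
    raise NotImplementedError

pc = Pinecone(api_key="...")      # assumes a Pinecone account and API key
index = pc.Index("agent-memory")  # hypothetical index name

# Store an embedded document chunk as agent "memory"
index.upsert(vectors=[{
    "id": "doc-1",
    "values": embed("Q3 pipeline runbook"),
    "metadata": {"source": "wiki"},
}])

# Later, retrieve the chunks most similar to a new question
hits = index.query(vector=embed("How do I rerun the Q3 pipeline?"),
                   top_k=3, include_metadata=True)
```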
Indicium has also built custom connectors for popular messaging platforms, so the agents can better interact with users.