The next evolution of AI has arrived, and it’s agentic. AI agents are powered by the same AI systems as chatbots, but can take independent action, collaborate to achieve bigger objectives, and take over entire business workflows. The technology is relatively new, but all the major players are already on board.
In October, Microsoft announced that 100,000 organizations including Standard Bank, Thomson Reuters, Virgin Money, and Zurich Insurance are using Copilot Studio, double the number just months earlier. Copilot Studio allows enterprises to build autonomous agents, as well as other agents that connect CRM systems, HR systems, and other enterprise platforms to Copilot.
Throughout late 2024, Microsoft continued to expand its agentic offerings with purpose-built agents for specific use cases. Then in November, the company revealed its Azure AI Agent Service, a fully managed service that lets enterprises build, deploy, and scale agents quickly. And on AWS, Amazon Bedrock Agents have been available since 2023, but in December, Amazon added multi-agent collaboration capabilities.
Major enterprise software vendors are also getting into the agent game. Salesforce came out with Agentforce in October, and Agentforce 2.0 followed a couple of months later. The upgrade includes a library of pre-built skills and workflow integrations, support for Slack, and better reasoning abilities.
Before that, though, ServiceNow announced its AI Agents offering in September, with the first use cases for customer service management and IT service management, available in November.
There are also pure-play agentic AI platform providers such as CrewAI and intelligent automation providers like UiPath. And that’s just the beginning. In a report released in early January, Accenture predicts that AI agents will replace people as the primary users of most enterprise systems by 2030. And a January KPMG survey of 100 senior executives at large enterprises found that 12% of companies are already deploying AI agents, 37% are in pilot stages, and another 51% are exploring their use. But it’s not all smooth sailing, since gen AI itself isn’t anywhere near perfect.
“There are risks around hallucinations and bias,” says Arnab Chakraborty, chief responsible AI officer at Accenture. “So it’s not just about the use case, but about having the guardrails.” Agents also can be difficult to build and expensive to deploy at scale.
Still, enterprises are already reporting success deploying AI agents for several use cases.
1. Software development and IT
Cognition released Devin, billed as the world’s first AI software engineer, in March last year. At the time, the best AIs couldn’t pass the 5% mark on SWE-bench, a challenging benchmark designed to see how well AI can solve real-world coding problems. Devin scored nearly 14%. By August, agentic AI systems approached 40% and today, they’ve passed the 60% milestone.
Meanwhile, in December, OpenAI’s new o3 model, an agentic model not yet available to the public, scored 72% on the same test. According to a Capgemini survey released in mid-2024, 60% of executives at large companies say that AI agents will handle most of the coding in enterprises within three to five years.
But some jobs within the software development lifecycle are already poised to benefit from AI agents.
“We’ve developed our own agentic AI for code management,” says Charles Clancy, CTO at Mitre. “The best use case that seems to work well is in repository management, where it’ll go through and do bug fixes of code repositories.”
For example, he says, 10-year-old source code might no longer compile properly on a modern computer.
“The AI agent will download it, try to build it, and if it doesn’t run, it’ll fix the build scripts and code if necessary, check the code back into the repository, and flag it was done by an AI agent,” he says.
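The loop Clancy describes (build, fix on failure, check in, flag the change) could be sketched roughly as follows. This is only an illustration, not Mitre’s system: the function names, the `[ai-agent]` commit tag, and the retry limit are all assumptions, and the build and fix steps are passed in as plain callables rather than real compiler or LLM calls.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class RepairResult:
    built: bool                # did the code eventually build?
    attempts: int              # number of build attempts made
    commit_message: str = ""   # non-empty only if the agent changed something
    log: List[str] = field(default_factory=list)


def repair_repository(
    build: Callable[[], Tuple[bool, str]],
    propose_fix: Callable[[str], None],
    max_attempts: int = 3,
) -> RepairResult:
    """Try to build; on failure, apply a fix and rebuild (hypothetical sketch).

    build() returns (ok, output). propose_fix(output) is assumed to patch the
    build scripts or code in place, for example by calling a model.
    """
    log: List[str] = []
    for attempt in range(1, max_attempts + 1):
        ok, output = build()
        log.append(output)
        if ok:
            # Flag the commit only when the agent actually modified the code.
            message = "[ai-agent] Fix build" if attempt > 1 else ""
            return RepairResult(True, attempt, message, log)
        if attempt < max_attempts:
            propose_fix(output)
    # Give up and leave the repository for human review.
    return RepairResult(False, max_attempts, "", log)
```

Capping the number of attempts and returning a failed result keeps a human in the loop when the agent cannot get the build working.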
Mitre had to create its own system, Clancy added, because most of the existing tools use vendor-managed cloud infrastructure for the AI inference part. “We can’t do that for security reasons,” he says.
There’s also a separate research project that’s looking at 50-year-old mainframe code, and using AI to extract the business logic and rewrite it for a cloud-native framework, he says.
“Our goal is to modernize complex, mission-critical legacy IT systems in all government organizations,” he says.
There are millions of lines of code in these systems, which are written in COBOL, MUMPS, or even Assembly language tied to original hardware. “We’re developing our own AI models customized to improve code understanding on rare platforms,” he adds.
Mitre has also tested dozens of commercial AI models in a secure Mitre-managed cloud environment with AWS Bedrock. So far, over half a million lines of code have been processed, but human supervision is required due to the risk of hallucinations and other quality problems.
“We’ve also found that agentic AI can work with tools developed for software engineers to dramatically increase the success rate of validating and compiling code,” Clancy says. That offers potential pathways to train new AI to reduce the need for supervision. “Even accounting for necessary human oversight, the process is moving faster every day.”
In December, Langbase released a state of AI agents report, based on over 3,400 responses from executives and technology professionals. The top use case for AI agents was software development, cited by 87% of respondents. In addition, 48% say they’re using LLMs in IT and operations.
2. Automation and productivity
Since AI agents can touch many systems, workflow automation and productivity are top use cases for enterprises. According to the KPMG report, administrative duties were the main use case for AI agents, cited by 60% of respondents. Take Avantia, for example, a global law firm, which uses both commercial and open source gen AI to power its agents. “The key challenge in our area is there are hundreds of tasks that might not be particularly well automated,” says CTO Paul Gaskell. “And they don’t lend themselves well to an SaaS solution. There are too many separate tasks in too many places.”
Now with Microsoft, AI agents can act as companions that sit inside Word or Outlook, ready to carry out tasks.
“If a customer asks us to do a transaction or workflow, and Outlook or Word is open, the AI agent can access all the company data,” he says. “And because these are our lawyers working on our documents, we have a historical record of what they typically do.”
The business benefit is that attorneys can get through the contracting process faster, respond to customers faster, and transact faster than anyone else.
Gaskell expects to see up to 45% improvement in margins by mid-2025. “We’ve done time and motion studies of what we’ve done already,” he says. “I find it difficult to see how this wouldn’t be the future of the professional services industry.”
Gaskell says his company is LLM agnostic, meaning the AI agents can be powered by different LLMs, depending on which one’s the best fit. That includes a couple of the major open source models, he says, because they offer privacy, cost advantages, and lower latency. The AI agents currently run in a hyperscaler, but the company is considering investing in its own GPUs and renting space in a colocation facility to reduce costs further.
Another company using agents to automate business processes is SS&C, a financial services and healthcare technology company.
“We get a lot of documents from 20,000 customers, in all sorts of formats,” says Brian Halpin, the company’s senior managing director of automation. These can be PDFs, digital forms, or emails, and key information can be located anywhere and presented in different ways. That adds up to millions of documents a month that need to be processed. “The ability to understand the context of a document is fundamental,” he adds. In the past, this has been what hindered automation the most, and it is where gen AI can help.
“So, today, we have 20 production use cases around documents with AI agents,” says Halpin. “That’s been positive and powerful.” The data is kept in a private cloud for security, and the LLM is internally hosted as well. SS&C uses Meta’s Llama as well as other models, says Halpin.
The system went into production in mid-2024, and processed 50,000 documents in November. “And we’ll keep ramping that up,” he says.
With traditional automation, humans had to look at almost every document, he says. With AI, that percentage is flipped. For example, with the loan document types, the automated percentage is in the low 90s, with only a few percent of documents needing manual review.
3. Customer service and support
At Dun & Bradstreet, AI agents help customers interact with the information the research company collects on 500 million of the world’s businesses.
“We serve 95% of the Fortune 500, who use our data to make some of their most critical decisions,” says Gary Kotovets, the company’s chief data and analytics officer. That includes credit decisions and supply chain decisions, he says. And the data is also used for sales and marketing.
“For us, agents are essential to interacting with our data,” he says. “They allow clients to ask a question related to a company and an AI agent will ensure the data is the most accurate information related to that company.” This isn’t always easy because many companies have similar names and addresses. “That’s where agents come in. Our agent says, ‘Let me make sure this company is the actual company they’re asking about.’ They’re able to understand the questions being asked.”
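The disambiguation step Kotovets describes, checking that a name points to exactly one company before answering, might look something like the sketch below. It is not Dun & Bradstreet’s method: the record fields, the suffix list, and the `resolve_company` function are assumptions made for illustration, and a real system would use far richer matching.

```python
import re
from typing import Dict, List, Optional

# Assumed list of legal suffixes to ignore when comparing names.
SUFFIXES = {"inc", "corp", "corporation", "co", "ltd", "llc"}


def normalize(name: str) -> str:
    """Lowercase, strip punctuation, and drop common legal suffixes."""
    words = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    return " ".join(w for w in words if w not in SUFFIXES)


def resolve_company(
    query_name: str,
    records: List[Dict[str, str]],
    city: Optional[str] = None,
) -> Dict[str, object]:
    """Return a single record, or report that the name is ambiguous or unknown."""
    key = normalize(query_name)
    matches = [r for r in records if normalize(r["name"]) == key]
    if city is not None:
        matches = [r for r in matches if r["city"].lower() == city.lower()]
    if len(matches) == 1:
        return {"status": "resolved", "record": matches[0]}
    if not matches:
        return {"status": "not_found", "candidates": []}
    # More than one plausible company: ask the user to narrow it down.
    return {"status": "ambiguous", "candidates": matches}
```

Returning an explicit "ambiguous" status lets the agent ask a follow-up question instead of silently picking one of several similarly named companies.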
4. Content creation
Writing text and creating images were two of the first popular use cases for gen AI. Now, AI agents can turbocharge the content creation process. According to the Langbase survey, text generation and summarization was the second most popular use case, cited by 59% of respondents, followed by marketing and communications at 50%. And EY uses AI agents in its third-party risk management service.
“You hire us to evaluate some vendor you bring on board,” says Sinclair Schuller, principal at EY. “Our risk assessors do that work, spending up to 50 hours on one vendor, poring over contracts and other documents to produce a report that calls out risks we observe.”
That’s the way it was normally done, until gen AI came along.
“Now we can feed AI all the contract and public documentation, and it can spin out a report in minutes instead of days with tremendous accuracy and detail,” he says. Then human experts enhance those reports. “AI plus human expertise is a tremendous boost in quality,” he says.
Now, with agentic AI, the process is changing yet again.
“We’ll be releasing an agent-driven version of this process, where it’ll be a continuous monitoring of vendors, which was previously not possible,” he says.
This is something that companies often miss when they think about AI agents, he says. “A lot of people have focused on the optimization use cases,” he says. “But the real value is this expansion of the market, and expansion of revenue opportunities.”
5. HR and employee support
Another relatively low-risk, high-value use case for AI agents is answering employee questions and handling simple tasks on their behalf. A January IBM survey on gen AI development, in fact, concluded that 43% of companies use AI agents for HR.
Indicium, a global data services company, began deploying AI agents in mid-2024, for example, when the technology started to mature.
“You’d start seeing off-the-shelf applications — both open source and proprietary — that made it easier to build them,” says Daniel Avancini, the company’s CDO.
The agents are used to make things easier for HR, he says, including tasks such as internal knowledge retrieval, tagging, and documenting, as well as other business processes. Each agent is like a microservice, specializing in one particular thing. “And they all talk to each other in a multi-agent system,” he says. And these prompt-based conversations can get peculiar. The tricky thing is there’s a possibility of hallucinations and all the other problems that come with gen AI. “So there’s a lot of tweaking of the model so they don’t do the wrong thing or access the wrong information,” he says.
On the positive side, the AI agents can handle a lot of questions autonomously, so there’s a business benefit there. “And we’re finding things that aren’t correctly documented, so it helps us make the processes better,” he adds.
Trust but verify
Safety was a cornerstone of AI agent development from day one. In fact, one of the first agentic frameworks was BabyAGI, released in early 2023, which combined ChatGPT with a Pinecone vector database for memory, and LangChain for orchestration. The developer who created it jokingly asked it to create as many paperclips as possible — a reference to a hypothetical paperclip apocalypse caused by an unchecked AI — and the system immediately recognized the potential for problems and started by first generating a safety protocol for itself. But most agentic AI developers aren’t willing to put that much faith in the AI.
In a November LangChain survey of over 1,300 professionals, 55% of respondents said that tracing and observability tools are a must-have control for AI agents, helping them get visibility into agent behaviors and performance. In addition, 44% had guardrails in place, and 40% used off-line evaluation.
“AI models are risky and make all kinds of mistakes,” says Virginia Dignum, chair of the technology policy council at the Association for Computing Machinery, and professor at Sweden’s Umeå University.
But it’s possible to create systems to catch mistakes, she says, so if an agent isn’t able to accomplish a task, it would admit it failed instead of trying to make something up.
“There’s a lot of research on this and it’s there in theory,” she says. “But as far as I know, there’s not really a suitable agentic interface out there. And once you start to develop these systems, you’ll need to deal with the consequences and what happens if one of them does the wrong thing.”
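The idea of an agent that admits failure instead of making something up can be illustrated with a small wrapper. This is a minimal sketch under stated assumptions, not an existing framework’s API: `generate` stands in for a model call, `validate` for whatever check the enterprise trusts, and the names and retry count are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class AgentOutcome:
    status: str                    # "ok" or "failed"
    answer: Optional[str] = None   # set only when validation passed
    reason: Optional[str] = None   # set only when the agent gave up


def run_with_validation(
    task: str,
    generate: Callable[[str], str],
    validate: Callable[[str], Tuple[bool, str]],
    max_tries: int = 2,
) -> AgentOutcome:
    """Return an answer only if it passes validation; otherwise report failure."""
    last_reason = "no attempt made"
    for _ in range(max_tries):
        candidate = generate(task)
        ok, reason = validate(candidate)
        if ok:
            return AgentOutcome("ok", answer=candidate)
        last_reason = reason
    # No unvalidated answer is ever returned to the caller.
    return AgentOutcome("failed", reason=last_reason)
```

The design choice here is that the failure path carries a reason but no answer, so downstream systems cannot mistake an unchecked guess for a result.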
That means there’s a need for governance and regulation. And agentic frameworks don’t just need to deal with the practical and business implications of possible AI mistakes, but legal implications as well.
“If those aren’t solved, then I don’t think there’ll be much use for enterprise agents,” she says.
Then there’s one more risk that enterprises need to deal with when deploying AI agents: disruption and negative outcomes caused by the scale of AI-powered automation that AI agents make possible. The change management process is very important when deploying these systems, says Pushpa Ramachandran, VP and global head AI at Wipro Technologies. “This is where I see a lot of customers take a bit more time,” he says. And taking the extra time up front means the company can go farther in the long run. “The ones who are thoughtful about the change management process can scale faster,” he says.