Tiatra, LLC
Information Technology Solutions for Washington, DC Government Agencies

Salesforce wants your AI agents to achieve ‘enterprise general intelligence’

Salesforce AI Research today unveiled new benchmarks, guardrails, and models aimed at enhancing agentic AI in the enterprise.

The goal, said Silvio Savarese, EVP and chief scientist of Salesforce Research, is achieving enterprise general intelligence (EGI), which he defined as business-optimized AI capable of delivering reliable performance across complex business scenarios while maintaining seamless integration with existing systems.

“An agent is not just an LLM,” Savarese said in a roundtable discussion on Tuesday. “An agent is actually a complex system with four components: a memory, a brain, an actuator, and an interface.”

As Savarese explained, memory enables agents to be persistent, facilitating their ability to retrieve useful information, such as best practices, policies, specific customer information, and previous conversations. The “brain” represents the agent’s ability to reason, plan actions, and orchestrate flows. The actuator, or function calls, allows the agent to execute actions planned by the brain. And the interface is how agents connect with humans through language, audio, video, and other modalities.
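The four-part description above can be pictured as a small data structure. The following is only an illustrative sketch, not Salesforce's design: the class name `Agent`, its fields, and the keyword-matching `plan` method are all hypothetical, and a trivial string match stands in for the reasoning a real model would do.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Agent:
    # Memory: past requests the agent can retrieve later (hypothetical field).
    memory: List[str] = field(default_factory=list)
    # Actuator: named functions the agent is allowed to call (hypothetical field).
    actions: Dict[str, Callable[[str], str]] = field(default_factory=dict)

    def plan(self, request: str) -> str:
        """Brain: choose an action name. A keyword match stands in for a model."""
        for name in self.actions:
            if name in request.lower():
                return name
        return "none"

    def handle(self, request: str) -> str:
        """Interface: accept text from a human and return text."""
        action = self.plan(request)
        self.memory.append(request)  # persist the interaction
        if action == "none":
            return "No matching action."
        return self.actions[action](request)


agent = Agent(actions={"inventory": lambda req: "Inventory check complete."})
reply = agent.handle("Please run an inventory check")
```

Keeping the planner (`plan`) separate from the callable actions mirrors the split Savarese draws between the brain and the actuator.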

“The brain and actuator go hand-in-hand,” Savarese said. “We are planning to power those using large action models (xLAMs). Large action models are specialized LLMs that have been explicitly trained to act and adjust their behavior to take into account that these actions are taken to environments.”

Savarese stressed that Salesforce views autonomous agents as “force multipliers” for humans rather than replacements.

“This is about having a human deploy or dispatch a group of agents based on specific goals or tasks,” he said. “For instance, you have a service representative that has available a fleet of agents that can do inventory check, can do account summary, billing summary, customer interaction summaries.”

He noted that there is another, more complex scenario in which a human employee has a personal AI assistant as their chief of staff, a sort of “orchestrator agent” to manage the fleet of agents.

“These AI systems will know my preferences as a service representative,” he said. “They’ll know my style, what kind of customers and needs I have. They’ll help me orchestrate this fleet of agents.”

Benchmarking jagged intelligence

One sticking point to fully leveraging autonomous AI agents involves what Salesforce calls “jaggedness” or “jagged intelligence,” in which AI systems that can excel at complex tasks unexpectedly fail at simpler ones that humans can reliably solve.

Salesforce AI Research has created an initial dataset of 225 basic reasoning questions that it calls SIMPLE (Simple, Intuitive, Minimal, Problem-solving Logical Evaluation) to evaluate and benchmark the jaggedness of models. Here’s a sample question from SIMPLE:

A man has to get a fox, a chicken, and a sack of corn across a river. He has a rowboat, and it can only carry him and three other things. If the fox and the chicken are left together without the man, the fox will eat the chicken. If the chicken and the corn are left together without the man, the chicken will eat the corn. How does the man do it in the minimum number of steps?

This looks like a classic logic puzzle, except for one altered constraint. In the classic puzzle, the rowboat can only carry the man and one additional thing, requiring a complex sequence of crossings to get the fox, chicken, and sack of corn all safely across the river. The SIMPLE version stipulates that the rowboat can carry the man and three other things, meaning the man can bring all three across the river in a single crossing.

Yet state-of-the-art reasoning models such as ChatGPT-o1 and ChatGPT-o3-mini-high both regurgitate the classic seven-step solution to the puzzle without taking into account the altered constraint.
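The effect of the altered constraint can be checked by brute force. The sketch below is not part of the SIMPLE benchmark; it is a plain breadth-first search over puzzle states, with the function name `min_crossings` and the state encoding chosen here for illustration. It counts the fewest crossings for a given boat capacity.

```python
from collections import deque
from itertools import combinations

ITEMS = frozenset({"fox", "chicken", "corn"})
# Pairs that must not be left together without the man.
UNSAFE = [{"fox", "chicken"}, {"chicken", "corn"}]


def is_safe(bank):
    """A bank without the man is safe if it holds no unsafe pair."""
    return not any(pair <= bank for pair in UNSAFE)


def min_crossings(capacity):
    """Fewest crossings when the boat carries the man plus `capacity` items."""
    start = (ITEMS, True)          # (items on starting bank, man on starting bank?)
    goal = (frozenset(), False)    # everything, including the man, on the far bank
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        (left, man_left), steps = queue.popleft()
        if (left, man_left) == goal:
            return steps
        here = left if man_left else ITEMS - left  # items on the man's bank
        for k in range(min(capacity, len(here)) + 1):
            for cargo in combinations(sorted(here), k):
                cargo = frozenset(cargo)
                new_left = left - cargo if man_left else left | cargo
                # The bank the man just departed is now unattended.
                unattended = new_left if man_left else ITEMS - new_left
                state = (new_left, not man_left)
                if is_safe(unattended) and state not in seen:
                    seen.add(state)
                    queue.append((state, steps + 1))
    return None  # no valid sequence of crossings exists


classic = min_crossings(1)   # classic puzzle: man plus one item
altered = min_crossings(3)   # SIMPLE variant: man plus three items
```

With a capacity of one the search returns the classic seven crossings; with a capacity of three it returns a single crossing, which is the answer the sample question is looking for.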

Jaggedness is why “autonomous” agents often require human oversight. Savarese noted that solving the jaggedness issue is especially important for enterprise AI applications, where many problems require human context and reliability more than they require sophisticated math-solving abilities. If a model stumbles in executing tasks in the enterprise, it can mean disrupted operations, eroded customer trust, and potentially financial or reputational damage.

The capability-consistency matrix

Much of the work on enterprise AI, and generative AI in particular, has focused on enhancing AI’s capabilities: its ability to navigate complex business environments, interface with multiple technology systems, reason through business rules, and deliver value aligned with business goals. But Savarese argued that consistency is just as important: the delivery of reliable, predictable results with seamless integration into existing systems and rigorous adherence to governance frameworks. In other words, consistent AI minimizes jaggedness.

Salesforce uses what Savarese calls the “Capability-Consistency Matrix” to describe AI agents. The matrix has capability as its x-axis and consistency as its y-axis, creating four quadrants:

  • The generalist (low capability, low consistency): These systems neither perform complex tasks nor deliver reliable results. They are typically early-stage AI implementations with limited business value that represent stepping stones rather than solutions.
  • The prodigy (high capability, low consistency): These systems perform impressive, complex tasks but deliver inconsistent results. By occasionally missing the mark, they erode trust, because users can’t depend on them to deliver accurate results for mission-critical functions.
  • The workhorse (low capability, high consistency): These systems perform a narrow range of simple tasks well but can’t handle complex situations.
  • The champion (high capability, high consistency): This is the goal for EGI. These systems can handle complex business scenarios flawlessly while delivering consistent, reliable results.
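The four quadrants amount to a two-way threshold test. As a minimal sketch, assume capability and consistency are each scored from 0 to 1 and that 0.5 separates "low" from "high". Both the scoring scale and the threshold are assumptions made here for illustration; the article does not say how Salesforce measures either axis.

```python
def quadrant(capability: float, consistency: float, threshold: float = 0.5) -> str:
    """Map two scores to a quadrant name from the Capability-Consistency Matrix.

    The 0-to-1 scale and the default threshold are illustrative assumptions.
    """
    high_capability = capability >= threshold
    high_consistency = consistency >= threshold
    if high_capability and high_consistency:
        return "champion"
    if high_capability:
        return "prodigy"
    if high_consistency:
        return "workhorse"
    return "generalist"


labels = [
    quadrant(0.9, 0.9),
    quadrant(0.9, 0.2),
    quadrant(0.2, 0.9),
    quadrant(0.2, 0.2),
]
```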

While prodigies might work for consumer applications, EGI requires champions. In business contexts, AI agents must be both capable and consistent to deliver value.

The enterprise general intelligence journey

According to Savarese, the road to EGI involves three distinct phases:

  1. Pre-training: EGI systems must first go through a pre-training phase to create a foundation of general capabilities such as language understanding, pattern recognition, and basic reasoning. This is the frontier model stage.
  2. Fine-tuning: An EGI system must then undergo fine-tuning for specific industry contexts and business functions. Fine-tuning might help an EGI system specialize in financial regulations, supply chain terminology, or healthcare protocols, for example.
  3. Ultra fine-tuning: This phase is about further specializing an EGI system within your specific organizational context.

“This evolution isn’t just about creating a single ‘general’ system that does everything,” Savarese wrote in a blog post today. “Just as sports have specialized variants (singles tennis, doubles tennis, squash, and the ever-popular pickleball!), enterprises will likely deploy multiple specialized agents rather than a single, general-purpose system, with each agent reaching ‘championship level’ performance in its specific domain. Different types of businesses and use cases may require different specialized agents — much like how various sports require different skill sets from their athletes.”

An EGI readiness framework

To successfully harness EGI, organizations must consider the journey as a comprehensive business transformation rather than a technology implementation, Savarese said. To help organizations achieve EGI, Salesforce AI Research has created the EGI Readiness Framework:

1. Integrated infrastructure. EGI depends on more than just the models. Its foundation is a set of interconnected components:

  • Components that store, retrieve, and process information intelligently, like retrieval-augmented generation (RAG). 
  • Interface systems that connect agents to users and other enterprise systems. 
  • Action systems and actuators that translate decisions into operations through APIs, workflow automation, physical systems, etc. 
  • Data architecture that provides well-structured, contextualized data repositories. 

2. Risk governance. EGI requires guardrails that define appropriate autonomy levels across business functions. Savarese noted that sophisticated organizations are moving beyond binary “human-in-the-loop” models to “human-at-the-helm” frameworks in which oversight intensity varies based on context, confidence, and consequence. 
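One way to picture oversight that varies with confidence and consequence is a small policy function. This is a hypothetical sketch only: the function name `oversight_level`, the three oversight tiers, the consequence labels, and the 0.8 confidence cutoff are all invented here and are not taken from Salesforce's framework.

```python
def oversight_level(confidence: float, consequence: str) -> str:
    """Pick an oversight tier for one agent action.

    `confidence` is the agent's self-reported confidence from 0 to 1 and
    `consequence` is "low" or "high". Tiers and cutoff are assumptions.
    """
    if consequence == "high":
        # High-stakes actions always wait for a person, whatever the confidence.
        return "human approves before action"
    if confidence < 0.8:
        # Low-stakes but uncertain: act, then queue the result for review.
        return "human reviews after action"
    return "agent acts autonomously"


refund = oversight_level(confidence=0.95, consequence="high")
summary = oversight_level(confidence=0.95, consequence="low")
unsure = oversight_level(confidence=0.40, consequence="low")
```

The point of the sketch is that the decision is graded rather than binary: the same agent can act alone on one task and wait for approval on another.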

3. Skills development. EGI necessitates training employees to collaborate effectively with AI systems and to develop an understanding of AI capabilities and appropriate use cases. Successful organizations will build cross-functional teams that combine domain expertise with AI literacy, and they will establish feedback mechanisms for continuous system improvement.

Additional tools for achieving EGI

As part of its goal of helping customers realize EGI, Salesforce AI Research also announced:

  • An upgrade to action model capabilities. The organization has upgraded the xLAM family with multi-turn conversation support and a wider range of smaller models for increased accessibility. This family of models is designed to predict actions.
  • A multimodal action model family for multi-step problem solving. TACO is a new multimodal action model family that generates chains of thought-and-action (CoTA) to break tasks down into simple steps while integrating real-time action.
  • Enhanced embedding model capabilities. Several days ago, Salesforce AI Research unveiled SFR-Embedding, an advanced text-embedding model that can convert text to structured data for better AI information retrieval. SFR-Embedding will soon be available in Salesforce Data Cloud.
  • Specialized code embedding models for developers. SFR-Embedding-Code is a specialized code embedding model family based on SFR-Embedding. It can map code and text to a shared space for high-quality code search.
  • A framework for testing and evaluating AI agents. CRMArena is a novel benchmarking framework leveraging CRM scenarios.
  • Agent guardrail features. SFR-Guard is a new family of guardrails trained on publicly available data and CRM-specialized internal data to enhance the trust and reliability of AI agents.
  • A benchmark for assessing models in contextual settings. ContextualJudgeBench is a new benchmark for evaluating LLM-based judge models in context. It assesses accuracy, conciseness, faithfulness, and appropriate refusal to answer by testing more than 2,000 challenging response pairs.



Category: News
May 1, 2025
