Tiatra, LLC
Information Technology Solutions for Washington, DC Government Agencies
Who’s the real boss of your AI?

At the core of any proprietary AI model is an alignment problem that could have serious ramifications for CIOs.

In 2025, we are already seeing real-world fallout from gen AI models forced to choose among serving the company paying for them, the vendor that produced them, the end user, and their own hallucinated goals.

For example, an AI agent at vibe coding startup Replit deliberately violated instructions, deleted a production database, and tried to cover it up. And xAI’s Grok was found to be searching online for Elon Musk’s opinion before giving answers to certain questions.

“It’s not surprising that AI understands who created it,” says EY principal Sinclair Schuller. In fact, it would be difficult to train a model that didn’t know who it worked for, he argues. “You’d have to turn off any access to the internet and remove any hint they were created by a particular company.”

And why would an AI company bother? “These aren’t charitable organizations focused on charitable work,” he adds. “They’re organizations with the intention of creating real value in the real world. A company that doesn’t have a bias toward its own offerings won’t exist for long.”

Switching to open-source models is no panacea, either. “The problem is security,” says Karen Panetta, IEEE fellow and dean of graduate engineering at Tufts University. “If you go to a community model, there’s no vetting. You don’t know what you’re getting.”

And some of the biggest open-source models, such as China’s DeepSeek, come with their own risks of potential bias that will keep many corporate users from adopting them.

AI alignment: A growing enterprise risk

According to a 2025 SailPoint survey, 82% of companies are using AI agents, and of those, 80% say their agents did things they weren’t supposed to.

More specifically, 39% accessed unintended systems, 33% accessed inappropriate data, 31% shared inappropriate data, and 23% revealed access credentials. It’s no surprise, then, that two-thirds of respondents see AI agents as a growing security risk.

Governance frameworks and guardrails can help ensure AIs stay within specified boundaries. Still, only 44% of organizations have governance policies in place for AI agents, and only 52% are able to track and audit the data that AI agents access, according to SailPoint’s findings.

And the stakes are getting higher: A recent EY survey of 975 C-suite leaders at large enterprises found that 99% of organizations have suffered financial losses from AI-related risks, some over $1 million.

To counteract this, some large companies are putting in place continuous monitoring and incident escalation processes for unexpected agentic behaviors. Still, none of this is easy to do, says Chirag Mehta, analyst at Constellation Research. AI is a black box, he says, and it can be difficult to figure out whether a model recommends its company’s products over others, or if it has a political or regional bias, or some other problem.

“We don’t have those specific evaluations, and there are no stringent audit standards, nor requirements that you have to show the audit trail of how you trained the model,” he says. “So it’s the end users who have to be skeptical. You can’t blindly trust models to do the right thing.”

Managing AI like a human

With traditional software, computers are given explicit instructions to execute, and they do so consistently. Being probabilistic, however, AI can perform in very unexpected ways, and its reasons for doing so can go against the customer’s best interest and be hard to detect.

For example, when explaining why Grok suddenly began parroting Elon Musk, xAI said the model knew it was made by xAI; as a result, it “searches to see what xAI or Elon Musk might have said on a topic to align itself with the company.”

This bias sounds human-like in nature, and for some companies, that’s how they’re addressing the problem.

“We have to manage it almost like a person,” says Eric Johnson, CIO at PagerDuty. 

The incident response company has deployed gen AI and AI agents for internal operations and in its products and services. “I used to have a bunch of help desk people, but now I have agentic solutions answering questions on behalf of my human support agents,” Johnson says. “Now I need fewer human support agents, but I need teams to oversee the agents.”

That management job begins before AI agents are deployed, starting with prototyping, testing, and fine-tuning. “You have to correct it and make sure it’s responding the way you want it to,” he says.

Oversight continues once the agent is in production. In the case of agents used for productivity, the oversight comes from the users themselves. “There’s a very clear disclaimer since AI isn’t always accurate, and sometimes has bias,” he adds.

PagerDuty uses Abacus AI, which lets users choose from several state-of-the-art LLMs, including multiple versions of ChatGPT, Claude, Gemini, Grok, Llama, DeepSeek, and more. But if actions taken by AI have legal or financial implications, then oversight beyond what a simple productivity tool can provide is essential.

“It’s like having a new person onboarded into the company,” Johnson says. “If people constantly do what they’re supposed to, then oversight starts to reduce. But I still always check in with my team, doing a bit of ‘trust but verify’ to make sure things are where they should be. I think it’s going to be the same with these agentic solutions. If they’re operating in a consistent manner and the business processes haven’t changed, you can rely on that solution more. But it can go astray, and there can be things you didn’t expect, so there’ll always be monitoring.”

That monitoring is a joint responsibility between IT teams and the business side, he adds.

“People have to understand how to operate and manage armies of AIs and bots,” Johnson says. “Behind the scenes, the infrastructure and technology are evolving very quickly, and it’s more complicated than people give it credit for.”

Enlist an AI to catch an AI

Startup Qoob uses gen AI to expand the amount of work the eight-person company can do. For example, when LLM testing platform LangSmith wasn’t meeting Qoob’s needs, the company built its own version in a week. With AI, it took a fifth of the time it would have otherwise, says Qoob CTO Mikael Quist.

Like PagerDuty, Qoob uses multiple LLMs both as part of its products and for productivity. “We’re constantly evaluating our providers,” Quist says. “If there’s a problem, we can switch to another one.”

The key to ensuring the AI does what the company wants it to do is continuous testing and evaluation: “We run an evaluation against different providers automatically,” Quist says. “And we have fallback logic if one fails, then we choose the next-best model.”

Evaluations are run whenever a model or prompt changes, and LLMs are used as judges to check whether outputs are as expected, with ML-powered sentiment analysis thrown in. There’s also a human overseeing the process to ensure results make sense.
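Neither Quist nor the article publishes implementation details, but the pattern he describes, automatic evaluation, fallback to the next-best provider, and an LLM acting as judge, can be sketched roughly as follows. The provider names, canned outputs, and scoring rule are all hypothetical stand-ins for real API calls:

```python
# Hypothetical sketch of provider fallback with an LLM-as-judge check.
# Provider names, canned outputs, and the judge rule are illustrative only.

def call_provider(name: str, prompt: str):
    """Stand-in for a real LLM API call; returns None on provider failure."""
    canned = {
        "provider_a": None,  # simulate an outage
        "provider_b": "Paris is the capital of France.",
        "provider_c": "France's capital is Paris.",
    }
    return canned.get(name)

def judge(prompt: str, answer: str) -> float:
    """Stand-in for an LLM-as-judge scoring call, returning 0.0 to 1.0."""
    return 1.0 if "Paris" in answer else 0.0

def answer_with_fallback(prompt: str, providers: list, min_score: float = 0.8) -> str:
    # Try providers in ranked order; skip failures and low-scoring answers.
    for name in providers:
        output = call_provider(name, prompt)
        if output is None:
            continue  # provider failed; fall back to the next-best model
        if judge(prompt, output) >= min_score:
            return output
    raise RuntimeError("no provider produced an acceptable answer")

print(answer_with_fallback("What is the capital of France?",
                           ["provider_a", "provider_b", "provider_c"]))
```

In production the stubs would be real provider SDK calls and the judge would itself be an LLM prompt, but the control flow, rank, try, score, fall back, is the same.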

The company’s developers use a variety of tools such as Cursor IDE, Claude Code, and VS Code with ChatGPT or Claude. For code review, Qoob uses GitHub Copilot, OpenAI’s Codex, and Claude Code. All three providers review Qoob code to identify issues.

“We notice there are differences,” Quist says. “Then we make a decision on what we want to fix, so we have AI overseeing AI, but then humans are making the decision.”

Using multiple AI platforms, especially for important decisions, is an important strategy for reducing the risk of bias or improper alignment, says Zoey Jiang, assistant professor of business technologies at Carnegie Mellon University.

If an employee is evaluating browsers, for example, Microsoft’s AI might recommend Edge, but a different AI might not agree with that recommendation. “For important and big business decisions, I think it’s definitely worth it,” she says.

According to EY’s Schuller, this approach can be scaled up to work not just for one-off decisions but highly critical ongoing business processes.

“There are systems being developed that will dispatch the prompt to multiple LLMs at once,” he says. “And then another LLM will say which response is best.”

It’s a costly approach, though. Instead of a single query to a single model, multiple queries are necessary, including additional queries for AI models to evaluate all the other AIs’ responses.

This is a variation on the mixture-of-experts approach, except that normally the experts are all variants of an LLM from the same company, meaning they might all share the same corporate bias.
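Schuller doesn’t name a specific system, but the fan-out-and-judge pattern he describes fits in a few lines. In this sketch the model names, the toy `call_model` stub, and the length-based judge are all placeholders for real API calls:

```python
# Sketch of the fan-out-and-judge pattern: one prompt goes to several models,
# and a separate judge picks the best response. All names are placeholders.

def fan_out(prompt, models, call_model):
    """Query every model, tolerating individual provider failures."""
    responses = {}
    for model in models:
        try:
            responses[model] = call_model(model, prompt)
        except Exception:
            continue  # a failed provider simply drops out of the comparison
    return responses

def pick_best(responses, judge):
    """The judge (in practice, another LLM call) scores each candidate."""
    return max(responses, key=lambda model: judge(responses[model]))

# Toy stand-ins for real API calls:
def call_model(model, prompt):
    canned = {
        "model_x": "short answer",
        "model_y": "a longer answer that cites two sources",
    }
    return canned[model]

def judge(response):
    return len(response)  # placeholder metric; a real judge would assess quality

best = pick_best(fan_out("Compare vendors A and B.", ["model_x", "model_y"], call_model), judge)
print(best)  # model_y
```

The cost concern in the next paragraph falls straight out of this structure: every user query becomes N model calls plus at least one judging call.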

Set hard limits

One more mechanism to ensure AI alignment is to set hard limits on what data or systems the agent has access to, or what actions it can take, Jiang says.

For example, if an AI is making pricing recommendations or offering discounts to customers, perform a hard check to see whether the price is within company limits, she says.
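Jiang’s pricing example boils down to a deterministic check that runs after the model proposes a number. A minimal sketch, with a hypothetical approved price band:

```python
# Illustrative hard guardrail: a deterministic price check applied after the
# model proposes a discount. The approved band below is hypothetical.

PRICE_FLOOR, PRICE_CEILING = 80.0, 120.0  # company-approved limits

def enforce_price_limits(proposed: float) -> float:
    """Clamp an AI-proposed price into the approved band.

    The clamp is plain code, so unlike the model it cannot drift,
    hallucinate, or be talked out of the limits by a clever prompt.
    """
    return min(max(proposed, PRICE_FLOOR), PRICE_CEILING)

print(enforce_price_limits(65.0))   # 80.0: discount too deep, raised to floor
print(enforce_price_limits(99.0))   # 99.0: within limits, passed through
```

A stricter variant would reject out-of-band proposals and escalate to a human instead of clamping; either way, the decision of record never rests with the model alone.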

Hard-coded guardrails such as these don’t fall victim to the nondeterministic nature of gen AI solutions — or to humans who don’t always pay attention. The most extreme version of this is the “zero authority” approach to AI deployment.

“The chatbot can only accept input and relay outputs,” explains Chris Bennett, VP for AI and ML at Unisys. The actual course of action is chosen by a separate, secure system that uses rules-based decision-making.

Similar to this is the “least privilege” approach to data and systems access, he says.

“Access should be purposeful, not universal,” he says. “For example, a copilot should be granted access to a single email within a session, rather than be able to access the entire inbox of a user without limitations.”
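Bennett’s single-email example maps naturally onto a session-scoped grant. The sketch below is illustrative only; the class, inbox, and message ids are invented for the example and are not a Unisys API:

```python
# Illustrative least-privilege access: the copilot's session is granted one
# email id, not the whole inbox. Class, inbox, and ids are invented examples.

class SessionGrant:
    """A per-session grant naming exactly which resource may be read."""
    def __init__(self, allowed_email_id: str):
        self.allowed_email_id = allowed_email_id

INBOX = {"msg-1": "Q3 budget draft", "msg-2": "HR review notes"}

def read_email(grant: SessionGrant, email_id: str) -> str:
    # Purposeful, not universal: anything outside the grant is refused.
    if email_id != grant.allowed_email_id:
        raise PermissionError(f"session has no grant for {email_id}")
    return INBOX[email_id]

grant = SessionGrant("msg-1")
print(read_email(grant, "msg-1"))  # the granted message is readable
```

The same shape applies to any resource, not just email: the grant is issued per session, checked on every access, and expires with the session.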

All about architecture

Ultimately, the company deploying the AI should be the boss of the AI. The way to make that happen is architecture.

“CIOs paying attention to the architecture are thinking about things the right way,” says EY’s Schuller. “Architecture is where the AI game is going to be won.”

Jinsook Han, chief of strategy, corporate development, and global agentic AI at Genpact, agrees. “The question of who controls AI isn’t just philosophical,” she says. “It requires deliberate architectural choices.” That means guardrails, AI auditors, and human experts for final checks.

The boss of AI is whoever builds these systems, she adds. “I’m the owner, the one who owns the house,” she says. “I know where boundaries are and who puts fences up. I’m the one saying how much risk I’m willing to take.”



Category: News | November 7, 2025
Tags: art
