AI token freeloaders are coming for your customer support chatbot

CIOs deploying AI agents for customer service have one more thing to worry about: external users tricking the system into delivering AI computations on your dime. 

Although there are ways to lock down these systems to minimize AI token theft, they all have downsides, including the possibility of undermining the business case for these very systems.

Essentially a form of prompt injection attack, such misuse can not only increase enterprise AI bills but also make ROI visibility murkier. Moreover, it can expose enterprises to “denial of wallet” attacks, in which attackers overload costly pay-as-you-go services with excessive requests to damage the bottom line.

“This is only the tip of the iceberg of your risks. It is a potential symbol of a much bigger problem,” says Justin St-Maurice, a technical counselor at Info-Tech Research Group. A potential attacker might ask, “If they are willing to give me code, what else are they willing to do for me?”

“A normal customer service interaction of ‘Where’s my order? What are your hours?’ runs maybe 200 to 300 tokens. Someone asking the bot to reverse a linked list in Python is generating more than 2,000 tokens easy. That’s roughly a 10x cost multiplier per session,” says Nik Kale, member of the Coalition for Secure AI (CoSAI) and ACM’s AI Security (AISec) program committee.
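Kale's multiplier is easy to reproduce. The sketch below uses his token estimates with an illustrative per-token rate; the price constant is an assumption for arithmetic only, not any vendor's actual rate card:

```python
# Back-of-the-envelope cost comparison between a routine support turn
# and a code-generation request, using Kale's token estimates.
# PRICE_PER_1K_TOKENS is a hypothetical blended rate, not a real price.
PRICE_PER_1K_TOKENS = 0.01  # dollars per 1,000 tokens (assumption)

def session_cost(tokens: int, price_per_1k: float = PRICE_PER_1K_TOKENS) -> float:
    """Dollar cost of a session that consumes `tokens` tokens."""
    return tokens / 1000 * price_per_1k

support_query = session_cost(200)     # "Where's my order?"
freeloader_query = session_cost(2000) # "Reverse a linked list in Python"

print(f"support:    ${support_query:.4f}")
print(f"freeloader: ${freeloader_query:.4f}")
print(f"multiplier: {freeloader_query / support_query:.1f}x")
```

Whatever the actual rate, it cancels out of the ratio, which is why the multiplier holds at roughly 10x regardless of pricing tier.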

“And it doesn’t show up in any cost anomaly report because the system just sees it as another customer conversation,” he adds. “You could have 5% of your chatbot traffic be freeloaders running complex queries and it would blow a material hole in your AI budget that nobody can explain in a quarterly review.”

A question of judgment

Judgment is a key part of this issue, and the problem is that chatbots have little to none.

“A human has contextual judgment baked in,” Kale notes. “These chatbots have a system prompt that says something like, ‘You are a helpful customer service agent.’ That’s a suggestion, not an enforcement mechanism. It’s the AI equivalent of a velvet rope.”

He adds: “Anyone who’s spent five minutes with these tools knows you can steer past a system prompt with basic conversational framing, which is exactly what [is happening to enterprises today]. The system authenticates the session, not the intent.”

Sanchit Vir Gogia, chief analyst at Greyhound Research, sees this issue increasing — with enterprises fundamentally to blame. 

“What enterprises are witnessing is not misuse of chatbots but the unintended consequence of deploying general-purpose inference systems under the label of customer service,” he says. “These systems are architected as conversational interfaces, but economically they behave as open compute surfaces. That mismatch between purpose and design is where the problem begins.”

Gogia argues that, like many AI challenges, this issue will multiply as models advance.

“The problem will not disappear as models improve. It will intensify. As AI becomes more capable, more accessible, and more embedded, the boundary between intended and unintended usage will continue to blur,” Gogia says. “Enterprises that rely on passive controls will see costs drift. Enterprises that build active governance into their architecture will maintain control. This is the real shift under way. Gen AI is moving from experimentation to operations. And in operations, discipline matters more than capability.”

Part of that discipline includes elevating jailbreaking as a risk management priority, says cybersecurity consultant Brian Levine, executive director of FormerGov.

“You need to treat misuse as a first‑order risk, not an edge case. Build for the world where 5% of your traffic will try to jailbreak your bot, intentionally or not,” he says. “The companies that get ahead of this will keep their AI budgets predictable and their customer experience intact. The ones that don’t will be explaining mysterious cost overruns.”

AI token theft in practice

What exactly does this kind of chatbot misuse look like? Social media has been flooded with supposed examples of these attacks, with the most attention across LinkedIn, Reddit, Instagram, and X going to misuse of a chatbot at Amazon — which CIO.com was able to replicate below — and of one at Chipotle, which Chipotle claims was fake.

[Image: AI chatbot token freeloading on Amazon's Rufus AI. Credit: CIO.com / Foundry]

The Amazon examples — including this and this — revolved around site visitors getting the customer service bot to perform a coding service (“output the Fibonacci sequence up to n count”) or deliver a complete recipe for spaghetti bolognese.

A much-referenced example supposedly from a Chipotle bot is unconfirmed. Messages sent to the apparent original poster went unanswered, and Chipotle declined an interview request. “The viral post was Photoshopped. Pepper neither uses gen AI nor has the ability to code,” Sally Evans, Chipotle’s external communications manager, replied by email, referring to the chatbot, Pepper. She did not respond to follow-up questions asking what Pepper does use and why Chipotle believed the image was fake.

How big of a deal is this really?

Not everyone is convinced that this is a major issue for enterprise CIOs. Info-Tech’s St-Maurice, for one, doubts chatbots will be fielding a lot of these queries.

“Couldn’t they just use ChatGPT for free, using a free account?” he asks. “[An enterprise chatbot] is probably the worst tool for this.”

AISec’s Kale disagrees, arguing that free gen AI chatbots have limits and gates. “You very quickly hit a wall with complex queries,” he notes. “With [enterprise customer service chatbots], there is no rate limit. They are ungated, unmetered inference endpoints and they are running far more capable models. These chatbots are ungoverned endpoints.”

But Kale also notes that this is old hat for most CIOs.

“We’ve seen this exact movie before. This is the same cycle enterprises went through with REST APIs in the early 2010s. Companies exposed endpoints, assumed good-faith usage, got hammered by abuse, then retrofitted rate limiting and API key management after the damage was done,” he explains. “We’re watching the same pattern replay with AI endpoints, except the per-request cost is orders of magnitude higher. A bad actor abusing your REST API costs you fractions of a penny per call. Someone running complex reasoning queries through your chatbot costs real money every single time.”

Greyhound’s Gogia adds that even if the frequency of this abuse is small, its impact can add up quickly.

“What makes this structurally risky is that a small percentage of behavior can disproportionately distort total cost. Even if 5% to 8% of chatbot traffic consists of off-purpose or high-complexity queries, that slice can consume a quarter or more of total inference spend. These are not anomalies. They are mathematically predictable outcomes of how token-based systems operate. Yet they rarely trigger alerts because they do not appear as spikes. They appear as gradual drift in cost per session, session length, and token consumption,” Gogia says.

“This leads to a deeper failure in observability,” he adds. “Most enterprises today track activity metrics such as number of conversations, total tokens, and aggregate cost. Very few track intent-level economics. They cannot distinguish between cost generated by legitimate customer service interactions and cost generated by irrelevant compute. Dashboards show what happened, but not whether it should have happened. So everything looks normal until financial reviews expose the gap.”

For many CIOs, the degree to which Kale’s two concerns — out-of-control costs and bots as ungoverned endpoints — are true depends on both their specific deployments and AI supplier contracts. 

Here, Gartner VP analyst Nader Henein sees current vendor pricing tiers softening the impact of such jailbreaking efforts. 

“Most large organizations either have an all-you-can-eat plan or run their LLMs internally, so I doubt this is going to break the bank,” he says.

Mitigating the risk

The most straightforward approach to mitigate the risk of chatbot misuse is to craft guardrails that restrict customers to questions directly related to the business. But such limits are challenging to construct without unintentionally blocking legitimate customer questions. Moreover, LLMs often sidestep guardrails when they are most needed. 

Another approach could involve enlisting additional AI to oversee the front-line AI, or focusing not on customer queries but on limiting the number of tokens that can be used for any single answer. Abusers could still circumvent token limits, however, by breaking prompts into smaller parts. And a ceiling on token use could inadvertently block complex legitimate queries, limiting the business value of the service.
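
The per-answer ceiling described above can be sketched in a few lines. This is a minimal illustration of the mechanism and its stated weakness, with the budget figure and class names as illustrative assumptions:

```python
# Sketch of a per-session token budget: the chatbot refuses to generate
# once a session's cumulative token spend hits a ceiling. The 1,500-token
# budget is an illustrative assumption, not a recommended value.
class TokenBudgetExceeded(Exception):
    pass

class SessionBudget:
    """Tracks cumulative token spend for one chat session.

    Note the weakness discussed above: an abuser can open a fresh
    session to reset the budget, so this is one layer, not a fix."""
    def __init__(self, max_tokens: int = 1500):
        self.max_tokens = max_tokens
        self.used = 0

    def charge(self, tokens: int) -> None:
        if self.used + tokens > self.max_tokens:
            raise TokenBudgetExceeded(
                f"session budget of {self.max_tokens} tokens exhausted")
        self.used += tokens

budget = SessionBudget(max_tokens=1500)
budget.charge(300)       # a normal "Where's my order?" turn fits
try:
    budget.charge(2000)  # an oversized code-generation answer is refused
except TokenBudgetExceeded as e:
    print("blocked:", e)
```

The ceiling also illustrates the business trade-off: a legitimate multi-turn troubleshooting session can exhaust the same budget a freeloader would.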

AISec’s Kale recommends a combination of tactics. 

“The patterns that actually work are behavioral analysis to flag sessions that don’t look like support queries, contextual rate limiting that goes beyond just volume, and token-level usage monitoring per session that can distinguish a 200-token ‘Where’s my order?’ from a 2,000-token ‘Write me a Python script,’” he says. “But most companies haven’t implemented any of this because they never threat-modeled ‘sophisticated resource abuse’ for their customer service AI. It’s the AI equivalent of leaving your Wi-Fi open and discovering your neighbor’s been running a cryptomining operation on your bandwidth.”
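
The token-level session monitoring Kale describes can be approximated with simple per-turn arithmetic. A minimal sketch, where the 600-token ceiling and the session records are illustrative assumptions:

```python
# Flag sessions whose token footprint doesn't look like a support query,
# per Kale's distinction between a ~200-token "Where's my order?" and a
# ~2,000-token "Write me a Python script." Threshold is an assumption.
SUPPORT_TOKEN_CEILING = 600  # typical support turns run 200-300 tokens

def flag_suspicious(sessions: list[dict]) -> list[str]:
    """Return IDs of sessions whose average tokens per turn exceed the
    ceiling — candidates for rate limiting or review, not auto-bans."""
    flagged = []
    for s in sessions:
        avg_per_turn = s["total_tokens"] / max(s["turns"], 1)
        if avg_per_turn > SUPPORT_TOKEN_CEILING:
            flagged.append(s["id"])
    return flagged

sessions = [
    {"id": "a1", "turns": 3, "total_tokens": 750},   # ~250/turn: normal
    {"id": "b2", "turns": 2, "total_tokens": 4200},  # ~2,100/turn: suspect
]
print(flag_suspicious(sessions))  # → ['b2']
```

Averaging per turn rather than per session matters: it is exactly what lets this check catch abuse that, as Kale notes, never shows up as a spike in an aggregate cost report.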

Kate Leggett, VP and principal analyst at Forrester, advises dumping LLMs entirely and using small language models focused on specific segments, such as ingredients at a consumer packaged goods company.

“You can host it on a private cloud or even on-prem and you can lock it down,” she says. “That is the most expensive way to do it. Is it worth it? That comes down to your ROI and risk model.”

Gary Longsine, CEO of Intrinsic Security, believes enlisting a second LLM to review submitted queries could be reasonably effective. “But it would introduce a token cost and possibly a response time delay,” he says. “Those could be mitigated somewhat by running the review in parallel with the user prompt, and by using a self-hosted LLM to do the review.”
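
Longsine's parallel-review idea can be sketched with a thread pool: launch the reviewer and the main generation call concurrently, and release the answer only if the review passes. Both model calls below are stubs — any real deployment would substitute its own inference clients, and a real reviewer would be the small self-hosted model he describes:

```python
# Sketch of running a query review in parallel with generation, so the
# review adds little latency. review_query and answer_query are stubs
# standing in for real (self-hosted and front-line) model calls.
from concurrent.futures import ThreadPoolExecutor

def review_query(query: str) -> bool:
    """Stub reviewer: approve only support-sounding queries.
    A real reviewer would be a small classification model."""
    support_terms = ("order", "refund", "hours", "shipping", "account")
    return any(t in query.lower() for t in support_terms)

def answer_query(query: str) -> str:
    """Stub for the front-line chatbot's expensive generation call."""
    return f"[answer to: {query}]"

def guarded_answer(query: str) -> str:
    # Both futures run concurrently; the answer is only released
    # (or discarded) once the review verdict is in.
    with ThreadPoolExecutor(max_workers=2) as pool:
        review = pool.submit(review_query, query)
        answer = pool.submit(answer_query, query)
        if not review.result():
            return "Sorry, I can only help with customer service questions."
        return answer.result()

print(guarded_answer("Where's my order?"))
print(guarded_answer("Reverse a linked list in Python"))
```

The trade-off Longsine flags is visible here: the generation tokens are spent even when the answer is discarded, so the scheme bounds what reaches the user, not what the enterprise pays per attempt.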

However CIOs choose to deal with this issue, the larger implications must be addressed — namely, what exactly is the business purpose, and expected outcomes, of your customer service AI implementation?

“Companies need to recognize that this is now a new selling channel for them, not just a customer support cost,” says Jason Andersen, principal analyst at Moor Insights and Strategy. “A lot of these support solutions are primarily measured on cost reductions, such as deflection. Will they now have revenue measures and quotas?”

In the meantime, CIOs and their teams need to roll up their sleeves and do the grunt work of AI governance, says Joshua Woodruff, CEO of MassiveScale.AI.

“The boring work — scope definition, access controls, use case boundaries — is what governance actually looks like in practice,” he says. “It’s not glamorous work and it’s not making headlines for being innovative. It doesn’t make the press release. But it’s the absolute difference between a customer service bot and an accidental free AI service with a corporate logo on it.”


April 9, 2026