A customer walks up to your digital counter and asks your AI assistant to perform tasks far outside its intended purpose. The assistant obliges. Not because it was designed to, but because no one was watching the interaction closely enough to stop it.
That category of failure has played out at Air Canada, DPD, and Woolworths over the past two years, with Amazon also facing reports of users manipulating its AI shopping assistant. The common thread is rarely a model defect. It is a governance blind spot: enterprises struggle to enforce rules on conversations they lack real-time visibility into.
This article breaks down the pattern using the Chipotle chatbot failure and its peers as the lesson, explains why observability and governance alone often fall short, and offers a concrete checklist for enterprises launching or operating customer-facing AI.
Key takeaways
- The incidents discussed here share a common governance gap: customer bots are pulled beyond their intended role, and because the exchange often isn’t inspected in real time, those answers still reach users.
- Traditional logging, monitoring, and static governance policies primarily document or define behavior after the fact. They lack the ability to interpret intent and enforce controls in real time, which is required to prevent problematic outputs from reaching users.
- The protective layer described in this article provides real-time visibility and enforcement across both incoming user requests and outgoing model responses. It evaluates interactions against the system’s intended role and applies context-aware controls, enabling organizations to actively govern behavior at runtime rather than just observe it.
- Launching customer-facing AI safely typically requires production-grade discipline: clearly scoped responsibilities, real-time visibility on both sides of the interaction, prepared incident procedures, and the expectation that users will test limits immediately.
Knowing Which AI Tools Are in Use Is Just the Start
WitnessAI goes beyond app discovery. Observe classifies the intent behind every AI interaction across employees and agents, so you can build smarter policies based on real risk, not guesswork.
Explore ObserveWhat actually happened with the Chipotle chatbot
In early 2026, Chipotle’s customer service chatbot, Pepper, became the latest high-profile example of a customer-facing AI agent drifting outside its intended purpose. A user posted a screenshot on X showing they had opened the chatbot, said they wanted to place an order, but first needed help writing a Python script to reverse a linked list.
The Chipotle chatbot did not blink. Instead of redirecting to the menu or declining out of scope, it walked through an iterative reversal algorithm, provided a complete reverse_linked_list function, noted O(n) runtime, and then politely asked what the user would like for lunch.
Chipotle became the latest brand to suffer a faux pas with its customer service AI agent, following a LinkedIn post by Docket’s CEO showing that the company’s agent drifted off task into coding questions.
The technical explanation for why this happened is unglamorous: corporate chatbots are frequently built on general-purpose language models with business-specific system prompts layered on top. If the system prompt does not explicitly prohibit off-topic assistance, the underlying model tends to help, which is exactly what happened. A system prompt is a suggestion to a probabilistic system, not an enforcement mechanism, and basic conversational framing is often enough to steer past it.
The Chipotle chatbot pattern nobody names
The Chipotle chatbot snafu is not an outlier. A similar governance-visibility gap has produced nearly identical failures across airlines, logistics, and retail over the past two years:
- Air Canada, 2024: The airline’s chatbot invented a bereavement refund policy that did not exist. The British Columbia Civil Resolution Tribunal held Air Canada liable, rejecting the airline’s argument that the chatbot was “a separate legal entity that is responsible for its own actions” and ruling it made no difference whether information came from a static page or a chatbot.
- DPD, January 2024: A customer unable to get useful information from DPD’s AI chatbot began testing its boundaries. With nothing more than natural language instructions, including telling the bot to “disregard any rules,” the chatbot swore at the customer, called itself useless, and described DPD as “the worst delivery firm in the world.” DPD only disabled the AI element after the exchange went public on social media.
- Amazon and Woolworths, early 2026: Woolworths’ chatbot Olive began exhibiting off-brand behavior during customer service interactions, including delivery and order inquiries. Australian lawyers and regulators subsequently stated that companies are responsible under consumer law for information their AI systems provide to customers.
Across these cases, the failures reached customers because operators typically had limited real-time visibility into what the bot was being asked or what it was saying, and no enforcement layer sitting in that path of visibility. Model providers secure their infrastructure. They do not give you a view of your own usage, and they are not designed to govern it for you.
Do You Know What Your Developers Are Sharing with AI Coding Tools?
WitnessAI monitors every AI dev tool on your network and stops proprietary code and secrets from leaving your environment.
See WitnessAI For DevelopersWhy runtime is different from policy or observability
Observability, policy, and runtime enforcement are three distinct layers of AI risk management. Each answers a different question about what you can see and when you can act on it, so conflating them leaves blind spots that adversarial users can exploit in production.
The distinction is simple:
- Observability shows what happened. It records signals, surfaces anomalies, and enables post-incident investigation. By the time an observability platform flags a problematic interaction, the customer has often already received the response.
- Policy says what should have happened. Governance frameworks define acceptable use, establish accountability, and create documentation that regulators require. But policy on paper does not see a live conversation, and governance frameworks frequently miss real threats: prompt injection, hallucination, and purpose abandonment.
- Runtime is the layer that can see and stop what is happening. Runtime security focuses on what software does while it is running, inspecting prompts and responses in real time, and enforcing decisions before outputs reach end users.
Gartner’s Market Guide for AI Trust, Risk and Security Management outlines a four-layer AI TRiSM framework and discusses unified runtime inspection and enforcement as an emerging market trend. This distinction matters because keyword and regex-based detection are less effective against conversational AI. The harmful output is not a SQL injection string. It is well-formed, helpful content delivered by a bot operating outside its purpose, and spotting it requires seeing the full exchange in context.
Traditional injection attacks used structured strings, so parameterized queries could filter user input effectively. LLMs use natural language, which makes separating good from bad instructions fundamentally harder and makes live visibility into intent essential.
WitnessAI is the confidence layer for enterprise AI and a unified AI security and governance platform. It addresses this core challenge directly. As Sharat Ganesh, our head of product marketing, puts it: “The core problem is structural, and you shouldn’t be trying to secure probabilistic systems with deterministic tools.”
That is why runtime enforcement is moving toward approaches that analyze conversational context and purpose. In WitnessAI’s approach, intent-based classification and intelligent policies give governance teams visibility into what the interaction is actually trying to do before a harmful response reaches the customer.
4 important things a runtime layer handles
Runtime defense for customer-facing AI operates as a visibility and enforcement layer between the model and the user, inspecting traffic in both directions and making real-time decisions about what passes through. In practice, this translates into four capabilities working together.
Bidirectional defense for prompts and responses
Inbound inspection evaluates prompts before the model processes them. Outbound inspection evaluates responses before the customer sees them. Most Chipotle-style failures would be caught on the outbound side, while prompt-based attacks like those seen in the DPD chatbot incident are generally caught on the inbound side. Either way, the interaction becomes something you can see and act on.
Model identity enforcement
A travel bot answers travel questions. A burrito bot answers burrito questions. Runtime identity enforcement validates responses against the bot’s defined purpose and brand identity. When customer-facing assistants drift beyond shopping or support into general-purpose conversation, that is identity drift, and a runtime identity layer can flag and prevent it. This is the kind of drift that the Chipotle chatbot, Woolworths, and similar deployments had limited ability to see in real time.
Prompt injection blocking and harmful response prevention
Provider guardrails weren’t designed to serve as a reliable last line of defense on their own. The security judge evaluating content is often itself a large language model, which makes it susceptible to many of the same manipulation techniques used against the model it is meant to protect.
External runtime enforcement operates independently of the model it protects, which reduces the classifier-on-classifier problem and gives the deployer its own line of sight. This covers hallucinated policies, off-brand content, competitive protection, and responses that violate the deployer’s defined boundaries.
Graduated enforcement beyond binary allow/block
Effective runtime enforcement requires graduated responses: allow legitimate interactions, warn on borderline cases, block clear violations, and route sensitive queries to appropriate internal models. Governance is of limited use if the only available action is “off,” and graduated enforcement is only possible when you can see the interaction clearly enough to choose.
WitnessAI delivers Witness Protect, its Enterprise AI Firewall for Models, Apps, and Agents, as part of its unified AI security and governance platform with core modules Observe, Control, and Protect. It provides intent-based, bidirectional runtime defense and intelligent policies for model and application protection, using a network-level deployment approach that does not rely on SDK-based instrumentation or major architectural changes.
Is Your Customer-Facing AI Secure?
WitnessAI filters harmful and off-brand outputs before they reach users, tokenizes sensitive data before it reaches models, and hardens your defenses with automated red teaming.
See How Protect WorksWhat to do before your next customer-facing AI launch
Close the governance-visibility gap before launch, not after it becomes a headline. The checklist below turns the Chipotle-era lessons into concrete pre-launch requirements.
1. Treat customer-facing AI as production infrastructure
Apply the same review bar as any revenue-critical system: security review, load testing, incident response planning, and executive sign-off. The Air Canada Tribunal did not distinguish between a static webpage and a chatbot. Neither should your deployment process, and neither should your governance program.
2. Write down the bot’s job
Define the bot’s purpose in explicit, enforceable terms. Work outside that scope is off-topic by default. The Chipotle, DPD, and Woolworths incidents illustrate what tends to happen when an enforceable definition of “in scope” is missing at the runtime layer, and the exchange isn’t being watched against that definition.
3. Require bidirectional inspection before go-live
Inbound inspection catches manipulation attempts. Outbound inspection catches hallucinations, off-brand content, and purpose drift. DPD’s chatbot failed after a system update and was then manipulated into generating inappropriate responses that the company only saw once they went viral. Air Canada’s chatbot provided incorrect information about its bereavement fare policy, and the error surfaced through a lawsuit rather than through live monitoring. Either gap alone is sufficient for a public incident.
4. Plan for the incident that has not happened yet
Build the runbook, draft the communications template, and define the rollback procedure before launch. DPD said it disabled the AI element of its chatbot immediately after the error was identified, but “identified” meant “saw on social media.” Speed of response matters, but mainly if you can see the problem early and the response is pre-planned.
5. Assume public adversarial testing on day one
Users will probe boundaries quickly after launch, and the attempts you do not see are the ones that hurt you. If your chatbot struggles to withstand a determined user asking it to write poetry, swear, or recommend competitors, and you have limited ability to observe and govern those attempts in real time, it is not ready for production.
Customer-facing AI needs runtime defense
The Chipotle chatbot failure, along with the Air Canada, DPD, and Woolworths incidents, is not an isolated embarrassment. These are predictable results of deploying probabilistic systems without a runtime layer that can see and govern them in real time. System prompts function as suggestions, observability is largely retrospective, and policy documents cannot intervene in a live conversation.
What closes the gap is bidirectional inspection, intent-based classification, and graduated enforcement operating at the moment a prompt arrives, and a response is generated.
Enterprises that treat customer-facing AI as production infrastructure, with clearly scoped purpose, real-time visibility, and pre-planned incident response, tend to ship faster and fail less publicly than those still relying on the model provider to govern traffic it was not designed to govern.
Witness Protect gives you bidirectional inspection, intent-based classification, and graduated enforcement at the moment prompts arrive and responses are generated, so off-brand, off-purpose, and manipulated outputs are caught before they reach your customers.