I’ve sat in enough risk meetings to know the sound a bad surprise makes before anyone names it. It usually starts with a pause. Then a throat gets cleared. Then someone says, “We may need to bring the CISO into this.”
That happened over a developer tool.
Not a breach. Not a regulator. Not ransomware at 2:00 a.m. A coding assistant.
At first, I thought the reaction was overcooked. I’d seen the same pattern in other boardrooms and delivery teams. A new tool appears. Engineers like it because it saves time. Leadership likes it because it promises more output without hiring half a city. Security hates it because security has the social burden of being the adult in the room when everyone else is buying fireworks.
I backed the rollout because the case was clean on paper. Developers were drowning in repetitive work. Deadlines were tightening. Technical debt had started breeding in the dark. The assistant could draft tests, explain old code, suggest refactors and help junior engineers stop treating Stack Overflow like an underground pharmacy. And this was no longer fringe behavior. In 2025, Microsoft said that 15 million developers were already using GitHub Copilot, and the tool has spread further since then.
So yes, I approved it.
Then security nearly revolted.
That week taught me something I now say to clients more bluntly than I used to. AI coding tools do not just change software delivery. They change the terms of trust inside the company. They force you to answer ugly questions about control, proof, accountability and review discipline. Most public coverage still stares at productivity. The harder story sits elsewhere. Governance.
The part that looked sensible
The truth is, I didn’t approve the tool because I was dazzled. I approved it because I’ve spent years watching good people waste good hours on bad repetition.
You can only tell a team to “be strategic” so many times before they start laughing at you. Developers were buried under boilerplate, documentation drift, brittle legacy code and the kind of ticket churn that makes bright people look tired. A coding assistant looked like a relief. Not magic. Relief.
That distinction matters.
In advisory work, I’ve learned that many poor decisions do not begin as foolish decisions. They begin as reasonable decisions made inside an outdated control model. That’s what this was. The business case made sense. The mistake was assuming the old review system could keep up with the new speed.
That old assumption dies hard. Leaders often think software risk changes when the code changes. In practice, it changes earlier, when the production conditions change. If a machine now drafts what humans once wrote line by line, the issue is not only code quality. It is code volume, code origin and the shrinking time between suggestion and production.
That is a different risk shape.
Why security lost its patience
The security team was upset because they could see the math.
Code output was about to rise. Review time was not.
That gap is where trouble rents office space.
Many non-security leaders still imagine the concern is simple. “The AI might write bad code.” That’s the kindergarten version. The real concern is broader and nastier. Who reviewed the output? What hidden package did the model nudge into the build? What sensitive context got pasted into the prompt window? Which junior engineer trusted the suggestion because it sounded calm and looked polished? Which policy assumed human authorship when the draft came from somewhere else?
Those are not philosophical questions. They are operating questions.
Recent security work has made this much harder to dismiss. Snyk described a February 2026 case in which a vulnerability chain turned an AI coding tool’s issue triage bot into a supply chain attack path. That is the sort of sentence that makes security teams sit up straight and ask for names, logs and meeting invites.
And that is before you get to the quieter problem. AI-generated code can look tidy long before it is safe. Security people know that neat syntax can hide weak controls, lazy validation, poor handling of secrets and dependency choices nobody meant to own.
So when the team escalated, they weren’t staging a mutiny over a plugin. They were reacting to a change in production logic that nobody had yet governed.
What the fight was really about
Once the temperature dropped, the shape of the dispute became obvious to me. It was not engineering versus security. It was speed versus proof.
More precisely, it was four things:
- Velocity. The assistant increased output far faster than assurance could keep up.
- Visibility. We did not have clear sight of where the tool was used, what prompts were fed into it, what code it influenced or what external components it smuggled into the codebase.
- Validation. Existing checks were built for a world in which humans produced most of the first draft. That world is fading. When code generation speeds up, review cannot stay ceremonial.
- Governance. Nobody had written the rules that mattered most. Which use cases were fine? Which were off-limits? Who owned the risk of acceptance? What evidence would prove that the tool was used safely enough?
That last point gets too little airtime. Governance sounds dull until you don’t have it. Then it becomes the difference between controlled use and polite chaos.
NIST’s recent work on monitoring deployed AI systems makes the same point more broadly. Organizations need post-deployment measurement and monitoring because real-world behavior drifts, surprises occur and governance after launch remains immature. Different setting, same lesson. You cannot inspect your way out of weak operating design.
What we did next
We did not ban the tool. That would have been theatre dressed as courage.
We also did not wave it through and tell security to “partner more closely.” I’ve heard that sentence enough times to know it usually means, “Please absorb more risk with better manners.”
We did something less dramatic and more useful. We narrowed the rollout and rewrote the conditions of trust.
Low-risk use cases stayed in play. Drafting tests. Explaining old functions. Helping with documentation. Suggesting boilerplate. Those were manageable.
High-risk areas got tighter boundaries. Auth flows. Secrets handling. Encryption logic. Infrastructure-as-code for sensitive environments. Anything tied to regulated data or material security controls. Those needed a stricter review or stayed out of scope.
We also drew a hard line on prompt hygiene. No customer data. No credentials. No confidential architecture details dropped into a chat window because someone wanted a faster answer on a Friday afternoon. You would think that goes without saying. It does not.
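To make that concrete, here is a deliberately small sketch of what prompt hygiene can look like as a control rather than a slogan: a pre-flight check that scans anything bound for the assistant for credential-shaped strings or customer identifiers before it leaves the developer’s machine. The pattern names and rules below are my illustrative assumptions, not a recommended rule set, and passing the check is not proof of safety.

```python
# Illustrative only: a minimal pre-flight filter for text destined for an
# external coding assistant. The deny patterns are assumptions for the sketch.
import re
import sys

# Hypothetical deny patterns: credential-shaped strings and customer identifiers.
DENY_PATTERNS = {
    "private key": re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),
    "aws access key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "bearer token": re.compile(r"\bBearer\s+[A-Za-z0-9\-._~+/]{20,}\b"),
    "password assignment": re.compile(r"(?i)\bpassword\s*[:=]\s*\S+"),
    "email address": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}

def check_prompt(text: str) -> list[str]:
    """Return the names of any deny patterns found in the prompt text."""
    return [name for name, pattern in DENY_PATTERNS.items() if pattern.search(text)]

if __name__ == "__main__":
    prompt = sys.stdin.read()
    findings = check_prompt(prompt)
    if findings:
        print("Blocked: prompt appears to contain " + ", ".join(findings))
        sys.exit(1)
    print("OK: no obvious sensitive material found (which is not proof of safety).")
```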
Then we raised the review standard. Human sign-off meant real sign-off, not a quick skim and a merge. Scanning had to cover dependencies and code changes with more discipline. Provenance mattered more. Logging mattered more. Exception paths had to be explicit, not social.
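And here is an equally small sketch of what “real sign-off” can mean once you write it down: a merge gate that refuses AI-assisted changes to high-risk paths unless two humans have actually approved them. The path prefixes, the “ai-assisted” label and the approval threshold are assumptions I made up for the illustration; the point is that the rule becomes explicit and testable rather than social.

```python
# Illustrative only: a minimal merge gate expressing "stricter review for
# high-risk areas" as code. Paths, labels and thresholds are assumptions.
from dataclasses import dataclass

# Hypothetical high-risk zones: auth flows, crypto, secrets, sensitive IaC.
HIGH_RISK_PREFIXES = ("src/auth/", "src/crypto/", "secrets/", "infra/prod/")

@dataclass
class PullRequest:
    changed_files: list[str]
    labels: set[str]          # e.g. {"ai-assisted"}, set by tooling or the author
    human_approvals: int      # distinct human reviewers who approved

def review_gate(pr: PullRequest) -> tuple[bool, str]:
    """Return (allowed, reason) for whether the merge may proceed."""
    touches_high_risk = any(f.startswith(HIGH_RISK_PREFIXES) for f in pr.changed_files)
    ai_assisted = "ai-assisted" in pr.labels
    if ai_assisted and touches_high_risk and pr.human_approvals < 2:
        return False, "AI-assisted change in a high-risk area needs two human approvals"
    if pr.human_approvals < 1:
        return False, "Every change needs at least one real human sign-off"
    return True, "Review requirements met"

# Example: an assistant-drafted change to auth code with a single skim-and-merge
# approval is held back by the gate.
pr = PullRequest(changed_files=["src/auth/session.py"], labels={"ai-assisted"}, human_approvals=1)
print(review_gate(pr))
```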
Most importantly, security moved from late-stage critic to co-designer. That changed the tone. The question stopped being, “Can we use this?” and became, “Under what conditions can we trust its use enough to defend it later?”
That small shift matters more than many policy documents.
What both sides got right — and wrong
Developers were right about the waste. They were right that these tools remove drudgery. They were right that refusing every new capability is not a strategy. A team that cannot experiment eventually decays into compliance theatre and backlog sorrow.
They were wrong to assume readable code is trustworthy code. They were wrong to treat assistance as neutral. Tools shape behavior. That is what tools do. Once suggestions arrive fast and fluently, people accept more than they admit.
Security was right about review debt. Right about supply chain exposure. Right about data leakage risk. Right that governance should not arrive three incidents late, wearing a blazer and a lessons-learned slide.
They were wrong at first, as many security teams are when they feel cornered. They made the conversation sound like a moral referendum. That never helps. If security cannot offer a usable path, the business routes around it. Then you get the worst of both worlds: secret adoption and public optimism.
I don’t say that with smugness. I say it because I’ve watched good teams damage each other by defending the right thing in the wrong way.
The bigger lesson for leaders
This is where the story stops being about one rollout and starts becoming board material.
If your developers can now produce more code with less effort, your governance burden rises even if your headcount does not. The old ratio between output and oversight has broken. Many firms have not adjusted.
That matters because software governance is no longer just about secure coding standards or release gates. It is about production conditions. Who can generate? Under what rules? With what evidence? Across which risk zones? With whose approval? And if something goes wrong, who owns the final act of acceptance?
Those questions sound administrative until the first incident report lands, and nobody can explain whether the flawed logic was written, suggested, copied, reviewed or merely assumed.
The market is moving quickly. Microsoft’s own recent security reporting says organizations adopting AI agents need observability, governance and security now, not later. Snyk is making a similar argument from the perspective of the software supply chain. Visibility first. Then prevention. Then governance that holds under pressure.
That is why I now advise something that used to sound severe and now sounds merely accurate. If you deploy AI coding tools without redesigning your control model, you are not buying productivity. You are buying ambiguity at machine speed.
What you should ask before you approve the next tool
You do not need a grand doctrine. You need a few hard questions asked before excitement turns into policy by accident.
Where can this tool be used, and where can’t it be used?
What data may enter it?
How will you know when the generated code reaches production?
What review standard applies when the first draft came from a machine?
Who can approve exceptions?
What logs, scans and decision records will let you defend the setup six months later, when memories blur and staff rotate?
That is not bureaucracy. That is self-respect.
I still believe these tools have value. I’d be foolish not to. But I trust them the way I trust a very fast junior colleague with a beautiful writing style and uneven judgment. Useful. Impressive. Worth keeping. Not someone you leave unsupervised near the crown jewels.
The near-mutiny turned out to be healthy. It forced the truth into the room before a failure did. Security was not blocking progress. They were objecting to unmanaged speed. Developers were not being reckless. They were asking for relief from the grind. Leadership’s job was not to pick a side. It was to write a better contract between them.
That is the part that too many firms still miss.
The argument was never only about a coding assistant. It was about whether we still knew how to govern work once the work started moving faster than our habits. That is a much bigger story. And if you listen carefully, you can hear it starting in many companies right now.
This article is published as part of the Foundry Expert Contributor Network.