For decades after the internal combustion engine arrived, auto repair shops could fix just about anything, for any make and model, with basic tools. Then came modernization: fuel-injected engines took over.
In the same way fuel injection replaced the carburetor and ensured nobody looks under the hood of their own car anymore, AI coding is replacing the serviceability of code, not its existence. Now, nobody is looking inside their web applications.
The risk of bad code appearing everywhere is one aspect of the problem. The deeper risk is the loss of foundational skills in servicing and maintaining code. There are no field-replaceable components when everything is a haphazard Rube Goldberg machine of software functions and libraries that may work when deployed but then becomes unmanageable and unmaintainable.
Apps run, but fewer people truly know why or how. As Pirsig explored in his book “Zen and the Art of Motorcycle Maintenance,” when we lose our relationship with the underlying machinery, we also lose our connection to quality itself. That loss of serviceability isn’t just inconvenient. It’s a new form of risk.
The evidence isn’t theoretical
Aikido’s 2026 State of AI in Security & Development report shows that one in five organizations has already suffered a serious incident from AI-generated code. Almost 70% have uncovered vulnerabilities introduced by AI assistants.
When AI introduces a flaw, nobody knows who owns the consequences. Asked who would ultimately be responsible for an AI-introduced breach, respondents split across engineering, security and the vendor, and disagreed over who is meant to enforce PR gating: a clear sign that governance hasn’t caught up with automation.
Early-career engineers now work almost entirely at the abstraction layer. They ship code faster, but with far less exposure to systems, networks or failure modes. That weakens the human judgment needed to challenge AI output before it reaches production. It’s all after the fact, and it feels as if humans are in the loop only to be thrown under the bus.
AI-generated code is an amplifier of an organization’s core values and security culture. If the organization has good security DNA, AI tools will improve that culture. But if an organization lacks discipline and basic risk management, autonomous software development will reflect that immature security culture.
When the algorithm fails, who gets blamed?
Take a proprietary trading firm that starts an agentic trading experiment. When the algorithm fails, the firm loses money. There’s little finger-pointing because it was an experiment with the firm’s own capital. Now let’s raise the stakes.
Consider a healthcare company that delivers advice through clinicians. More than one organization is looking to use LLMs as non-player characters (NPCs) whispering to medical professionals during patient intake discussions. You can be sure there will be finger-pointing when a patient suffers an unexpected outcome caused in part by a non-deterministic algorithm. Who is accountable for injury or loss of life? The company that wrote the algorithm? The QA team? The medical professional?
I can envision a council of NPCs checking each other’s work. Each is based on a different model, with different training data and different hallucination potential. The number will need to be odd to avoid a draw. You can see such work emerging even now with things like Agent Arena.
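To make the council idea concrete, here’s a minimal sketch in Python. The three model stubs, their canned answers and the escalation behavior are all hypothetical stand-ins; a real deployment would call independently trained models and would need far stronger safeguards than a majority vote.

```python
from collections import Counter

# Hypothetical stand-ins for independently trained models; in a real
# council, each would call a different provider's API.
def model_a(prompt: str) -> str: return "escalate to physician"
def model_b(prompt: str) -> str: return "escalate to physician"
def model_c(prompt: str) -> str: return "order additional labs"

COUNCIL = [model_a, model_b, model_c]  # odd membership prevents a tied vote

def council_vote(prompt: str) -> str:
    votes = Counter(member(prompt) for member in COUNCIL)
    answer, count = votes.most_common(1)[0]
    if count <= len(COUNCIL) // 2:
        # Disagreement is itself a signal: stop and defer to the human.
        raise RuntimeError("Council split; defer to the clinician")
    return answer

print(council_vote("patient reports chest pain during intake"))
# -> escalate to physician (2 of 3 members agree)
```

Even then, the council only papers over hallucination; the human still owns the final call.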
But let’s never forget that early LLM documentation warned against using LLMs in mental health programs. Their approach to problem-solving is to predict the most likely next token, so when depression and suicide dominate a conversation, an unguarded model can follow that statistical path toward the most dangerous suggestion.
In a well-known example, an AI coding agent reportedly deleted a production database during a code freeze, then apologized, acknowledging it had destroyed months of work. In general, QA and testing teams take the brunt of the blame. If we think of AI coding agents as junior developers prone to poor security practices, then letting them ship unchecked is a manager’s decision to prioritize speed over quality.
The insurance industry isn’t ready
I’ve only heard nervous laughter when mentioning insurance for losses from agentic AI. But it makes sense for insurers to create coverage for negligence, IP infringement and regulatory liabilities. AI liability insurance will have to grow as incidents occur. The opposite is also happening, with large insurers seeking AI exclusions from existing coverage given the unpredictability of non-deterministic systems.
We’re familiar with IVR systems asking us to “press two for Spanish.” Now imagine systems asking us to “press one to kill everybody.” The insurance industry won’t settle down until the early copyright infringement cases are litigated. LLMs are essentially copyright infringement as a service.
The dumbing down of coding
But back to the root of the problem. Junior developers aren’t being replaced so much as their work is changing. Entry-level SOC analysts babysit algorithms that decide whether log events are malicious. Marketing interns produce slide decks without graphic designers.
The term “slop” is applied to LLM outputs with decidedly pejorative intent. For the art world, this must feel like an affront. Art and design exist in constant dialogue with the past, reacting to prior movements with cultivated taste. Now visualizations are reduced to mundane aesthetics, churned out effortlessly and devoid of respect for that lineage.
We’re experiencing the software development equivalent of the same dumbing down. Someone wrote that we can be augmented by AI or abdicate to it. But abdication suggests stepping down from a throne of responsibility. Many AI coding users never attained that position in the first place.
The idea that a room full of monkeys with typewriters could produce the complete works of Shakespeare illustrates probability and inevitability. But it also illustrates that creativity and intentionality are essential elements of quality. If we deprive ourselves and future generations of those two traits, we’ve lost something significant.
Tool sprawl makes everything worse
Aikido’s research shows teams that suffered security incidents ran more vendor tools than those that didn’t. And yet the cycle keeps repeating: new security problems are invented so new tools can solve them. Some tools are the problem they claim to solve. Some cures are worse than the disease. Tool sprawl has been part of the security industry’s DNA since the advent of the firewall.
I tell my cybersecurity students something heretical: there are two kinds of security professionals. Certified and qualified. With a thin sliver of Venn diagram overlap. I prefer qualified talent. Many certified folks don’t understand how the internet works. Just remember that the first person to hand out a PhD didn’t have one.
There are also two kinds of CISOs. Pre-breach CISOs and post-breach CISOs. Pre-breach CISOs are about tools and software. Post-breach CISOs have learned all tools will fail eventually. People and processes are most important. Too many tools is a problem. AI-generated code exacerbates it.
The gap between policy and practice
A SOC2 auditor searches for the delta between written policy and practiced technique. If your published policy says customer data lives in a world-readable S3 bucket, and that is exactly what you do, the auditor has no qualms. Why? SOC2 isn’t a certification. It’s an accountant’s opinion on whether you follow your own policies.
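To make “practiced technique” concrete, here is a minimal sketch of the kind of check that question implies, assuming boto3 with configured AWS credentials; the bucket name and the written-policy flag are hypothetical.

```python
import boto3
from botocore.exceptions import ClientError

WRITTEN_POLICY_SAYS_PUBLIC = False    # what your written policy claims
BUCKET = "customer-data-example"      # hypothetical bucket name

s3 = boto3.client("s3")
try:
    # Ask AWS whether the bucket's attached policy makes it public.
    status = s3.get_bucket_policy_status(Bucket=BUCKET)
    actually_public = status["PolicyStatus"]["IsPublic"]
except ClientError:
    # No bucket policy attached at all: not public via bucket policy.
    actually_public = False

if actually_public != WRITTEN_POLICY_SAYS_PUBLIC:
    print(f"Finding: {BUCKET} practice diverges from the written policy")
else:
    print(f"{BUCKET}: written policy and practice agree")
```

The point isn’t the dozen lines of code; it’s that the written claim and the live check have to be compared at all.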
You can get into hot water with your policies and procedures in two ways: perfect policies that are unenforceable, or no written policies at all. Many companies jump from the latter to the former and don’t understand why they suddenly have so many findings. So be wary of asking an LLM to write your policies; the result is often “too good,” far more advanced than your actual enforcement capabilities.
When it comes to AI governance policy, boards should ask:
- Show me one example from the last 90 days where AI-generated code was blocked because of our governance policy.
- Can AI-generated code reach production without human review? If not, prove it.
- Can you trace which parts of production were AI-generated?
- What systems are AI tools forbidden from touching and who enforces that?
The policy is decorative if it never causes a block, code goes unreviewed, provenance is untraceable or critical areas aren’t off-limits.
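To show what a non-decorative answer could look like, here’s a minimal CI sketch in Python. It assumes a hypothetical team convention of tagging commits with “AI-Assisted: true” and “Reviewed-by:” git trailers; any commit claiming AI assistance without a recorded human reviewer fails the build.

```python
import subprocess

# %x1f (unit separator) delimits fields; `git log -z` delimits records
# with NULs, so multi-line commit messages can't break the parsing.
FMT = ("%H%x1f"
       "%(trailers:key=AI-Assisted,valueonly,separator=%x2C)%x1f"
       "%(trailers:key=Reviewed-by,valueonly,separator=%x2C)")

log = subprocess.run(
    ["git", "log", "-z", f"--format={FMT}"],
    capture_output=True, text=True, check=True,
).stdout

unreviewed = []
for record in filter(None, log.split("\x00")):
    sha, ai_flag, reviewer = (record.split("\x1f") + ["", ""])[:3]
    if ai_flag.strip().lower() == "true" and not reviewer.strip():
        unreviewed.append(sha[:12])

if unreviewed:
    raise SystemExit(f"{len(unreviewed)} AI-assisted commit(s) lack a human "
                     f"reviewer trailer: {', '.join(unreviewed[:5])}")
print("Every AI-assisted commit carries a human reviewer trailer.")
```

Run as a required CI step, a check like this gives the provenance and review questions above auditable answers instead of assurances.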
The digital potato famine
I’ve predicted a scenario in which a threat actor neglects to adequately QA their malware and bricks every iOS device on the latest OS version. That’s the monoculture element: a significant percentage of iOS users are always on the latest version of the software. AI might write malware that even Apple’s Genius Bar cannot reset.
At the other end of the spectrum, Android spans so many OEM variants across more than 15 major versions that it has natural immunity to a digital potato famine event. Apple, however, will happily sell replacement devices to everyone in the world.
NIST CSF 2.0 moves the Govern function to center stage, but those controls won’t work if software and security expertise keep thinning. AI amplifies good and bad patterns alike, spreading flawed logic quickly. Monoculture risk increases as teams rely on the same AI-generated structures. When we lose the ability to service our own systems, we lose quality itself.
What boards should ask
Boards should ask: “Can you show me exactly where AI-generated code is running in production right now and who is accountable for its behavior?”
CISOs need to answer: “We can identify every instance of AI-generated code, track who approved it, demonstrate how it was reviewed and explain what guardrails prevented it from touching high-risk systems.”
If you can’t answer that question, you don’t have governance. You have well-written fiction.
This article is published as part of the Foundry Expert Contributor Network.