Developers still don’t trust AI-generated code

It may come as no surprise that a huge percentage of developers don’t trust AI-generated code, but many also say it’s becoming more difficult to check for errors created by coding assistants.

As AI coding assistants take over an ever-increasing amount of programming work, human coding teams are struggling to find the time to spot the errors, development leaders say. Coding assistants are improving the quality of their output, but they have also tended to become more verbose — writing more lines of code to fix a problem — making it harder to spot errors when they pop up, according to coding experts.

As a result, a major problem with AI-generated code is arising: developers sometimes spend more time reviewing AI output than they would have spent writing the code themselves.

While AI coding assistants have become ubiquitous, developers shouldn’t trust them, says Alex Lisle, CTO of deepfake detection platform Reality Defender.

All the software engineers Lisle works with use coding assistants in some capacity, but the company’s developers keep a close eye on their output, he says. “The truth of the matter is most of my developers don’t use AI-generated code for more than boilerplate and fixing a few little things,” he says. “We don’t trust AI-generated code at all.”

The responsibility for the quality of the code resides with the developer using the coding assistant, Lisle adds.

Cranking out code

The volume of code that can be generated by AI tools creates its own problems, Lisle contends.

“It’s kind of like having a junior developer who can write a very large amount of code very quickly,” he explains. “The problem with that is it doesn’t understand the code and the broader context. It often does the opposite of what you ask it to do.”

Overreliance on AI-generated code can lead to a code base that’s impossible to understand, Lisle says.

“The problem is as soon as you start leveraging it in a broader context, it creates an incredibly unstable and unknowable code,” he adds. “You can get an AI to generate hundreds of thousands of lines of code, but it’s very difficult to maintain, very difficult to understand, and in a production environment, none of that is suitable.”

Microsoft-focused coding firm Keypress Software Development Group has had mixed results with AI coding assistants, says Brian Owens, president and senior software architect there.

AI-generated code often requires only minimal review for small, self-contained use cases, but for production-grade applications, the output can be inconsistent and problematic, he says.

“We’ve found that AI tools will occasionally ignore key aspects of the existing codebase or fail to align with established coding standards and architectural patterns,” Owens says. “That creates additional work for our team in the form of review, refactoring, and rework to ensure the code is production ready.”

Owens has found that using a coding assistant may not save time over a human developer writing the code. “In some cases, the time spent validating and correcting AI-generated code can offset the expected efficiency gains — and occasionally takes more time than if a developer had written the code without AI assistance,” he says.

Major lack of trust

A recent survey of more than 1,100 IT professionals by code quality tool provider Sonar backs up the trust concerns voiced by some development leaders. While 72% of those surveyed say they use coding assistants every day, 96% say they don’t fully trust AI-generated code.

At the same time, fewer than half of developers say they always check their AI-generated code before committing it, with nearly four in 10 saying that reviewing AI-generated code requires more effort than reviewing code written by their human coworkers.

As the quality of coding assistants improves, developers are finding it more difficult to find errors, says Chris Grams, vice president of corporate marketing at Sonar — and not because they aren’t there.

“As these coding models get better and better, you have a little bit of a needle-in-a-haystack problem, where there may be fewer and fewer issues overall, but those issues are going to be a big security issue that’s well hidden and hard to find and could be the thing that takes down an application,” he says.

While many software development leaders say they don’t trust AI-generated code, the issue may be more nuanced, says Mark Porter, CTO at data analytics solutions provider dbt Labs.

Coding assistants are widely used at dbt Labs, he says, and trusting their output depends on the context.

“Trusting AI written code is much like trusting human-written code,” he adds. “In general, I look for the same trust signals in AI-generated code as any other code. If it was created via a high-integrity process, I trust it just like I would trust human code.”

The review bottleneck

But AI-generated code can be faulty in different ways than human-written code, with AI often adding complexity and creating overly confident comments, Porter says. The trust equation must adapt to these unique challenges.

Reviewing AI code also creates its own challenges, adding to the developer’s reviewing time, even as it saves coding time, he says. “The output volume is high, so it shifts the bottleneck from producing code to reviewing it,” he adds.

Human reviewers also need to maintain expert-level familiarity with the codebase, Porter says, a challenge that can be mitigated through software engineering best practices.

“There are certainly unique challenges to reviewing AI-written code,” he adds. “I think the question of trusting AI code is missing the point a bit; what I’m thinking about is how to build processes and guidelines and training that support my engineers using AI to assist them in writing efficient and correct code that is also maintainable — and that’s the future of AI in coding, in my opinion.”
