
Watch Out, Software Engineers: ChatGPT Is Now Finding, Fixing Bugs in Code

A new study asks ChatGPT to find bugs in sample code and suggest a fix. It works better than existing programs, fixing 31 out of 40 bugs.

January 27, 2023
(Credit: Bloomberg / Contributor / Getty Images)

AI bot ChatGPT has been put to the test on a number of tasks in recent weeks. Its latest challenge comes courtesy of computer science researchers from Johannes Gutenberg University and University College London, who find that ChatGPT can weed out errors in sample code and fix them better than existing programs designed for the job.

Researchers gave 40 pieces of buggy code to four different code-fixing systems: ChatGPT, Codex, CoCoNut, and a standard automated program repair (APR) approach. For ChatGPT, they copied and pasted each snippet into the chat window and essentially asked: "What's wrong with this code?"

On the first pass, ChatGPT performed about as well as the other systems. ChatGPT solved 19 problems, Codex solved 21, CoCoNut solved 19, and standard APR methods figured out seven. The researchers found ChatGPT's answers to be most similar to Codex's, which was "not surprising, as ChatGPT and Codex are from the same family of language models."

However, the ability to, well, chat with ChatGPT after its initial answer made the difference, ultimately leading it to solve 31 of the 40 bugs and easily outperform the others, which produce a single, static answer.

"A powerful advantage of ChatGPT is that we can interact with the system in a dialogue to specify a request in more detail," the researchers' report says. "We see that for most of our requests, ChatGPT asks for more information about the problem and the bug. By providing such hints to ChatGPT, its success rate can be further increased, fixing 31 out of 40 bugs, outperforming state-of-the-art."

They found that ChatGPT was able to solve some problems quickly, while others took more back and forth. "ChatGPT seems to have a relatively high variance when fixing bugs," the study says. "For an end-user, however, this means that it can be helpful to execute requests multiple times."

For example, when the researchers asked the question pictured below, they expected ChatGPT to recommend replacing n^=n-1 with n&=n-1, but the first thing ChatGPT said was, "I'm unable to tell if the program has a bug without more information on the expected behavior." On ChatGPT's third response, after more prompting from researchers, it found the problem.

Code for ChatGPT Study
(Credit: Dominik Sobania, Martin Briesch, Carol Hanna, Justyna Petke)
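The snippet in question counts the 1-bits in an integer using a well-known bit-manipulation trick. A minimal sketch of the fixed version (the function name here is illustrative, not taken from the study's materials): with the buggy `n ^= n - 1`, the loop never terminates for most inputs, because XOR-ing `n` with `n - 1` does not steadily remove bits; the expected fix, `n &= n - 1`, clears the lowest set bit on each pass.

```python
def bitcount(n):
    # Kernighan's trick: n & (n - 1) clears the lowest set bit,
    # so the loop runs exactly once per set bit in n.
    count = 0
    while n:
        n &= n - 1  # the researchers' expected fix; the buggy code used n ^= n - 1
        count += 1
    return count
```

For example, `bitcount(7)` returns 3, since 7 is binary 111. With the buggy XOR version, an input like 1 gets stuck (1 XOR 0 is still 1), so the loop spins forever.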

However, when PCMag entered the same question into ChatGPT, it answered differently. Rather than asking what the expected behavior was, it guessed. OpenAI can use conversations with users to refine its models, so ChatGPT may have picked up what this piece of code is intended to do, perhaps from the researchers who ran the study. The exchange we had was different from the researchers', and it will likely be different the next time as well.

ChatGPT answer (Credit: Emily Dreibelbis/ChatGPT)

The study's results could reshape the existing $600 million industry dedicated to helping software engineers find and fix bugs. Popular platforms such as Sentry have become standard tools on software teams, speeding up the path to working code by flagging issues and suggesting fixes.

ChatGPT is causing upheaval elsewhere, too: Google issued a "code red" over its impressive search results, and teachers are shutting down student access to prevent cheating. ChatGPT also recently passed an MBA exam issued by a Wharton professor, though just barely.

Companies that create bug-fixing software, and software engineers themselves, are taking note. However, an obvious barrier to tech companies adopting ChatGPT on a platform like Sentry in its current form is that it is a public service that may retain what users type into it (the last place a company wants its engineers to send coveted intellectual property).

ChatGPT's next move is launching a paid version, reportedly for $42 per month.

Editors' Note: This story originally said the researchers were from Cornell University. Their paper was posted on arXiv.org, which is maintained by the Cornell University Library, but they are from Johannes Gutenberg University and University College London.


About Emily Dreibelbis

Reporter

Prior to starting at PCMag, I worked in Big Tech on the West Coast for six years. From that time, I got an up-close view of how software engineering teams work, how good products are launched, and the way business strategies shift over time. After I’d had my fill, I changed course and enrolled in a master’s program for journalism at Northwestern University in Chicago. I'm now a reporter with a focus on electric vehicles and artificial intelligence.
