Towards Understanding the Characteristics of Code Generation Errors Made by Large Language Models

Zhijie Wang; Zijie Zhou; Da Song; Yuheng Huang; Shengmai Chen; Lei Ma; Tianyi Zhang

arXiv:2406.08731 · cs.SE · February 14, 2025 · 2 cites

Open Access

TL;DR

This paper analyzes code generation errors made by six representative large language models on the HumanEval dataset. It derives a taxonomy of these errors, characterizes their semantic and syntactic properties and root causes, and identifies challenges in locating and fixing them.

Contribution

The paper introduces a taxonomy of code generation errors, distilled through open coding and thematic analysis, and compares error characteristics across six LLMs, addressing a gap in the understanding of their failure modes.

Findings

01 LLMs often produce multi-line, non-trivial errors

02 Error frequency correlates with task complexity and pass rate

03 Locating and fixing errors remains challenging

Abstract

Large Language Models (LLMs) have demonstrated unprecedented capabilities in code generation. However, there remains a limited understanding of code generation errors that LLMs can produce. To bridge the gap, we conducted an in-depth analysis of code generation errors across six representative LLMs on the HumanEval dataset. Specifically, we first employed open coding and thematic analysis to distill a comprehensive taxonomy of code generation errors. We analyzed two dimensions of error characteristics -- semantic characteristics and syntactic characteristics. Our analysis revealed that LLMs often made non-trivial, multi-line code generation errors in various locations and with various root causes. We further analyzed the correlation between these errors and task complexity as well as test pass rate. Our findings highlighted several challenges in locating and fixing code generation…

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Software Engineering Research