Uncovering Weaknesses in Neural Code Generation

Xiaoli Lian; Shuaisong Wang; Jieping Ma; Fang Liu; Xin Tan; Li Zhang; Lin Shi; Cuiyun Gao

arXiv:2407.09793·cs.SE·July 18, 2024·2 cites

Open Access

TL;DR

This paper systematically evaluates state-of-the-art neural code generation models, identifying key weaknesses such as prompt inaccuracies, missing semantics, and API usage issues, to guide future research improvements.

Contribution

It provides the first comprehensive taxonomy of weaknesses in neural code generation, analyzing multiple models across diverse datasets with detailed thematic insights.

Findings

01 Large models fail in 26.84% of cases due to inaccurate prompts

02 Missing key semantics occurs in over 65% of tasks across datasets

03 All models struggle with proper API usage, especially with vague prompts

Abstract

Code generation, the task of producing source code from prompts, has seen significant advancements with the advent of pre-trained large language models (PLMs). Despite these achievements, a comprehensive taxonomy of weaknesses in the benchmarks and the generated code is still lacking, which risks the community focusing on known issues at the cost of under-explored areas. Our systematic study aims to fill this gap by evaluating five state-of-the-art PLMs: three larger models (CodeGen2.5 with 7 billion parameters, CodeGeeX2 with 6 billion parameters, and GPT-4 Turbo) and two smaller ones (UnixCoder with 110 million parameters and CodeT5 base with 220 million parameters), across three popular datasets: CoNaLa, HumanEval Plus, and DS-1000. We assess the quality of generated code using match-based and execution-based metrics, then conduct thematic analysis to develop a taxonomy of nine types of…
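The abstract distinguishes match-based from execution-based metrics. The sketch below illustrates the difference, assuming a whitespace-normalized exact match as the match-based metric and a simple pass/fail run of assertions as the execution-based one. The helper names `exact_match` and `passes_tests` and the toy `add` task are hypothetical and are not taken from the paper, whose exact metrics are not specified in this excerpt.

```python
def exact_match(generated: str, reference: str) -> bool:
    """Match-based check: compare the two snippets after collapsing whitespace."""
    def normalize(code: str) -> str:
        return " ".join(code.split())
    return normalize(generated) == normalize(reference)


def passes_tests(generated: str, test_code: str) -> bool:
    """Execution-based check: run the generated code, then the assertions.

    Illustrative only: exec() on untrusted model output is unsafe and
    should be sandboxed in any real evaluation harness.
    """
    namespace = {}
    try:
        exec(generated, namespace)   # define the generated function(s)
        exec(test_code, namespace)   # run assertions against them
    except Exception:
        return False
    return True


# Hypothetical task: the generated code differs textually from the
# reference but behaves the same.
reference = "def add(a, b):\n    return a + b"
generated = "def add(x, y):\n    return y + x"
tests = "assert add(2, 3) == 5\nassert add(-1, 1) == 0"

em = exact_match(generated, reference)
ok = passes_tests(generated, tests)
print(f"exact match: {em}, passes tests: {ok}")
```

In this toy case the two metrics disagree: the match-based check rejects a functionally correct answer. This is one reason an evaluation might report both kinds of metric.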

Taxonomy

Topics: Neural Networks and Applications