Beyond Functional Correctness: Investigating Coding Style Inconsistencies in Large Language Models
Yanlin Wang, Tianyue Jiang, Mingwei Liu, Jiachi Chen, Mingzhi Mao, Xilin Liu, Yuchi Ma, Zibin Zheng

TL;DR
This paper investigates coding style differences between large language models and human programmers, revealing inconsistencies and proposing solutions to improve code quality and consistency.
Contribution
It provides an empirical analysis of coding style inconsistencies in LLM-generated code and offers a taxonomy and potential solutions for addressing these issues.
Findings
LLMs and humans exhibit different coding styles.
Coding style inconsistencies affect readability, conciseness, and robustness.
Proposed solutions can mitigate style inconsistencies.
Abstract
Large language models (LLMs) have brought a paradigm shift to the field of code generation, offering the potential to enhance the software development process. However, previous research mainly focuses on the accuracy of code generation, while coding style differences between LLMs and human developers remain under-explored. In this paper, we empirically analyze the differences in coding style between the code generated by mainstream Code LLMs and the code written by human developers, and summarize a taxonomy of coding style inconsistencies. Specifically, we first summarize the types of coding style inconsistencies by manually analyzing a large number of generation results. We then compare the code generated by Code LLMs with the code written by human programmers in terms of readability, conciseness, and robustness. The results reveal that LLMs and developers have different coding styles.…
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling
