Evaluating and Improving ChatGPT-Based Expansion of Abbreviations
Yanjie Jiang, Hui Liu, Lu Zhang

TL;DR
This paper empirically evaluates ChatGPT for expanding abbreviations in source code, identifies its limitations, and proposes techniques to improve its accuracy, making it comparable to specialized methods without complex parsing.
Contribution
Presents the first empirical study on LLM-based abbreviation expansion in source code, analyzes the causes of its failures, and proposes three improvements: context selection, iterative abbreviation recognition, and post-checking.
Findings
ChatGPT is less accurate than state-of-the-art methods by ~28%.
Context selection improves abbreviation recognition.
Post-condition checks reduce incorrect expansions.
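To make the post-condition idea concrete, here is a minimal sketch in Python. It assumes one plausible check, not necessarily the one the paper uses: an expansion is accepted only if the letters of the abbreviation appear in it in the same order. The function names `is_plausible_expansion` and `filter_expansions` are hypothetical and exist only for this illustration.

```python
def is_plausible_expansion(abbreviation: str, expansion: str) -> bool:
    """Assumed post-condition: the abbreviation's characters must occur,
    in order, somewhere in the proposed expansion (case-insensitive)."""
    abbr = abbreviation.lower()
    full = expansion.lower()
    if not abbr or len(full) < len(abbr):
        return False
    remaining = iter(full)
    # `ch in remaining` consumes the iterator up to the match, so each
    # character must be found after the previous one.
    return all(ch in remaining for ch in abbr)


def filter_expansions(candidates: dict[str, str]) -> dict[str, str]:
    """Keep only the model-proposed expansions that pass the check."""
    return {
        abbr: exp
        for abbr, exp in candidates.items()
        if is_plausible_expansion(abbr, exp)
    }


if __name__ == "__main__":
    # Hypothetical model output: abbreviation -> proposed expansion.
    proposed = {"msg": "message", "btn": "bottom", "idx": "index"}
    print(filter_expansions(proposed))
```

A check of this kind is cheap and needs no parsing of the source code, but it is deliberately permissive: it rejects expansions that cannot be right (such as `btn` to `bottom`) without confirming that an accepted expansion is the intended one.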
Abstract
Source code identifiers often contain abbreviations. Such abbreviations may reduce the readability of the source code, which in turn hinders the maintenance of software applications. To this end, accurate and automated approaches to expanding abbreviations in source code are desirable, and abbreviation expansion has been intensively investigated. However, to the best of our knowledge, most existing approaches are heuristics, and none of them has employed deep learning techniques, let alone the most advanced large language models (LLMs). LLMs have demonstrated cutting-edge performance in various software engineering tasks, and thus they have the potential to expand abbreviations automatically. In this paper, we therefore present the first empirical study on LLM-based abbreviation expansion. Our evaluation results on a public benchmark suggest that ChatGPT is substantially less…
Taxonomy
Topics: Artificial Intelligence in Healthcare and Education · Topic Modeling
