An Empirical Study of Java Code Improvements Based on Stack Overflow Answer Edits
In-on Wiratsin, Chaiyong Ragkhitwetsagul, Matheus Paixao, Denis De Sousa, Pongpop Lapvikai, Peter Haddawy

TL;DR
This study analyzes how edits to Java answers on Stack Overflow can be used to improve open-source Java code, finding that nearly half of the answer snippets are applicable for code improvement and that 11 proposed bug fixes were accepted by project maintainers.
Contribution
It introduces an empirical approach to leverage Stack Overflow answer edits for automatic code improvement in open-source Java projects.
Findings
6.91% of SO Java answers have multiple revisions
49.24% of answer snippets are applicable for code improvement
11 proposed bug fixes were accepted by project maintainers
Abstract
Suboptimal code is prevalent in software systems. Developers often write low-quality code due to factors like technical knowledge gaps, insufficient experience, time pressure, management decisions, or personal factors. Once integrated, the accumulation of this suboptimal code leads to significant maintenance costs and technical debt. Developers frequently consult external knowledge bases, such as API documentation and Q&A websites like Stack Overflow (SO), to aid their programming tasks. SO's crowdsourced, collaborative nature has created a vast repository of programming knowledge. Its community-curated content is constantly evolving, with new answers posted or existing ones edited. In this paper, we present an empirical study of SO Java answer edits and their application to improving code in open-source projects. We use a modified code clone search tool to analyze SO code snippets…
Taxonomy
Topics: Software Engineering Research · Software Engineering Techniques and Practices · Spreadsheets and End-User Computing
