Towards Exploring the Code Reuse from Stack Overflow during Software Development
Yuan Huang, Furen Xu, Haojie Zhou, Xiangping Chen, Xiaocong Zhou, Tong Wang

TL;DR
This study empirically investigates how developers reuse code snippets from Stack Overflow in open-source Java projects. It finds that reuse is increasing over time, that developer experience influences reuse, and that reuse is higher in bug fixes and in heavily modified classes.
Contribution
It provides the first large-scale empirical analysis of Stack Overflow code reuse in real-world software development, quantifying reuse ratios and identifying influencing factors.
Findings
The average code reuse ratio is 6.32% and has increased over the years.
Experienced developers are more likely to reuse SO snippets.
Reuse ratios are higher in bug-related commits and in heavily modified classes.
Abstract
As one of the most well-known programmer Q&A websites, Stack Overflow (SO) serves tens of thousands of developers every day. Previous work has shown that many developers reuse code snippets from SO when they find an answer that functionally matches a programming problem they encounter during development. To study how programmers reuse code from SO during project development, we conduct a comprehensive empirical study. First, to capture programmers' development activities, we collect 342,148 modified code snippets in commits from 793 open-source Java projects; this modified code reflects the programming problems encountered during development. We also collect the code snippets from 1,355,617 posts on SO. Then, we employ CCFinder to detect code clones between the modified code from commits and the code from SO, and further analyze the…
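The pipeline in the abstract (match modified code from commits against SO snippets, then measure how much of it is reused) can be illustrated with a small sketch. This is not CCFinder and not the authors' implementation. It is a simplified token-window matcher written only for illustration. The function names, the keyword list, the 10-token threshold, and the definition of the reuse ratio as the fraction of modified snippets that share a clone with at least one SO snippet are all assumptions.

```python
import re

# Small, incomplete keyword list (an assumption for this sketch); every other
# identifier is replaced by a placeholder so renamed variables still match.
KEYWORDS = {"int", "for", "if", "else", "return", "while", "new",
            "void", "public", "private", "static", "class"}


def tokenize(code):
    # Split into identifiers, integer literals, and single non-space characters.
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", code)


def normalize(tokens):
    # Abstract away identifier names and numeric values.
    out = []
    for tok in tokens:
        if re.fullmatch(r"[A-Za-z_]\w*", tok) and tok not in KEYWORDS:
            out.append("$id")
        elif tok.isdigit():
            out.append("$num")
        else:
            out.append(tok)
    return out


def shares_clone(a, b, min_tokens=10):
    # True if the two snippets share a run of min_tokens normalized tokens.
    ta, tb = normalize(tokenize(a)), normalize(tokenize(b))
    if len(ta) < min_tokens or len(tb) < min_tokens:
        return False
    windows = {tuple(ta[i:i + min_tokens])
               for i in range(len(ta) - min_tokens + 1)}
    return any(tuple(tb[i:i + min_tokens]) in windows
               for i in range(len(tb) - min_tokens + 1))


def reuse_ratio(modified_snippets, so_snippets, min_tokens=10):
    # Assumed definition: share of modified snippets with a clone in SO.
    if not modified_snippets:
        return 0.0
    reused = sum(
        1 for m in modified_snippets
        if any(shares_clone(m, s, min_tokens) for s in so_snippets)
    )
    return reused / len(modified_snippets)
```

With hypothetical inputs, calling `reuse_ratio(modified, so_posts)` on a list of commit snippets and a list of SO snippets returns a value between 0 and 1. A real clone detector handles far more cases (gapped clones, formatting, scalability across more than a million posts) than this pairwise comparison does.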
Taxonomy
Topics: Software Engineering Research · Software System Performance and Reliability · Scientific Computing and Data Management
