On Using Stack Overflow Comment-Edit Pairs to Recommend Code Maintenance Changes
Henry Tang, Sarah Nadi

TL;DR
This paper explores using Stack Overflow comment-edit pairs as a new data source for code maintenance tasks, demonstrating their potential usefulness and providing a method to extract and categorize these pairs.
Contribution
It introduces a technique to extract and analyze Stack Overflow comment-edit pairs, showing their potential for supporting code maintenance applications.
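To make the idea of a comment-edit pair concrete, here is a minimal sketch of how a comment might be matched to a later edit of the same answer. It is an illustration under stated assumptions, not the authors' implementation: the `Comment` and `Edit` types, the backtick-based `code_terms` heuristic, and the matching rule (the first later edit that adds or removes a code term the comment mentions) are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
import re


@dataclass
class Comment:
    text: str
    created: datetime


@dataclass
class Edit:
    before: str  # answer code before the edit
    after: str   # answer code after the edit
    created: datetime


def code_terms(text):
    """Return the backtick-quoted fragments of a comment (assumed heuristic)."""
    return set(re.findall(r"`([^`]+)`", text))


def pair_comments_with_edits(comments, edits):
    """Pair each comment with the first later edit that adds or removes
    a code term mentioned in the comment (assumed matching rule)."""
    ordered_edits = sorted(edits, key=lambda e: e.created)
    pairs = []
    for comment in comments:
        terms = code_terms(comment.text)
        for edit in ordered_edits:
            if edit.created <= comment.created:
                continue  # the edit cannot be a response to a later comment
            # A term present in only one of the two versions was added or removed.
            if any((t in edit.before) != (t in edit.after) for t in terms):
                pairs.append((comment, edit))
                break
    return pairs
```

Each resulting pair carries a before version, an after version, and the comment text as a natural-language description of the change, which is the property the paper highlights for this data source.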
Findings
The majority of pairs are not tangled
27% of pairs are potentially useful
Accepted pull requests on GitHub demonstrate practical utility
Abstract
Code maintenance data sets typically consist of a before and after version of the code that contains the improvement or fix. Such data sets are important for software engineering support tools related to code maintenance, such as program repair, code recommender systems, or Application Programming Interface (API) misuse detection. Most of the current data sets are constructed from mining commit history in version-control systems or issues in issue-tracking systems. In this paper, we investigate whether Stack Overflow can be used as an additional data source. Comments on Stack Overflow provide an effective way for developers to point out problems with existing answers, alternative solutions, or pitfalls. In this paper, we mine comment-edit pairs from Stack Overflow and investigate their potential usefulness. These pairs have the added benefit of having concrete descriptions of why the…
Taxonomy
Topics: Software Engineering Research · Software Reliability and Analysis Research · Web Application Security Vulnerabilities
