From Query to Usable Code: An Analysis of Stack Overflow Code Snippets
Di Yang, Aftab Hussain, Cristina Lopes

TL;DR
This paper evaluates the usability of Stack Overflow code snippets in four languages (C#, Java, JavaScript, and Python), with a view to building automated code generation tools, and measures how often the top text-search results over the surrounding questions and answers contain usable snippets.
Contribution
The paper analyzes the usability of roughly 3M snippets across four languages and examines whether the natural language text around a snippet can lead automated tools to usable code.
Findings
Python and JavaScript have the highest proportion of usable snippets.
Java and C# have the lowest proportion of usable snippets.
Usable snippets often have specific characteristics in answers.
Abstract
Enriched by natural language texts, Stack Overflow code snippets are an invaluable code-centric knowledge base of small units of source code. Besides being useful for software developers, these annotated snippets can potentially serve as the basis for automated tools that provide working code solutions to specific natural language queries. With the goal of developing automated tools with the Stack Overflow snippets and surrounding text, this paper investigates the following questions: (1) How usable are the Stack Overflow code snippets? and (2) When using text search engines for matching on the natural language questions and answers around the snippets, what percentage of the top results contain usable code snippets? A total of 3M code snippets are analyzed across four languages: C#, Java, JavaScript, and Python. Python and JavaScript proved to be the languages for which the most…
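As a rough illustration of question (1), the sketch below estimates a "usable rate" for a handful of Python snippets. It assumes, purely for illustration, that a snippet counts as usable when it parses under the running Python interpreter; the abstract as quoted here does not spell out the paper's usability criteria, and the names `is_parsable`, `parsable_rate`, and `examples` are hypothetical.

```python
import ast


def is_parsable(snippet: str) -> bool:
    """Return True if the snippet parses as Python source.

    Assumption: parsability under the running interpreter stands in
    for "usable"; the paper's own criteria may differ.
    """
    try:
        ast.parse(snippet)
        return True
    except (SyntaxError, ValueError):
        # SyntaxError also covers IndentationError; ValueError can be
        # raised for source containing null bytes on some versions.
        return False


def parsable_rate(snippets) -> float:
    """Fraction of snippets that parse; 0.0 for an empty collection."""
    snippets = list(snippets)
    if not snippets:
        return 0.0
    return sum(is_parsable(s) for s in snippets) / len(snippets)


# Hypothetical sample data, not taken from the paper's dataset.
examples = [
    "x = [i * i for i in range(5)]",  # valid Python 3
    "print 'hello'",                  # Python 2 syntax, rejected by Python 3
    "def f(:\n    pass",              # malformed fragment
]

if __name__ == "__main__":
    print(f"parsable rate: {parsable_rate(examples):.2f}")
```

A parse-only check is deliberately cheap: it needs no execution sandbox, which matters when scanning millions of snippets, but it says nothing about whether the code runs correctly.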
Taxonomy
Topics: Software Engineering Research · Software Engineering Techniques and Practices · Software Testing and Debugging Techniques
