A Systematic Review of Automated Query Reformulations in Source Code Search
Mohammad Masudur Rahman, Chanchal K. Roy

TL;DR
This paper systematically reviews 70 studies on automated query reformulation techniques in source code search, highlighting methodologies, limitations, and future research directions to improve bug fixing and feature addition tasks.
Contribution
It provides a comprehensive analysis of existing query reformulation methods, identifies key limitations, and suggests best practices and future opportunities for research advancement.
Findings
Eight major methodologies identified for query reformulation
Major limitations include lack of generalizability and vocabulary mismatch
Discussion of best practices and future research opportunities
Abstract
Fixing software bugs and adding new features are two of the major maintenance tasks. Software bugs and features are reported as change requests. Developers consult these requests and often choose a few keywords from them as an ad hoc query. Then they execute the query with a search engine to find the exact locations within software code that need to be changed. Unfortunately, even experienced developers often fail to choose appropriate queries, which leads to costly trial and error during a code search. Over the years, many studies have attempted to reformulate the ad hoc queries from developers to support them. In this systematic literature review, we carefully select 70 primary studies on query reformulations from 2,970 candidate studies, perform an in-depth qualitative analysis (e.g., Grounded Theory), and then answer seven research questions with major findings. First, to date, eight…
Taxonomy
Topics: Software Engineering Research · Software Reliability and Analysis Research · Software System Performance and Reliability
Methods: High-Order Consensuses
