Code Search: A Survey of Techniques for Finding Code
Luca Di Grazia, Michael Pradel

TL;DR
This survey comprehensively reviews 30 years of research on code search techniques, covering query types, indexing, retrieval, ranking, and practical studies, highlighting ongoing challenges and future opportunities.
Contribution
It provides a detailed overview of existing code search methods, challenges, and empirical insights, offering a foundation for future research directions.
Findings
Code search engines support various kinds of queries
A range of techniques exists for indexing and retrieving code effectively
Empirical studies reveal practical challenges in code search
Abstract
The immense amounts of source code provide ample challenges and opportunities during software development. To handle the size of code bases, developers commonly search for code, e.g., when trying to find where a particular feature is implemented or when looking for code examples to reuse. To support developers in finding relevant code, various code search engines have been proposed. This article surveys 30 years of research on code search, giving a comprehensive overview of challenges and techniques that address them. We discuss the kinds of queries that code search engines support, how to preprocess and expand queries, different techniques for indexing and retrieving code, and ways to rank and prune search results. Moreover, we describe empirical studies of code search in practice. Based on the discussion of prior work, we conclude the article with an outline of challenges and…
Taxonomy
Topics: Software Engineering Research · Software Engineering Techniques and Practices · Software System Performance and Reliability
