Clone-Seeker: Effective Code Clone Search Using Annotations
Muhammad Hammad, Önder Babur, Hamid Abdul Basit, Mark van den Brand

TL;DR
Clone-Seeker is a novel code clone search method that leverages semantic metadata and annotations to improve recall and accuracy in retrieving similar code fragments using natural language or code queries.
Contribution
It introduces a metadata-based approach utilizing annotations and keywords for effective code clone retrieval, especially for semantic clones, outperforming existing methods.
Findings
Higher recall for semantic code clones in BigCloneBench
Accurate retrieval with natural language queries
Outperforms state-of-the-art in clone search
Abstract
Source code search plays an important role in software development, e.g. for exploratory development or opportunistic reuse of existing code from a code base. Often, exploration of different implementations with the same functionality is needed for tasks like automated software transplantation, software diversification, and software repair. Code clones, which are syntactically or semantically similar code fragments, are perfect candidates for such tasks. Searching for code clones involves a given search query to retrieve the relevant code fragments. We propose a novel approach called Clone-Seeker that focuses on utilizing clone class features in retrieving code clones. For this purpose, we generate metadata for each code clone in the form of a natural language document. The metadata includes a pre-processed list of identifiers from the code clones augmented with a list of keywords…
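The metadata idea described in the abstract can be illustrated with a small sketch. The abstract is truncated here, so the details below are assumptions rather than the authors' pipeline: the function names (`split_identifier`, `build_metadata`, `search`), the reduced Java keyword list, the sample fragments and their keyword annotations are all hypothetical, and Jaccard overlap is used only as a simple stand-in for whatever retrieval model the paper actually applies.

```python
import re

# Assumption: a small illustrative subset of Java reserved words to discard.
JAVA_KEYWORDS = {"public", "private", "static", "void", "int",
                 "return", "for", "if", "else", "new", "class"}


def split_identifier(identifier):
    """Split a camelCase or snake_case identifier into lowercase words."""
    words = []
    for part in identifier.split("_"):
        # Matches acronyms (HTTP), capitalized or lowercase words, and digit runs.
        words.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", part))
    return [w.lower() for w in words]


def build_metadata(code, keywords=()):
    """Build a bag-of-words 'document' from identifiers plus annotation keywords."""
    tokens = []
    for ident in re.findall(r"[A-Za-z_][A-Za-z0-9_]*", code):
        if ident in JAVA_KEYWORDS:
            continue
        # Drop one-character words such as loop counters.
        tokens.extend(w for w in split_identifier(ident) if len(w) > 1)
    return tokens + [k.lower() for k in keywords]


def search(query, index):
    """Rank fragment ids by Jaccard overlap with the query (stand-in scoring)."""
    query_words = set(build_metadata(query))
    results = []
    for fragment_id, metadata in index.items():
        doc_words = set(metadata)
        union = query_words | doc_words
        score = len(query_words & doc_words) / len(union) if union else 0.0
        if score > 0:
            results.append((fragment_id, score))
    return sorted(results, key=lambda item: item[1], reverse=True)


# Hypothetical fragments and hand-written keyword annotations.
fragments = {
    "copy": "public static void copyFile(File srcPath, File dstPath)",
    "sort": "public void bubbleSort(int[] arr)",
}
annotations = {"copy": ["copy", "file"], "sort": ["sort", "array"]}

index = {fid: build_metadata(code, annotations[fid])
         for fid, code in fragments.items()}
results = search("copy a file", index)
```

Because the query passes through the same pre-processing as the code, the same `search` call accepts either a natural language phrase or a code snippet, which mirrors the two query types mentioned in the TL;DR.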
Taxonomy
Topics: Software Engineering Research · Software System Performance and Reliability · Software Engineering Techniques and Practices
