TL;DR
CoNCRA is a convolutional neural network-based method for semantic code retrieval that improves accuracy in matching code snippets to natural language queries, outperforming previous approaches on Stack Overflow data.
Contribution
This paper introduces CoNCRA, a novel CNN-based approach that emphasizes local word interactions to enhance semantic code search accuracy.
Findings
Improved top-3 retrieval relevance by 80%.
Achieved 5% better performance than state-of-the-art methods.
Effective on Stack Overflow dataset.
Abstract
Software developers routinely search for code using general-purpose search engines. However, these search engines cannot find code semantically unless it has an accompanying description. We propose a technique for semantic code search: A Convolutional Neural Network approach to code retrieval (CoNCRA). Our technique aims to find the code snippet that most closely matches the developer's intent, expressed in natural language. We evaluated our approach's efficacy on a dataset composed of questions and code snippets collected from Stack Overflow. Our preliminary results showed that our technique, which prioritizes local interactions (words nearby), improved the state-of-the-art (SOTA) by 5% on average, retrieving the most relevant code snippets in the top 3 (three) positions by almost 80% of the time. Therefore, our technique is promising and can improve the efficacy of semantic code…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
