Accelerating Code Search with Deep Hashing and Code Classification
Wenchao Gu, Yanlin Wang, Lun Du, Hongyu Zhang, Shi Han, Dongmei Zhang, and Michael R. Lyu

TL;DR
This paper introduces CoSHC, a deep hashing and code classification method that significantly accelerates code search processes while maintaining high accuracy, reducing retrieval time by over 90%.
Contribution
The paper presents a novel approach combining deep hashing and code classification to enhance code search efficiency without substantial accuracy loss.
Findings
Over 90% reduction in retrieval time
Maintains at least 99% of original accuracy
Effective across multiple code search models
Abstract
Code search retrieves reusable code snippets from a source code corpus based on natural language queries. Deep learning-based code search methods have shown promising results. However, previous methods focus on retrieval accuracy and pay little attention to the efficiency of the retrieval process. We propose CoSHC, a novel method that accelerates code search with deep hashing and code classification, aiming to perform efficient code search without sacrificing too much accuracy. To evaluate the effectiveness of CoSHC, we apply our method to five code search models. Extensive experimental results indicate that, compared with previous code search baselines, CoSHC can save more than 90% of retrieval time while preserving at least 99% of retrieval accuracy.
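The abstract does not spell out the retrieval pipeline, so the following is only a minimal sketch of the general idea behind hashing-based acceleration: recall a small candidate set cheaply with binary codes and Hamming distance, then re-rank just those candidates with the original embeddings. Everything here is an assumption made for illustration. The random sign projection stands in for a learned deep hash function, the random vectors stand in for the embeddings of a real code search model, and the names `to_hash`, `search`, `recall_k`, and `top_k` are hypothetical. The code classification component is not modeled.

```python
import numpy as np

def to_hash(emb, proj):
    """Binarize embeddings with a random sign projection.

    Assumption: this is a simple stand-in for a learned deep hashing model.
    """
    return (emb @ proj > 0).astype(np.uint8)

def search(query_emb, code_embs, code_hashes, proj, recall_k=50, top_k=5):
    # Stage 1: cheap recall by Hamming distance over the binary codes.
    q_hash = to_hash(query_emb[None, :], proj)[0]
    hamming = np.count_nonzero(code_hashes != q_hash, axis=1)
    candidates = np.argsort(hamming, kind="stable")[:recall_k]

    # Stage 2: re-rank only the recalled candidates by cosine similarity.
    cand = code_embs[candidates]
    sims = cand @ query_emb / (
        np.linalg.norm(cand, axis=1) * np.linalg.norm(query_emb)
    )
    order = np.argsort(-sims)[:top_k]
    return candidates[order]

# Toy data: random vectors in place of real code and query embeddings.
rng = np.random.default_rng(0)
dim, bits, n = 64, 128, 1000
code_embs = rng.normal(size=(n, dim))
proj = rng.normal(size=(dim, bits))
code_hashes = to_hash(code_embs, proj)

# A query that is a slightly perturbed copy of snippet 42.
query = code_embs[42] + 0.05 * rng.normal(size=dim)
result = search(query, code_embs, code_hashes, proj)
```

In this toy setup the expensive similarity computation runs over `recall_k` candidates instead of all `n` snippets, which is where the time saving in such a two-stage design would come from. How much accuracy survives depends on how well the hash codes preserve the ranking of the original embeddings.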
Taxonomy
Topics: Algorithms and Data Compression · Web Data Mining and Analysis · Advanced Image and Video Retrieval Techniques
