Augmenting Machine Learning with Information Retrieval to Recommend Real Cloned Code Methods for Code Completion
Muhammad Hammad, Önder Babur, Hamid Abdul Basit

TL;DR
This paper combines deep learning and information retrieval to improve code clone recommendation accuracy, aiding developers in faster and more reliable code reuse.
Contribution
It introduces a novel IR-enhanced method that refines DeepClone's clone predictions, significantly improving recommendation quality.
Findings
IR technique improves clone recommendation accuracy
Significant enhancement over DeepClone alone
Quantitative evaluation confirms effectiveness
Abstract
Software developers frequently reuse source code from repositories because it saves development time and effort. Code clones accumulated in these repositories hence represent frequently repeated functionality and are candidates for reuse in exploratory or rapid development. In previous work, we introduced DeepClone, a deep neural network model trained by fine-tuning the GPT-2 model on the BigCloneBench dataset to predict code clone methods. The probabilistic nature of DeepClone's output generation can lead to syntax and logic errors that require manual editing of the output before final reuse. In this paper, we propose a novel approach of applying an information retrieval (IR) technique on top of DeepClone's output to recommend real clone methods closely matching the predicted output. We have quantitatively evaluated our strategy, showing that the proposed approach significantly improves the quality…
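The retrieval step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which IR technique the authors use, so this example assumes TF-IDF weighting with cosine similarity. The function names (`tokenize`, `build_index`, `recommend`), the three-method corpus, and the malformed `predicted` string are all hypothetical and are not taken from DeepClone or BigCloneBench. The idea is that a generated method with syntax errors is used only as a query, and the closest real method from the corpus is what gets recommended.

```python
import math
import re
from collections import Counter


def tokenize(code):
    # Split source text into identifiers, numbers, and single punctuation symbols.
    return re.findall(r"[A-Za-z_]\w*|\d+|[^\s\w]", code)


def build_index(corpus):
    # Term counts per method, then smoothed IDF weights over the whole corpus.
    docs = [Counter(tokenize(method)) for method in corpus]
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(doc.keys())
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = [{t: tf * idf[t] for t, tf in doc.items()} for doc in docs]
    return idf, vectors


def cosine(a, b):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(predicted, corpus, top_k=1):
    # Rank real methods by similarity to the (possibly malformed) predicted method.
    idf, vectors = build_index(corpus)
    counts = Counter(tokenize(predicted))
    # Tokens unseen in the corpus carry no weight in this sketch.
    query = {t: tf * idf[t] for t, tf in counts.items() if t in idf}
    scored = sorted(((cosine(query, v), i) for i, v in enumerate(vectors)), reverse=True)
    return [(corpus[i], score) for score, i in scored[:top_k]]


# Hypothetical corpus of real clone methods (illustrative only).
methods = [
    "public static void copyFile(File src, File dst) throws IOException { Files.copy(src.toPath(), dst.toPath()); }",
    "public static int factorial(int n) { return n <= 1 ? 1 : n * factorial(n - 1); }",
    "public static String readAll(Reader r) throws IOException { StringBuilder sb = new StringBuilder(); int c; while ((c = r.read()) != -1) sb.append((char) c); return sb.toString(); }",
]

# Hypothetical model output with a missing comma, parenthesis, and semicolon.
predicted = "public static void copyFile(File src, File dst) throws IOException { Files.copy(src.toPath() dst.toPath() }"

best_method, best_score = recommend(predicted, methods, top_k=1)[0]
print(best_method)
```

Because the recommendation is always drawn from the corpus, it is a method that already exists in a repository, which is the property the abstract relies on to avoid manually repairing generated code. A production system would more likely use an inverted index or a search engine than this linear scan.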
Taxonomy
Topics: Software Engineering Research · Scientific Computing and Data Management · Advanced Malware Detection Techniques
