Zero-shot Evaluation of Deep Learning for Java Code Clone Detection
Thomas S. Heinze

TL;DR
This paper evaluates how well deep learning models detect Java code clones in zero-shot scenarios. It finds that their generalizability to unseen code is limited and that the conventional clone detector NiCad can outperform them.
Contribution
It provides a zero-shot evaluation of five state-of-the-art DL-based clone detectors for Java, highlighting their limitations relative to conventional tools.
Findings
DL models have limited generalizability to unseen code.
NiCad outperforms DL models in zero-shot scenarios.
Transformers like CodeBERT do not significantly outperform traditional tools.
Abstract
Deep Learning (DL) is becoming more and more widespread in clone detection, motivated by achieving near-perfect performance for this task. In particular in case of semantic code clones, which share only limited syntax but implement the same or similar functionality, Deep Learning appears to outperform conventional tools. In this paper, we want to investigate the generalizability of DL-based clone detectors for Java. We therefore replicate and evaluate the performance of five state-of-the-art DL-based clone detectors, including Transformers like CodeBERT and single-task models like FA-AST+GMN, in a zero-shot evaluation scenario, where we train/fine-tune and evaluate on different datasets and functionalities. Our experiments demonstrate that the models' generalizability to unseen code is limited. Further analysis reveals that the conventional clone detector NiCad even outperforms the…
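The zero-shot setup described in the abstract, training on some functionalities and evaluating on others, can be illustrated with a minimal sketch. This is not the paper's pipeline: the tuple layout `(code_a, code_b, is_clone, functionality_id)` and the helper names `zero_shot_split` and `precision_recall_f1` are assumptions introduced here for illustration only.

```python
def zero_shot_split(pairs, held_out):
    """Split labelled clone pairs so that held-out functionalities
    never appear in the training set (hypothetical helper).

    pairs:    iterable of (code_a, code_b, is_clone, functionality_id)
    held_out: set of functionality ids reserved for evaluation
    """
    train, test = [], []
    for pair in pairs:
        functionality_id = pair[3]
        if functionality_id in held_out:
            test.append(pair)
        else:
            train.append(pair)
    return train, test


def precision_recall_f1(labels, predictions):
    """Compute precision, recall and F1 for binary clone predictions."""
    tp = sum(1 for y, p in zip(labels, predictions) if y and p)
    fp = sum(1 for y, p in zip(labels, predictions) if not y and p)
    fn = sum(1 for y, p in zip(labels, predictions) if y and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy data: functionality ids 2 and 3 are unseen during training.
pairs = [
    ("int a(){...}", "int b(){...}", True, 1),
    ("int c(){...}", "int d(){...}", False, 1),
    ("int e(){...}", "int f(){...}", True, 2),
    ("int g(){...}", "int h(){...}", False, 3),
]
train, test = zero_shot_split(pairs, held_out={2, 3})
```

A detector would be trained or fine-tuned on `train` only and scored on `test` with `precision_recall_f1`; any overlap in functionality between the two sets would turn the setup back into a conventional in-distribution evaluation.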
