Towards Informative Tagging of Code Fragments to Support the Investigation of Code Clones
Daisuke Nishioka, Toshihiro Kamiya

TL;DR
This paper introduces a method for clustering code clone fragments based on topic similarity and assigning informative tags to facilitate easier investigation of clones, demonstrated on open source OS packages.
Contribution
It takes clone classes detected by syntactic similarity, clusters the code fragments in each class by topic similarity, and assigns a short tag to each cluster to make clone investigation more efficient.
Findings
Code fragments in a clone class are clustered by topic similarity, treating each fragment as a sequence of words.
Each resulting cluster is assigned a short tag.
An experiment applies the method to packages of an open source operating system.
Abstract
Investigating the code fragments of code clones detected by code clone detection tools is a time-consuming task, especially when a large number of reference source files are available. This paper proposes (i) a method for clustering a clone class, which is detected by code clone detection tools using syntactic similarity, based on topic similarity by considering its code fragments as sequences of words and (ii) a method for assigning short tags to the resulting clusters. We also report an experiment applying the proposed method to packages of an open source operating system.
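The two steps in the abstract (cluster the fragments of one clone class by word-level similarity, then tag each cluster) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name a topic model, so TF-IDF weighting with cosine similarity stands in for "topic similarity" here, and the greedy threshold clustering, the 0.3 threshold, the function names, and the sample fragments are all assumptions made for illustration.

```python
import math
import re
from collections import defaultdict, Counter


def words(fragment):
    """Treat a code fragment as a sequence of lowercase words.

    Identifiers are split on non-letters and on camelCase boundaries.
    """
    out = []
    for token in re.findall(r"[A-Za-z]+", fragment):
        parts = re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])", token)
        out.extend(p.lower() for p in parts)
    return out


def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (dict) per word sequence."""
    n = len(docs)
    df = Counter(w for d in docs for w in set(d))
    vecs = []
    for d in docs:
        tf = Counter(d)
        # Smoothed IDF: words found in every fragment get weight 1.0 per use.
        vecs.append({w: c * (math.log((1 + n) / (1 + df[w])) + 1.0)
                     for w, c in tf.items()})
    return vecs


def cosine(a, b):
    dot = sum(v * b.get(w, 0.0) for w, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def cluster(vecs, threshold=0.3):
    """Greedy clustering (assumed, not from the paper): a fragment joins the
    first cluster containing a member at least `threshold` similar to it."""
    clusters = []
    for i, v in enumerate(vecs):
        for members in clusters:
            if any(cosine(v, vecs[j]) >= threshold for j in members):
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


def tag(vecs, members, k=2):
    """Tag a cluster with its k highest-weighted words (ties: alphabetical)."""
    total = defaultdict(float)
    for i in members:
        for w, v in vecs[i].items():
            total[w] += v
    return [w for w, _ in sorted(total.items(), key=lambda x: (-x[1], x[0]))[:k]]


# Hypothetical fragments standing in for one syntactically detected clone class.
fragments = [
    "int read_file(char *path) { FILE *fp = fopen(path); read(fp); }",
    "int readFile(char *name) { FILE *fp = fopen(name); read(fp); }",
    "int send_packet(int socket, char *buf) { send(socket, buf); }",
    "int sendPacket(int sock, char *buf) { send(sock, buf); }",
]
vecs = tfidf_vectors([words(f) for f in fragments])
clusters = cluster(vecs)
tags = [tag(vecs, members) for members in clusters]
for members, t in zip(clusters, tags):
    print(members, t)
```

On these sample fragments the file-reading pair and the packet-sending pair fall into separate clusters, and each cluster's tag is drawn from the words that dominate it. A real pipeline would take its fragments from a clone detector's output and would likely need a stop-word list for language keywords and very short identifiers.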
Taxonomy
Topics: Software Engineering Research · Software Reliability and Analysis Research · Advanced Malware Detection Techniques
