Augmenting the Interpretability of GraphCodeBERT for Code Similarity Tasks
Jorge Martinez-Gil

TL;DR
This paper enhances the interpretability of GraphCodeBERT for code similarity tasks by enabling clearer identification of semantic relationships, aiding developers in understanding and trusting similarity assessments.
Contribution
It introduces a method to improve the transparency of code similarity detection using GraphCodeBERT, balancing semantic accuracy with interpretability.
Findings
Improved interpretability of code similarity results.
Enhanced understanding of semantic relationships in code.
Open-source implementation available.
Abstract
Assessing the degree of similarity between code fragments is crucial for ensuring software quality, but it remains challenging because it requires capturing the deeper semantic aspects of code. Traditional syntactic methods often fail to identify these semantic connections. Recent advances address this challenge but frequently sacrifice interpretability. We present an approach that improves the transparency of similarity assessment by using GraphCodeBERT, which enables the identification of semantic relationships between code fragments. The approach identifies similar code fragments and clarifies the reasons behind that identification, helping developers better understand and trust the results. The source code for our implementation is available at https://www.github.com/jorge-martinez-gil/graphcodebert-interpretability.
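To make the idea of an explainable similarity score concrete, the following is a minimal sketch of one possible way to attribute a cosine similarity to individual token pairs. It is not the paper's method. It assumes each fragment is represented by mean-pooled token embeddings, and the small 2-D vectors and token names below are hypothetical stand-ins for embeddings that a model such as GraphCodeBERT would produce. Under that assumption, the cosine similarity of the pooled vectors decomposes exactly into a sum of per-token-pair contributions.

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def mean_pool(vectors):
    # Average the token vectors component-wise into one fragment vector.
    n = len(vectors)
    return [sum(v[k] for v in vectors) / n for k in range(len(vectors[0]))]

def cosine_similarity(u, v):
    return dot(u, v) / (norm(u) * norm(v))

def pair_contributions(tokens_a, vecs_a, tokens_b, vecs_b):
    # cos(mean(A), mean(B)) = sum_ij (a_i . b_j) / (n * m * |mean(A)| * |mean(B)|),
    # so each token pair (i, j) receives one additive share of the score.
    scale = (len(vecs_a) * len(vecs_b)
             * norm(mean_pool(vecs_a)) * norm(mean_pool(vecs_b)))
    return [
        (ta, tb, dot(va, vb) / scale)
        for ta, va in zip(tokens_a, vecs_a)
        for tb, vb in zip(tokens_b, vecs_b)
    ]

# Hypothetical toy embeddings (assumption: real ones would come from a model).
tokens_a = ["for", "i", "sum"]
vecs_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
tokens_b = ["while", "total"]
vecs_b = [[1.0, 0.0], [1.0, 1.0]]

score = cosine_similarity(mean_pool(vecs_a), mean_pool(vecs_b))
contributions = pair_contributions(tokens_a, vecs_a, tokens_b, vecs_b)

# Rank token pairs by how much they contribute to the overall score.
ranked = sorted(contributions, key=lambda c: c[2], reverse=True)
for ta, tb, c in ranked:
    print(f"{ta:>5} ~ {tb:<5} {c:+.3f}")
print(f"similarity = {score:.3f}")
```

Because the contributions sum to the overall score, the ranked list gives a developer a direct view of which token pairs drive a similarity judgment. Contextual embeddings mix information across tokens, so such attributions are best read as a rough guide rather than a causal explanation.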
Taxonomy
Topics: Software Engineering Research · Software System Performance and Reliability · Software Testing and Debugging Techniques
