A Differential Testing Approach for Evaluating Abstract Syntax Tree Mapping Algorithms
Yuanrui Fan, Xin Xia, David Lo, Ahmed E. Hassan, Yuan Wang, Shanping Li

TL;DR
This paper introduces a hierarchical differential testing approach to evaluate the accuracy of AST mapping algorithms, revealing significant inaccuracies in widely used algorithms through expert validation and large-scale analysis.
Contribution
It proposes a novel hierarchical comparison method for AST mappings and provides the first large-scale evaluation of popular algorithms' accuracy.
Findings
High precision (0.98-1.00) in expert validation
Significant inaccuracy rates (20-36%) in existing algorithms
Demonstrates need for improving AST mapping algorithms
Abstract
Abstract syntax tree (AST) mapping algorithms are widely used to analyze changes in source code. Despite the foundational role of AST mapping algorithms, little effort has been made to evaluate their accuracy, i.e., the extent to which an algorithm captures the evolution of code. We observe that a program element often has only one best-mapped program element. Based on this observation, we propose a hierarchical approach to automatically compare the similarity of mapped statements and tokens by different algorithms. By performing the comparison, we determine if each of the compared algorithms generates inaccurate mappings for a statement or its tokens. We invite 12 external experts to determine if three commonly used AST mapping algorithms generate accurate mappings for a statement and its tokens for 200 statements. Based on the experts' feedback, we observe that…
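The comparison idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes each algorithm's output has already been reduced to the token list of the statement it maps a source statement to, and it uses Python's `difflib.SequenceMatcher` as a stand-in similarity measure. The function names `token_similarity` and `flag_inaccurate` and the algorithm labels `A` and `B` are hypothetical. Only the statement level is shown; the hierarchical approach described above would repeat a similar comparison for the tokens inside each statement.

```python
from difflib import SequenceMatcher


def token_similarity(src_tokens, dst_tokens):
    """Similarity in [0, 1] between two token sequences (stand-in measure)."""
    return SequenceMatcher(None, src_tokens, dst_tokens).ratio()


def flag_inaccurate(src_tokens, mapped_by_algorithm):
    """Return the algorithms whose mapped statement is less similar to the
    source statement than the best mapping found by any compared algorithm.

    mapped_by_algorithm: dict of algorithm name -> token list of the mapped
    statement, or None if that algorithm left the statement unmapped.
    """
    if not mapped_by_algorithm:
        return set()
    scores = {
        name: token_similarity(src_tokens, dst) if dst is not None else 0.0
        for name, dst in mapped_by_algorithm.items()
    }
    best = max(scores.values())
    # If a statement has one best-mapped counterpart, any algorithm that
    # picked a strictly less similar counterpart is suspected inaccurate.
    return {name for name, score in scores.items() if score < best}


# Hypothetical example: two algorithms map the same source statement
# to different statements in the revised file.
src = ["int", "x", "=", "a", "+", "b", ";"]
mappings = {
    "A": ["int", "x", "=", "a", "+", "c", ";"],
    "B": ["return", "x", ";"],
}
suspects = flag_inaccurate(src, mappings)
```

Here `suspects` contains only `B`, because its mapped statement shares fewer tokens with the source statement than the one chosen by `A`. When every algorithm agrees on the same mapping, the function returns an empty set, which means a differential check of this kind cannot detect errors that all compared algorithms share.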
Taxonomy
Topics: Software Engineering Research · Software Testing and Debugging Techniques · Advanced Malware Detection Techniques
