Hyperparameter Optimization for AST Differencing
Matias Martinez, Jean-Rémy Falleri, Martin Monperrus

TL;DR
This paper introduces DAT (Diff Auto Tuning), a data-driven method for tuning the hyperparameters of AST differencing algorithms. In the evaluation, the configuration DAT finds for GumTree improves the edit-scripts in 21.8% of the evaluated cases.
Contribution
The paper states the hyperparameter configuration problem for AST differencing and presents DAT, a novel hyperparameter optimization approach for AST differencing algorithms such as GumTree.
Findings
The GumTree configuration found by DAT improves the edit-scripts in 21.8% of the evaluated cases
Optimized configurations outperform default settings
DAT is evaluated on GumTree in different scenarios
Abstract
Computing the differences between two versions of the same program is an essential task for software development and software evolution research. AST differencing is the most advanced way of doing so, and an active research area. Yet, AST differencing algorithms rely on configuration parameters that may have a strong impact on their effectiveness. In this paper, we present a novel approach named DAT (Diff Auto Tuning) for hyperparameter optimization of AST differencing. We thoroughly state the problem of hyper-configuration for AST differencing. We evaluate our data-driven approach DAT to optimize the edit-scripts generated by the state-of-the-art AST differencing algorithm named GumTree in different scenarios. DAT is able to find a new configuration for GumTree that improves the edit-scripts in 21.8% of the evaluated cases.
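To make the idea of data-driven tuning concrete, the following is a minimal sketch of an exhaustive grid search over a differ's configuration space. It is not the authors' implementation. The parameter names `min_height` and `sim_threshold_pct`, the function `toy_edit_script_size`, and the toy data are all hypothetical stand-ins; the sketch also assumes that a shorter edit-script is the better one, and in practice the size function would run a real AST differencing tool on each file pair.

```python
from itertools import product
from typing import Callable, Dict, Iterator, List, Sequence, Tuple

Config = Dict[str, int]
Pair = Tuple[int, int]  # toy stand-in for a (before, after) file pair


def grid(space: Dict[str, Sequence[int]]) -> Iterator[Config]:
    """Enumerate every configuration in the (hypothetical) search space."""
    names = sorted(space)
    for values in product(*(space[name] for name in names)):
        yield dict(zip(names, values))


def tune(
    pairs: List[Pair],
    space: Dict[str, Sequence[int]],
    size: Callable[[Pair, Config], int],
    default: Config,
) -> Tuple[Config, float]:
    """Pick the configuration with the smallest total edit-script size.

    Also returns the fraction of pairs whose edit-script is strictly
    shorter than with the default configuration.
    """
    best = min(grid(space), key=lambda cfg: sum(size(p, cfg) for p in pairs))
    improved = sum(size(p, best) < size(p, default) for p in pairs)
    return best, improved / len(pairs)


def toy_edit_script_size(pair: Pair, cfg: Config) -> int:
    """Hypothetical cost model: each pair has an 'ideal' setting, and the
    edit-script grows as the configuration moves away from it."""
    ideal_height, ideal_sim = pair
    return (
        10
        + abs(cfg["min_height"] - ideal_height)
        + abs(cfg["sim_threshold_pct"] - ideal_sim) // 10
    )


if __name__ == "__main__":
    pairs = [(1, 50), (2, 50), (3, 30), (2, 70)]
    space = {"min_height": [1, 2, 3], "sim_threshold_pct": [30, 50, 70]}
    default = {"min_height": 1, "sim_threshold_pct": 30}
    best, fraction = tune(pairs, space, toy_edit_script_size, default)
    print(best, fraction)
```

Grid search is used here only because it is the simplest strategy to read; any other search strategy could be substituted for `grid` without changing `tune`. Note that a configuration tuned on a whole dataset can still make some individual pairs worse, which is why the sketch reports the fraction of improved cases rather than assuming a uniform gain.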
Taxonomy
Topics: Cloud Computing and Resource Management · Machine Learning and Data Classification
