Data-Driven Extract Method Recommendations: A Study at ING
David van der Leij, Jasper Binda, Robbert van Dalen, Pieter Vallen, Yaping Luo, Maurício Aniche

TL;DR
This study evaluates machine learning models for recommending Extract Method refactorings in ING's software. It compares the code metric distributions of ING and open-source systems, measures the accuracy of the models, and checks their recommendations against the opinions of ING experts, reporting high accuracy and general expert agreement.
Contribution
It presents an empirical analysis of ML-based refactoring recommendations in a large financial organization, highlighting differences in code metrics and validating model effectiveness.
Findings
Models recommend Extract Method refactorings with high accuracy.
Experts generally agree with the model recommendations.
Code metrics distributions differ between ING and open-source systems.
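One way to make the last finding concrete is to quantify how far apart two samples of a single code metric are. The sketch below computes the two-sample Kolmogorov-Smirnov statistic, the largest gap between the two empirical distribution functions. This is an illustrative assumption: the summary here does not say which statistical procedure the study used, and the function name, variable names, and metric values are hypothetical, not data from the paper.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Return the two-sample Kolmogorov-Smirnov statistic.

    The statistic is the largest absolute difference between the
    empirical CDFs of the two samples: 0.0 means the empirical
    distributions coincide, 1.0 means the samples do not overlap at all.
    """
    a = sorted(sample_a)
    b = sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")

    max_gap = 0.0
    # The empirical CDFs only change at observed values, so it is
    # enough to evaluate the gap at every distinct observation.
    for x in set(a) | set(b):
        cdf_a = bisect_right(a, x) / len(a)  # share of a that is <= x
        cdf_b = bisect_right(b, x) / len(b)  # share of b that is <= x
        max_gap = max(max_gap, abs(cdf_a - cdf_b))
    return max_gap


# Hypothetical lines-of-code values per method, NOT taken from the study.
industrial_loc = [12, 15, 9, 30, 22, 18]
open_source_loc = [25, 40, 33, 28, 51, 37]

print(ks_statistic(industrial_loc, open_source_loc))
```

A value close to 0 suggests the metric is distributed similarly in both code bases, while a value close to 1 suggests a model trained on one may see quite different feature values on the other.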
Abstract
The sound identification of refactoring opportunities is still an open problem in software engineering. Recent studies have shown the effectiveness of machine learning models in recommending methods that should undergo different refactoring operations. In this work, we experiment with such approaches to identify methods that should undergo an Extract Method refactoring, in the context of ING, a large financial organization. More specifically, we (i) compare the code metrics distributions, which are used as features by the models, between open-source and ING systems, (ii) measure the accuracy of different machine learning models in recommending Extract Method refactorings, (iii) compare the recommendations given by the models with the opinions of ING experts. Our results show that the feature distributions of ING systems and open-source systems are somewhat different, that machine…
