Toward the Automatic Classification of Self-Affirmed Refactoring

Eman Abdullah AlOmar; Mohamed Wiem Mkaouer; Ali Ouni

arXiv:2009.09279·cs.SE·September 22, 2020

TL;DR

This paper presents an automated approach to identifying and classifying self-affirmed refactoring (SAR) commits. It replaces the manual inspection of commit messages used in the authors' previous study and uncovers additional refactoring patterns.

Contribution

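The paper introduces a two-step machine learning approach. The first step identifies whether a commit message documents a refactoring; the second classifies the commit into one of three quality improvement categories: internal quality attributes, external quality attributes, or code smells. Both steps rely on N-Gram TF-IDF feature selection combined with classifiers.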
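The two-step idea can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the training messages, the label names (`sar`, `other`, `internal`, `external`, `code_smell`), the helper names, and the nearest-centroid classifier are all hypothetical stand-ins chosen to keep the example self-contained. The paper's actual classifiers and dataset are not reproduced here.

```python
import math
from collections import Counter

def ngrams(text, n_max=2):
    """Word n-grams of length 1..n_max from a commit message."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]

def fit_idf(messages):
    """Smoothed inverse document frequency for every n-gram seen in training."""
    df = Counter(g for m in messages for g in set(ngrams(m)))
    n_docs = len(messages)
    return {g: math.log((1 + n_docs) / (1 + c)) + 1 for g, c in df.items()}

def tfidf(message, idf):
    """L2-normalised TF-IDF vector as a dict; unseen n-grams are dropped."""
    tf = Counter(g for g in ngrams(message) if g in idf)
    vec = {g: c * idf[g] for g, c in tf.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {g: v / norm for g, v in vec.items()}

class CentroidClassifier:
    """Stand-in classifier (an assumption): picks the label whose mean
    TF-IDF vector has the largest dot product with the message vector."""

    def fit(self, messages, labels):
        self.idf = fit_idf(messages)
        counts = Counter(labels)
        sums = {}
        for m, y in zip(messages, labels):
            acc = sums.setdefault(y, Counter())
            for g, v in tfidf(m, self.idf).items():
                acc[g] += v
        self.centroids = {y: {g: v / counts[y] for g, v in acc.items()}
                          for y, acc in sums.items()}
        return self

    def predict(self, message):
        vec = tfidf(message, self.idf)
        def score(label):
            centroid = self.centroids[label]
            return sum(v * centroid.get(g, 0.0) for g, v in vec.items())
        # sorted() makes tie-breaking deterministic
        return max(sorted(self.centroids), key=score)

def classify_commit(message, detector, categorizer):
    """Step 1: is the commit a SAR? Step 2: which quality category?"""
    if detector.predict(message) != "sar":
        return "not_sar"
    return categorizer.predict(message)

# Hypothetical toy data, invented purely for illustration.
SAR = [
    ("refactor class to reduce coupling", "internal"),
    ("refactor to lower complexity and coupling", "internal"),
    ("refactoring to improve readability", "external"),
    ("refactor for better performance and readability", "external"),
    ("refactor to remove duplicate code", "code_smell"),
    ("refactor to eliminate dead code", "code_smell"),
]
OTHER = ["fix null pointer in login", "add new endpoint for users",
         "update version number"]

sar_messages = [m for m, _ in SAR]
detector = CentroidClassifier().fit(
    sar_messages + OTHER, ["sar"] * len(SAR) + ["other"] * len(OTHER))
categorizer = CentroidClassifier().fit(sar_messages, [y for _, y in SAR])

if __name__ == "__main__":
    for msg in ["refactor module to reduce coupling", "fix typo in login page"]:
        print(msg, "->", classify_commit(msg, detector, categorizer))
```

Training the second-step model only on commits already labelled as SAR mirrors the description above, where categories are defined by considering refactoring-related commits only. With real data, the centroid classifier would presumably be swapped for a stronger learner and evaluated with F-measure.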

Findings

01 The model achieves an F-measure of up to 90%.

02 It outperforms pattern-based and random classifiers.

03 It discovers 40 additional relevant SAR patterns.

Abstract

The concept of Self-Affirmed Refactoring (SAR) was introduced to explore how developers document their refactoring activities in commit messages, i.e., developers' explicit documentation of refactoring operations intentionally introduced during a code change. In our previous study, we have manually identified refactoring patterns and defined three main common quality improvement categories, including internal quality attributes, external quality attributes, and code smells, by only considering refactoring-related commits. However, this approach heavily depends on the manual inspection of commit messages. In this paper, we propose a two-step approach to first identify whether a commit describes developer-related refactoring events, then to classify it according to the refactoring common quality improvement categories. Specifically, we combine the N-Gram TF-IDF feature selection with…

Taxonomy

Methods: Feature Selection