An Empirical Study on Predictability of Software Code Smell Using Deep Learning Models
Himanshu Gupta, Tanmay G. Kulkarni, Lov Kumar, Lalita Bhanu Murthy Neti, Aneesh Krishna

TL;DR
This study evaluates deep learning models for predicting software code smells, reporting that data balancing techniques improve prediction accuracy and that the models outperform traditional methods.
Contribution
It introduces the use of deep learning models with data sampling and feature selection for code smell prediction, surpassing prior traditional techniques.
Findings
Deep learning models achieved up to 96.84% accuracy.
Data sampling techniques improved model performance.
Deep learning outperformed traditional machine learning methods.
Abstract
Code smell, like a bad smell, is a surface indication of something tainted, here in terms of software writing practices. It indicates a deeper problem within the code, one that is apparent to experienced software developers who follow acceptable coding practices. Recent studies have often observed that code containing code smells has a higher probability of change during the software development cycle. In this paper, we develop code smell prediction models that use features extracted from source code to predict eight types of code smell. Our work also applies data sampling techniques to handle the class imbalance problem and feature selection techniques to find relevant feature sets. Previous studies used techniques such as Naive Bayes and Random Forest but had not explored deep learning…
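To illustrate the class imbalance problem the abstract mentions, here is a minimal sketch of one common data sampling approach, random oversampling of the minority class. The abstract excerpt does not say which sampling techniques the authors used, so this is a stand-in and not the paper's method. The function name `random_oversample`, the two toy features, and the data are all hypothetical.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until every class
    has as many rows as the largest class."""
    rng = random.Random(seed)

    # Group the feature rows by their class label.
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)

    # Every class is grown to the size of the largest one.
    target = max(len(group) for group in by_class.values())

    out_rows, out_labels = [], []
    for label, group in by_class.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for row in group + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Hypothetical toy data: (lines_of_code, cyclomatic_complexity) per class,
# with label 1 meaning "smelly" and label 0 meaning "clean".
rows = [(120, 4), (80, 2), (95, 3), (60, 1), (900, 45), (40, 2)]
labels = [0, 0, 0, 0, 1, 0]

balanced_rows, balanced_labels = random_oversample(rows, labels)
print(Counter(balanced_labels))
```

In practice, resampling of this kind is usually applied only to the training split, so that the evaluation data keeps its original class distribution.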
Taxonomy
Methods: Feature Selection
