Empirical Analysis on Effectiveness of NLP Methods for Predicting Code Smell
Himanshu Gupta, Abhiram Anand Gulanikar, Lov Kumar, Lalita Bhanu Murthy Neti

TL;DR
This paper empirically evaluates NLP-based features derived from user comments to predict code smells, demonstrating high accuracy with kernel methods, especially the radial basis functional kernel.
Contribution
It introduces a novel approach of using user comments as features for code smell detection and compares multiple kernel methods for improved accuracy.
Findings
Radial basis functional kernel achieves 98.52% accuracy.
Using user comments as features enhances code smell prediction.
Kernel methods outperform traditional feature-based approaches.
Abstract
A code smell is a surface indicator of an inherent problem in the system, most often due to deviation from standard coding practices on the developers' part during the development phase. Studies observe that code smells make code more susceptible to modifications and corrections than code that does not contain them. Restructuring the code at an early stage of development saves the exponentially increasing amount of effort that would otherwise be required to address the issues stemming from these code smells. Instead of using traditional features to detect code smells, we use user comments to manually construct features to predict code smells. We use three extreme learning machine kernels over 629 packages to identify eight code smells by leveraging feature engineering aspects and using sampling techniques. Our findings indicate that the radial basis functional kernel…
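To make the classifier family mentioned in the abstract more concrete, here is a minimal sketch of a kernel extreme learning machine with a radial basis function kernel, written with NumPy. It uses the commonly cited closed form beta = (K + I/C)^-1 T and is not taken from the paper: the class name KernelELM, the parameters C and gamma, and the toy two-feature data are all assumptions for illustration. The paper's actual comment-derived features, sampling techniques, and hyperparameters are not reproduced here.

```python
import numpy as np


def rbf_kernel(A, B, gamma):
    """RBF kernel matrix: K[i, j] = exp(-gamma * ||A[i] - B[j]||^2)."""
    sq_dist = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))


class KernelELM:
    """Hypothetical minimal kernel ELM classifier (illustrative only)."""

    def __init__(self, C=10.0, gamma=1.0):
        self.C = C          # regularization strength (assumed value)
        self.gamma = gamma  # RBF width parameter (assumed value)

    def fit(self, X, y):
        self.X_ = np.asarray(X, dtype=float)
        y = np.asarray(y)
        self.classes_ = np.unique(y)
        # Encode labels as +1 / -1 targets, one column per class.
        T = np.where(y[:, None] == self.classes_[None, :], 1.0, -1.0)
        K = rbf_kernel(self.X_, self.X_, self.gamma)
        # Closed-form output weights: solve (K + I/C) beta = T.
        self.beta_ = np.linalg.solve(K + np.eye(K.shape[0]) / self.C, T)
        return self

    def predict(self, X):
        K = rbf_kernel(np.asarray(X, dtype=float), self.X_, self.gamma)
        # Pick the class whose output column scores highest.
        return self.classes_[np.argmax(K @ self.beta_, axis=1)]


# Toy data standing in for feature vectors: 0 = no smell, 1 = smell.
X_train = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
y_train = [0, 0, 0, 1, 1, 1]

model = KernelELM(C=10.0, gamma=1.0).fit(X_train, y_train)
predictions = model.predict([[0.5, 0.5], [5.5, 5.5]])
```

In a setting like the one the abstract describes, the rows of X_train would be feature vectors built from comments and each of the eight code smells would presumably get its own binary classifier, but that wiring is an assumption rather than something stated above.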
