Automatic Identification of Machine Learning-Specific Code Smells
Peter Hamfelt, Ricardo Britto, Lincoln Rocha, Camilo Almendra

TL;DR
This paper introduces MLpylint, a static analysis tool designed to identify ML-specific code smells, validated through empirical evaluation and expert surveys, to improve code quality in machine learning applications.
Contribution
It presents the first dedicated tool for detecting ML-specific code smells, developed using Design Science methodology and validated with real-world data and expert feedback.
Findings
MLpylint effectively detects ML-specific code smells.
The tool is considered useful by ML professionals.
The validation results are promising for integrating the tool into development workflows.
Abstract
Machine learning (ML) has rapidly grown in popularity, becoming vital to many industries. Currently, the research on code smells in ML applications lacks tools and studies that address the identification and validity of ML-specific code smells. This work investigates suitable methods and tools to design and develop a static code analysis tool (MLpylint) based on code smell criteria. This research employed the Design Science Methodology. In the problem identification phase, a literature review was conducted to identify ML-specific code smells. In solution design, a secondary literature review and consultations with experts were performed to select methods and tools for implementing the tool. We evaluated the tool on data from 160 open-source ML applications sourced from GitHub. We also conducted a static validation through an expert survey involving 15 ML professionals. The results…
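To make the idea of a static check for an ML-specific code smell concrete, here is a minimal sketch using Python's standard `ast` module. It is not MLpylint's implementation, and the abstract does not name the smells the tool covers. The smell chosen here (reading a DataFrame's `.values` attribute, where the pandas documentation recommends `.to_numpy()`) and the names `ValuesAttributeChecker`, `find_smells`, and `MESSAGE` are assumptions made for illustration only.

```python
import ast

# Hypothetical message text; not taken from MLpylint.
MESSAGE = "prefer .to_numpy() over .values"


class ValuesAttributeChecker(ast.NodeVisitor):
    """Flags reads of an attribute named `values` that are not method calls.

    Assumption: the check is purely syntactic. It has no type information,
    so any object with a `.values` attribute is reported, not only DataFrames.
    """

    def __init__(self):
        self.findings = []
        self._called = set()  # ids of nodes used as the callee of a call

    def visit_Call(self, node):
        # Remember the callee so that `d.values()` (e.g. on a dict) is skipped.
        self._called.add(id(node.func))
        self.generic_visit(node)

    def visit_Attribute(self, node):
        is_read = isinstance(node.ctx, ast.Load)
        if node.attr == "values" and is_read and id(node) not in self._called:
            self.findings.append((node.lineno, MESSAGE))
        self.generic_visit(node)


def find_smells(source):
    """Parse `source` (without executing it) and return (line, message) pairs."""
    checker = ValuesAttributeChecker()
    checker.visit(ast.parse(source))
    return sorted(checker.findings)


if __name__ == "__main__":
    sample = (
        "import pandas as pd\n"
        "df = pd.DataFrame({'a': [1, 2]})\n"
        "arr = df.values\n"
        "counts = {'x': 1}.values()\n"
    )
    for line, message in find_smells(sample):
        print(f"line {line}: {message}")
```

Because the code is only parsed, never run, the sample does not need pandas installed. A production checker would need type or import tracking to avoid reporting unrelated objects that happen to expose a `values` attribute.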
Taxonomy
Topics: Software Engineering Research · Scientific Computing and Data Management · Software Engineering Techniques and Practices
