Investigating Issues that Lead to Code Technical Debt in Machine Learning Systems
Rodrigo Ximenes, Antonio Pedro Santos Alves, Tatiana Escovedo, Rodrigo Spinola, and Marcos Kalinowski

TL;DR
This paper identifies and discusses 30 key issues in machine learning code that contribute to technical debt, emphasizing the importance of addressing these issues throughout the ML workflow to improve maintainability.
Contribution
The study provides a refined list of 30 ML-specific code issues contributing to technical debt, validated through expert focus groups, highlighting critical phases and common shortcuts.
Findings
The pre-processing phase has the most relevant issues.
Shortcuts in data handling lead to patch fixes and increased debt.
Addressing identified issues can reduce maintenance costs.
Abstract
[Context] Technical debt (TD) in machine learning (ML) systems, much like its counterpart in software engineering (SE), holds the potential to lead to future rework, posing risks to productivity, quality, and team morale. Despite growing attention to TD in SE, the understanding of ML-specific code-related TD remains underexplored. [Objective] This paper aims to identify and discuss the relevance of code-related issues that lead to TD in ML code throughout the ML workflow. [Method] The study first compiled a list of 34 potential issues contributing to TD in ML code by examining the phases of the ML workflow, their typical associated activities, and problem types. This list was refined through two focus group sessions involving nine experienced ML professionals, where each issue was assessed based on its occurrence contributing to TD in ML code and its relevance. [Results] The list of…
Taxonomy
Topics: Machine Learning and Data Classification · Software Engineering Research
