Empirical Characterization of Logging Smells in Machine Learning Code

Patrick Loic Foalem; Leuson Da Silva; Foutse Khomh; Heng Li; Ettore Merlo

arXiv:2603.23769 · cs.SE · March 26, 2026

TL;DR

This study empirically analyzes logging smells in machine learning code, revealing their prevalence, types, and impact on system quality, and provides a dataset for future research.

Contribution

The paper introduces a taxonomy of ML-specific logging smells, analyzes 444 ML repositories, and validates the practical relevance of these smells through a practitioner survey.

Findings

01. Logging smells are widespread in ML systems.

02. Certain smells significantly impact reproducibility and maintainability.

03. A publicly available dataset supports future detection research.

Abstract

Logging plays a central role in ensuring reproducibility, observability, and reliability in machine learning (ML) systems. While logging is generally considered a good engineering practice, poorly designed logging can negatively affect experiment tracking, security, debugging, and system performance. In this paper, we present an empirical study of logging smells in ML projects and propose a taxonomy of ML-specific logging smell types. We conducted a large-scale analysis of 444 ML repositories and manually labeled 2,448 instances of logging smells. Based on this analysis, we identified 12 categories of logging smells spanning security, metric management, configuration, verbosity, and context-related issues. Our results show that logging smells are widespread in ML systems and vary in frequency and manifestation across projects. To assess practical relevance, we conducted a survey…
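
To make the idea of a logging smell concrete, the sketch below contrasts two ways of logging the same toy training loop. It is an illustration only: the abstract names the problem areas (security, metric management, configuration, verbosity, context) but does not list the 12 categories, so the two issues shown here (a secret written to the log, and a metric logged without epoch context) are assumed examples and are not taken from the paper's taxonomy or dataset. The names `SENSITIVE_KEYS`, `make_logger`, `train_smelly`, and `train_cleaner` are hypothetical.

```python
import io
import logging

# Hypothetical set of config keys whose values should never reach a log.
SENSITIVE_KEYS = {"api_key", "password", "token"}


def make_logger(name, stream):
    """Build an isolated logger that writes to the given stream."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()      # avoid duplicate handlers on repeated calls
    logger.propagate = False     # keep output out of the root logger
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


def train_smelly(logger, config, losses):
    """Assumed examples of problematic logging in a training loop."""
    # Security-style issue: the full config, secret included, is logged.
    logger.info("config: %s", config)
    for loss in losses:
        # Context-style issue: a bare number with no metric name or epoch.
        logger.info(loss)


def train_cleaner(logger, config, losses):
    """Same loop with secrets masked and metrics given context."""
    safe = {k: ("***" if k in SENSITIVE_KEYS else v) for k, v in config.items()}
    logger.info("config: %s", safe)
    for epoch, loss in enumerate(losses, start=1):
        logger.info("epoch=%d train_loss=%.4f", epoch, loss)


if __name__ == "__main__":
    config = {"lr": 0.01, "api_key": "hunter2"}
    losses = [0.9, 0.5]

    smelly_out, cleaner_out = io.StringIO(), io.StringIO()
    train_smelly(make_logger("demo.smelly", smelly_out), config, losses)
    train_cleaner(make_logger("demo.cleaner", cleaner_out), config, losses)

    print(smelly_out.getvalue())
    print(cleaner_out.getvalue())
```

In the first variant the log cannot be shared safely and the loss values cannot be tied back to an epoch; the second variant masks the secret and records each metric with its name and step, which is the kind of difference that matters for experiment tracking and debugging.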

Taxonomy

Topics: Scientific Computing and Data Management · Software Engineering Research · Machine Learning and Data Classification