Investigating the Impact of SOLID Design Principles on Machine Learning Code Understanding

Raphael Cabral; Marcos Kalinowski; Maria Teresa Baldassarre; Hugo Villamizar; Tatiana Escovedo; Hélio Lopes

arXiv:2402.05337 · cs.SE · February 9, 2024 · 1 citation

Open Access

TL;DR

This study investigates how applying SOLID design principles to machine learning code affects understanding, finding that such principles significantly improve code comprehension among data scientists.

Contribution

It provides empirical evidence that SOLID principles enhance ML code understanding, advocating for their adoption in data science practices.

Findings

01. SOLID principles improve ML code understanding

02. Statistically significant positive effect observed

03. Supports spreading software engineering practices in ML community

Abstract

[Context] Applying design principles has long been acknowledged as beneficial for understanding and maintainability in traditional software projects. These benefits may similarly hold for Machine Learning (ML) projects, which involve iterative experimentation with data, models, and algorithms. However, ML components are often developed by data scientists with diverse educational backgrounds, potentially resulting in code that doesn't adhere to software design best practices. [Goal] In order to better understand this phenomenon, we investigated the impact of the SOLID design principles on ML code understanding. [Method] We conducted a controlled experiment with three independent trials involving 100 data scientists. We restructured real industrial ML code that did not use SOLID principles. Within each trial, one group was presented with the original ML code, while the other was presented…
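To make the idea of restructuring ML code along SOLID lines more concrete, the following is a minimal illustrative sketch in Python. It is not taken from the study or from the industrial code it used. Every name here (`Model`, `MeanBaseline`, `mean_absolute_error`, `TrainingPipeline`) is hypothetical, and the sketch only touches two of the five principles: single responsibility (evaluation is separated from training) and dependency inversion (the pipeline depends on an abstraction rather than a concrete model).

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Protocol, Sequence


class Model(Protocol):
    """Hypothetical abstraction the pipeline depends on (dependency inversion)."""

    def fit(self, xs: Sequence[float], ys: Sequence[float]) -> None: ...

    def predict(self, xs: Sequence[float]) -> list[float]: ...


class MeanBaseline:
    """Hypothetical model: always predicts the mean of the training targets."""

    def __init__(self) -> None:
        self._mean = 0.0

    def fit(self, xs: Sequence[float], ys: Sequence[float]) -> None:
        self._mean = sum(ys) / len(ys)

    def predict(self, xs: Sequence[float]) -> list[float]:
        return [self._mean for _ in xs]


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Single responsibility: this function only evaluates predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


@dataclass
class TrainingPipeline:
    """Orchestrates training and evaluation for any object satisfying Model."""

    model: Model

    def run(self, xs: Sequence[float], ys: Sequence[float]) -> float:
        self.model.fit(xs, ys)
        return mean_absolute_error(ys, self.model.predict(xs))


# Swapping in a different model requires no change to TrainingPipeline.
pipeline = TrainingPipeline(model=MeanBaseline())
score = pipeline.run([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

In a script-style alternative, data handling, fitting, and scoring would typically sit in one block tied to a specific model class; the separation above is one possible way such code could be reorganized, under the assumptions stated.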

Taxonomy

Topics: Software Engineering Research