On the Interaction between Software Engineers and Data Scientists when building Machine Learning-Enabled Systems

Gabriel Busquim; Hugo Villamizar; Maria Julia Lima; Marcos Kalinowski

arXiv:2402.05334 · cs.SE · February 9, 2024 · 2 cites

TL;DR

This paper explores the collaboration challenges between software engineers and data scientists in ML projects through a case study, highlighting issues and proposing solutions to improve teamwork and system documentation.

Contribution

It provides an empirical analysis of role interactions in ML projects, identifying key collaboration challenges and suggesting practical improvements.

Findings

01. Differences in technical expertise hinder collaboration
02. Unclear role definitions create confusion
03. Lack of documentation impairs system specification

Abstract

In recent years, Machine Learning (ML) components have been increasingly integrated into the core systems of organizations. Engineering such systems presents various challenges from both a theoretical and practical perspective. One of the key challenges is the effective interaction between actors with different backgrounds who need to work closely together, such as software engineers and data scientists. This paper presents an exploratory case study to understand the current interaction and collaboration dynamics between these roles in ML projects. We conducted semi-structured interviews with four practitioners with experience in software engineering and data science of a large ML-enabled system project and analyzed the data using reflexive thematic analysis. Our findings reveal several challenges that can hinder collaboration between software engineers and data scientists, including…

Taxonomy

Topics: Scientific Computing and Data Management · Big Data and Business Intelligence