Questions for Data Scientists in Software Engineering: A Replication
Hennie Huijgens, Ayushi Rastogi, Ernst Mulders, Georgios Gousios, Arie van Deursen

TL;DR
This paper replicates a 2014 Microsoft study to assess whether the identified questions for data scientists in software engineering remain relevant across different companies and technological contexts after five years.
Contribution
It provides an updated evaluation of the relevance of previously identified questions for data scientists in diverse software engineering environments.
Findings
Questions remain relevant across different companies.
Technological advances have impacted the applicability of previous questions.
The study highlights evolving research needs in software engineering data science.
Abstract
In 2014, a Microsoft study investigated the sort of questions that data science applied to software engineering should answer. This resulted in 145 questions that developers considered relevant for data scientists to answer, thus providing a research agenda to the community. Five years later, no further studies have investigated whether the questions from the software engineers at Microsoft hold for other software companies, including software-intensive companies with a different primary focus (which we refer to as software-defined enterprises). Furthermore, it is not evident that the problems identified five years ago are still applicable, given the technological advances in software engineering.
