Improving Software Engineering in Biostatistics: Challenges and Opportunities
Daniel Sabanés Bové, Heidi Seibold, Anne-Laure Boulesteix, Juliane Manitz, Alessandro Gasparini, Burak K. Günhan, Oliver Boix, Armin Schüler, Sven Fillinger, Sven Nahnsen, Anna E. Jacob, Thomas Jaki

TL;DR
This paper discusses the challenges faced in applying software engineering principles to biostatistics, emphasizing education, collaboration, and tools to improve reproducibility, code quality, and long-term maintenance.
Contribution
It highlights the importance of integrating software engineering practices into biostatistics and proposes strategies like education, dedicated teams, and community tools to address current challenges.
Findings
Software engineering practices can improve reproducibility and code quality in biostatistics.
Dedicated teams and education are key to adopting better software practices.
Community tools enhance transparency and collaboration.
Abstract
Programming is ubiquitous in applied biostatistics; adopting software engineering skills will help biostatisticians do a better job. To explain this, we start by highlighting key challenges for software development and application in biostatistics. Silos between different statistician roles, projects, departments, and organizations lead to the development of duplicate and suboptimal code. Building on top of open-source software requires critical appraisal and risk-based assessment of the used modules. Code that is written needs to be readable to ensure reliable software. The software needs to be easily understandable for the user, as well as developed within testing frameworks to ensure that long term maintenance of the software is feasible. Finally, the reproducibility of research results is hindered by manual analysis workflows and uncontrolled code development. We next describe how…
