Do Not Take It for Granted: Comparing Open-Source Libraries for Software Development Effort Estimation
Rebecca Moussa, Federica Sarro

TL;DR
This study compares the performance and API differences of three popular open-source ML libraries in software effort estimation, revealing significant prediction discrepancies and highlighting the importance of library choice.
Contribution
It provides a comprehensive empirical analysis of how different ML libraries affect software effort estimation results and documentation clarity, an area previously overlooked.
Findings
Predictions differ in 95% of cases across libraries
Differences can lead to misestimations of up to 3,000 hours
Libraries vary in control and clarity of parameters
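The first two findings describe two measurable quantities: how often two libraries disagree on a prediction, and how large the disagreement gets in hours. A minimal sketch of how such a comparison could be computed is given below. It is not the authors' analysis code; the function name `compare_predictions`, the tolerance `tol`, and the sample effort values are all hypothetical and chosen only for illustration.

```python
from typing import Dict, Sequence


def compare_predictions(
    preds_a: Sequence[float],
    preds_b: Sequence[float],
    tol: float = 1e-9,  # assumed tolerance; differences at or below it count as "same"
) -> Dict[str, float]:
    """Compare effort predictions (in hours) made for the same projects
    by the same learner as implemented in two different libraries."""
    if len(preds_a) != len(preds_b):
        raise ValueError("both prediction lists must cover the same projects")
    if not preds_a:
        raise ValueError("prediction lists must not be empty")

    # Absolute per-project difference between the two libraries.
    diffs = [abs(a - b) for a, b in zip(preds_a, preds_b)]

    # Number of projects where the two predictions differ beyond the tolerance.
    n_differing = sum(1 for d in diffs if d > tol)

    return {
        "share_differing": n_differing / len(diffs),
        "max_abs_diff_hours": max(diffs),
    }


# Made-up predictions for four projects (illustrative values, not data from the paper).
preds_lib_a = [1200.0, 850.0, 430.0, 2900.0]
preds_lib_b = [1200.0, 910.0, 455.5, 2650.0]

result = compare_predictions(preds_lib_a, preds_lib_b)
print(result)
```

With real data, `preds_a` and `preds_b` would hold the outputs of one learner trained on the same dataset in two libraries, so that any nonzero difference is attributable to the library rather than to the data or the algorithm choice.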
Abstract
In the past two decades, several Machine Learning (ML) libraries have become freely available. Many studies have used such libraries to carry out empirical investigations on predictive Software Engineering (SE) tasks. However, the differences stemming from using one library over another have been overlooked, on the implicit assumption that any of these libraries would give the user the same or very similar results. This paper aims to raise awareness of the differences incurred when using different ML libraries for software development effort estimation (SEE), one of the most widely studied SE prediction tasks. To this end, we investigate 4 deterministic machine learners as provided by 3 of the most popular open-source ML libraries written in different languages (namely, Scikit-Learn, Caret and Weka). We carry out a thorough empirical study comparing the performance of the machine…
Taxonomy
Topics: Software Engineering Research · Software Engineering Techniques and Practices · Software Reliability and Analysis Research
Methods: Lib
