OSS effort estimation using software features similarity and developer activity-based metrics
Ritu Kapur, Balwinder Sodhi

TL;DR
This paper introduces an effort estimation method for open source software that combines developer activity metrics with software description similarity, reporting 87.26% Standard Accuracy and providing a publicly available tool for project effort prediction.
Contribution
The paper presents new effort estimation metrics derived from developer activity, a dataset of those metrics computed from GitHub repositories across 150 software categories, and a machine learning-based estimation tool that uses software description similarity.
Findings
Achieved 87.26% Standard Accuracy in effort estimation
Developed a machine learning model trained on GitHub software descriptions
Provided a publicly available effort estimation tool and dataset
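This summary does not define Standard Accuracy. As a rough illustration only, the sketch below assumes the definition commonly used in the effort estimation literature: SA = 1 - MAE / MAE_P0, where MAE is the mean absolute error of the model and MAE_P0 is the expected mean absolute error of random guessing (predicting each project's effort with the actual effort of another randomly chosen project). The function names and the sample numbers are hypothetical and are not taken from the paper or its tool.

```python
def mean_absolute_error(actual, predicted):
    """Mean absolute error between actual and predicted effort values."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("actual and predicted must be non-empty and equal in length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def random_guess_mae(actual):
    """Expected MAE of random guessing (assumed baseline).

    Each project's effort is 'predicted' by the actual effort of a different
    project; the result averages the error over all ordered pairs (i, j), i != j.
    """
    n = len(actual)
    if n < 2:
        raise ValueError("need at least two projects to form a baseline")
    total = sum(
        abs(a - b)
        for i, a in enumerate(actual)
        for j, b in enumerate(actual)
        if i != j
    )
    return total / (n * (n - 1))


def standard_accuracy(actual, predicted):
    """Standard Accuracy as a fraction; multiply by 100 for a percentage."""
    baseline = random_guess_mae(actual)
    if baseline == 0:
        raise ValueError("baseline MAE is zero (all actual values are identical)")
    return 1.0 - mean_absolute_error(actual, predicted) / baseline


if __name__ == "__main__":
    # Illustrative effort values only; not data from the paper.
    actual_effort = [10.0, 20.0, 30.0]
    predicted_effort = [20.0, 20.0, 20.0]
    print(f"SA = {standard_accuracy(actual_effort, predicted_effort):.2%}")
```

Under this assumed definition, a value near 100% means the model's errors are small relative to random guessing, while a value near 0% means it does no better than guessing.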
Abstract
Software development effort estimation (SDEE) generally involves leveraging information about the effort spent in developing similar software in the past. Most organizations do not have access to sufficient and reliable forms of such data from past projects. As such, the existing SDEE methods suffer from low usage and accuracy. We propose an efficient SDEE method for open source software, which provides accurate and fast effort estimates. The significant contributions of our paper are: i) novel SDEE software metrics derived from developer activity information of various software repositories, ii) an SDEE dataset comprising the SDEE metrics' values derived from GitHub repositories from 150 different software categories, and iii) an effort estimation tool based on SDEE metrics and a software description similarity model. Our software description similarity model is basically…
