Predicting the Number of Reported Bugs in a Software Repository

Hadi Jahanshahi; Mucahit Cevik; Ayşe Başar

arXiv:2104.12001·cs.SE·April 27, 2021

TL;DR

This paper compares eight time series forecasting models, including LSTM, ARIMA, and Random Forest, to predict bug counts in a large open-source project, highlighting the strengths of each for short-term and long-term predictions.

Contribution

It evaluates the effectiveness of various forecasting models and the impact of exogenous variables on bug prediction accuracy in software repositories.

Findings

1. LSTM excels in long-term bug prediction.
2. Random Forest with exogenous variables performs best for short-term forecasts.
3. Incorporating release dates improves prediction accuracy.

Abstract

Predicting the growth pattern of bugs is a complicated, unresolved task that needs considerable attention. Advance knowledge of the likely number of bugs discovered in a software system helps developers allocate sufficient resources at the right time. Developers may also use this information to take the actions needed to improve the quality of the system and, in turn, customer satisfaction. In this study, we examine eight different time series forecasting models, including Long Short-Term Memory neural networks (LSTM), autoregressive integrated moving average (ARIMA), and Random Forest Regressor. Further, we assess the impact of exogenous variables such as software release dates by incorporating them into the prediction models. We analyze the quality of long-term prediction for each model based on different performance metrics. The assessment is conducted on…
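To make the setup in the abstract concrete, here is a minimal sketch of how a bug-count series might be turned into training rows with an exogenous release-date indicator, and how forecasts might be scored. It is not the authors' code. The function names, the three-lag window, the toy data, and the choice of MAE and RMSE as metrics are all assumptions for illustration; the abstract only says "different performance metrics".

```python
from math import sqrt


def make_supervised(counts, release_flags, n_lags=3):
    """Turn a bug-count series into (features, target) rows.

    Each row holds the previous n_lags counts plus the release flag of the
    period being predicted (the exogenous variable).
    """
    rows, targets = [], []
    for t in range(n_lags, len(counts)):
        lags = list(counts[t - n_lags:t])
        rows.append(lags + [release_flags[t]])
        targets.append(counts[t])
    return rows, targets


def mae(actual, predicted):
    """Mean absolute error (assumed metric)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error (assumed metric)."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Hypothetical bug counts per period and a release indicator (1 = release).
counts = [5, 7, 6, 12, 9, 6, 5, 7, 13, 8]
release_flags = [0, 0, 0, 1, 0, 0, 0, 0, 1, 0]

X, y = make_supervised(counts, release_flags, n_lags=3)

# Naive baseline: repeat the most recent observed count (the last lag).
naive_pred = [row[2] for row in X]
naive_mae = mae(y, naive_pred)
naive_rmse = rmse(y, naive_pred)
```

The rows in `X` could be passed to any regressor that accepts tabular features, such as a Random Forest, and the naive baseline gives a reference score that a fitted model should beat. Dropping the last column of each row gives the same features without the exogenous variable, which is one simple way to check whether release dates help.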

Code & Models

Repositories

HadiJahanshahi/Bug-Number-Prediction
Official

Taxonomy

Methods: Tanh Activation · Sigmoid Activation · Long Short-Term Memory