An Exploratory Study on Just-in-Time Multi-Programming-Language Bug Prediction

Zengyang Li; Jiabao Ji; Peng Liang; Ran Mo; Hui Liu

arXiv:2407.10906 · cs.SE · July 16, 2024

Open Access

TL;DR

This study develops and evaluates machine learning models for predicting multi-programming-language bugs in software systems, demonstrating the effectiveness of certain metrics and the feasibility of cross-project prediction.

Contribution

It introduces the first JIT MPLB prediction models using machine learning, identifies key metrics, and shows that cross-project models outperform within-project models.

Findings

01

Random Forest is most suitable for prediction.

02

Key metrics include changed LOC, added LOC, and total lines.

03

Cross-project training improves prediction performance.

Abstract

Context: An increasing number of software systems are written in multiple programming languages (PLs); these are called multi-programming-language (MPL) systems. MPL bugs (MPLBs) are bugs whose resolution involves multiple PLs. Despite the high complexity of MPLB resolution, MPLB prediction methods are lacking. Objective: This work aims to construct just-in-time (JIT) MPLB prediction models with selected prediction metrics, analyze the significance of the metrics, and then evaluate the performance of cross-project JIT MPLB prediction. Method: We develop JIT MPLB prediction models with the selected metrics using machine learning algorithms and evaluate the models in within-project and cross-project contexts on a dataset we constructed from 18 Apache MPL projects. Results: Random Forest is appropriate for JIT MPLB prediction. Changed LOC of all files, added LOC of all files,…
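The difference between the within-project and cross-project contexts named in the abstract can be illustrated with a small sketch. This is not the authors' pipeline: the commit records, project names, field layout, and the 50% chronological cut are all hypothetical and chosen only to show how the two kinds of train/test split differ.

```python
# Hypothetical commit records (not from the paper's dataset):
# (project, changed_loc, added_loc, is_mplb_inducing)
COMMITS = [
    ("proj_a", 120, 80, 1),
    ("proj_a", 10, 4, 0),
    ("proj_b", 300, 250, 1),
    ("proj_b", 7, 7, 0),
    ("proj_c", 45, 30, 0),
    ("proj_c", 210, 90, 1),
]


def cross_project_split(commits, target_project):
    """Train on every project except the target; test on the target only."""
    train = [c for c in commits if c[0] != target_project]
    test = [c for c in commits if c[0] == target_project]
    return train, test


def within_project_split(commits, project, train_fraction=0.5):
    """Train on the earlier commits of one project and test on its later ones.

    Assumes the commits are listed in chronological order; the 0.5 default
    is an illustrative choice, not a value taken from the paper.
    """
    own = [c for c in commits if c[0] == project]
    cut = max(1, int(len(own) * train_fraction))
    return own[:cut], own[cut:]


if __name__ == "__main__":
    train, test = cross_project_split(COMMITS, "proj_b")
    print(len(train), "training commits from other projects;", len(test), "test commits")
```

In a cross-project setting the target project contributes no training data at all, which is why it is relevant for projects with little history of their own. Any classifier, such as the Random Forest the abstract mentions, could then be fitted on the training part and scored on the test part.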

Taxonomy

Topics: Software Engineering Research · Software Engineering Techniques and Practices · Software System Performance and Reliability