Just-In-Time Software Defect Prediction via Bi-modal Change Representation Learning

Yuze Jiang; Beijun Shen; Xiaodong Gu

arXiv:2410.12107 · cs.SE · October 17, 2024

TL;DR

This paper introduces BiCC-BERT, a bi-modal change representation model pre-trained on code changes and their commit messages. By capturing deeper semantic information, it improves just-in-time defect prediction, with a reported 10.8% higher F1-score than baselines.

Contribution

The paper proposes BiCC-BERT, a bi-modal pre-training model with a new Replaced Message Identification (RMI) objective, and integrates it into a just-in-time defect prediction approach, JIT-BiCC, which outperforms existing methods.

Findings

01 JIT-BiCC achieves a 10.8% higher F1-score than baselines.

02 BiCC-BERT effectively captures semantic relations between code changes and commit messages.

03 The approach demonstrates the importance of natural language semantics in defect prediction.

Abstract

For predicting software defects at an early stage, researchers have proposed just-in-time defect prediction (JIT-DP) to identify potential defects in code commits. The prevailing approaches train models to represent code changes in history commits and utilize the learned representations to predict the presence of defects in the latest commit. However, existing models merely learn editions in source code, without considering the natural language intentions behind the changes. This limitation hinders their ability to capture deeper semantics. To address this, we introduce a novel bi-modal change pre-training model called BiCC-BERT. BiCC-BERT is pre-trained on a code change corpus to learn bi-modal semantic representations. To incorporate commit messages from the corpus, we design a novel pre-training objective called Replaced Message Identification (RMI), which learns the semantic…
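
The abstract is cut off before it finishes describing RMI, so the following is only an illustrative sketch of what the name suggests: pair each code change with either its own commit message or a message taken from a different commit, and label which case applies so a classifier can learn to tell them apart. The function name `build_rmi_examples`, the 50% replacement probability, and the uniform sampling of replacement messages are all assumptions made for illustration, not details taken from the paper.

```python
import random
from typing import List, Tuple

Commit = Tuple[str, str]             # (code_change, commit_message)
RmiExample = Tuple[str, str, int]    # (code_change, message, label)


def build_rmi_examples(
    commits: List[Commit],
    replace_prob: float = 0.5,       # assumed value, not from the paper
    seed: int = 0,
) -> List[RmiExample]:
    """Hypothetical construction of replaced-message training pairs.

    Label 1 means the message was swapped for a different one,
    label 0 means the change is paired with its original message.
    """
    rng = random.Random(seed)
    examples: List[RmiExample] = []
    for i, (change, message) in enumerate(commits):
        if len(commits) > 1 and rng.random() < replace_prob:
            # Sample the index of a different commit uniformly.
            j = rng.randrange(len(commits) - 1)
            if j >= i:
                j += 1
            other_message = commits[j][1]
            # Two commits may share identical message text; in that case
            # the pair is indistinguishable from the original, so label 0.
            label = int(other_message != message)
            examples.append((change, other_message, label))
        else:
            examples.append((change, message, 0))
    return examples


if __name__ == "__main__":
    toy_commits = [
        ("- if x:\n+ if x is not None:", "Fix null check in parser"),
        ("+ log.info('started')", "Add startup logging"),
        ("- import os", "Remove unused import"),
    ]
    for change, message, label in build_rmi_examples(toy_commits, seed=1):
        print(label, repr(message))
```

In a real pre-training setup the labeled pairs would be tokenized and fed to the encoder alongside any other objectives; how BiCC-BERT does this is described in the paper and the linked repository, not here.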

Code & Models

Repositories

jyz-1201/jit-bicc (PyTorch, official)