Convolutional Neural Networks over Control Flow Graphs for Software Defect Prediction
Anh Viet Phan, Minh Le Nguyen, Lam Thu Bui

TL;DR
This paper predicts software defects by building control flow graphs from the assembly code of compiled programs and applying graph-based convolutional neural networks to learn semantic features automatically, which the authors report improves accuracy over baseline models.
Contribution
It presents a new method combining control flow graphs with multi-view DGCNNs to better capture program semantics for defect prediction.
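To make the idea of convolving over a directed graph concrete, here is a minimal sketch of one generic propagation step. It is not the paper's DGCNN: the function name `graph_conv`, the mean-over-predecessors aggregation, and the toy weights are all assumptions chosen for illustration.

```python
def graph_conv(features, successors, weight):
    """One generic graph-convolution step (illustrative, not the paper's layer).

    Each node averages its own feature vector with those of its predecessors
    in the directed graph, applies a linear map, then a ReLU.
    """
    n = len(features)
    # A self-loop lets every node keep its own features in the average.
    preds = {v: [v] for v in range(n)}
    for u, succs in successors.items():
        for v in succs:
            preds[v].append(u)

    out = []
    for v in range(n):
        dim = len(features[v])
        mean = [sum(features[u][k] for u in preds[v]) / len(preds[v])
                for k in range(dim)]
        row = [max(0.0, sum(mean[k] * weight[k][j] for k in range(dim)))
               for j in range(len(weight[0]))]
        out.append(row)
    return out


# Hypothetical 3-node cycle with 2-dimensional node features.
successors = {0: [1], 1: [2], 2: [0]}
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weight = [[1.0, -1.0], [0.5, 2.0]]
hidden = graph_conv(features, successors, weight)
```

Stacking several such steps lets information travel along longer execution paths; a multi-view variant would presumably run this over more than one kind of node feature, but that detail is not given in this listing.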
Findings
Significantly outperforms baseline models on four datasets.
Effectively captures program semantics through graph-based neural networks.
Demonstrates the advantage of using assembly-level control flow graphs.
Abstract
The existence of defects in software components is unavoidable and leads not only to a waste of time and money but also to many serious consequences. To build predictive models, previous studies focus on manually extracting features or using tree representations of programs, and on exploiting different machine learning algorithms. However, the performance of these models is not high, since the existing features and tree structures often fail to capture the semantics of programs. To explore program semantics more deeply, this paper proposes to leverage precise graphs representing program execution flows, together with deep neural networks that automatically learn defect features. First, control flow graphs are constructed from the assembly instructions obtained by compiling source code; we thereafter apply multi-view multi-layer directed graph-based convolutional neural networks (DGCNNs) to learn semantic…
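The first step described in the abstract, turning assembly instructions into a control flow graph, can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the authors' pipeline: the toy instruction tuples, the `COND_JUMPS` set, and the choice of one node per instruction are all hypothetical.

```python
# Hypothetical toy instruction set: conditional jumps either branch or fall
# through, "jmp" only branches, and "ret" ends the flow.
COND_JUMPS = {"je", "jne", "jl", "jg", "jle", "jge"}


def build_cfg(instructions):
    """Build an instruction-level control flow graph.

    `instructions` is a list of (label_or_None, opcode, operand_or_None).
    Returns {instruction_index: sorted list of successor indices}.
    """
    label_to_index = {label: i
                      for i, (label, _, _) in enumerate(instructions)
                      if label}
    edges = {i: set() for i in range(len(instructions))}
    for i, (_, opcode, operand) in enumerate(instructions):
        if opcode == "ret":
            continue  # no successors inside this function
        if opcode == "jmp" or opcode in COND_JUMPS:
            edges[i].add(label_to_index[operand])  # branch-taken edge
        if opcode != "jmp" and i + 1 < len(instructions):
            edges[i].add(i + 1)  # fall-through edge
    return {i: sorted(succ) for i, succ in edges.items()}


# A small counting loop written in the toy format above.
program = [
    (None, "mov", "eax, 0"),
    ("loop", "cmp", "eax, 10"),
    (None, "jge", "done"),
    (None, "add", "eax, 1"),
    (None, "jmp", "loop"),
    ("done", "ret", None),
]
cfg = build_cfg(program)
```

The back edge from the `jmp` to the `loop` label is what distinguishes this graph from a tree representation: the loop appears as a cycle in the execution flow rather than as a nested syntactic node.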
Taxonomy
Topics: Software Engineering Research · Software Reliability and Analysis Research · Imbalanced Data Classification Techniques
