StagedVulBERT: Multi-Granular Vulnerability Detection with a Novel Pre-trained Code Model
Yuan Jiang, Yujian Zhang, Xiaohong Su, Christoph Treude, Tiantian Wang

TL;DR
StagedVulBERT is a hierarchical, multi-granular vulnerability detection framework built on a novel pre-trained code model. It improves detection accuracy at both coarse and fine granularity, particularly on long code sequences.
Contribution
The paper presents CodeBERT-HLS, a new hierarchical encoding component within StagedVulBERT, enabling effective semantic capture and processing of long code sequences for vulnerability detection.
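To make the idea of encoding at two levels concrete, here is a minimal sketch of hierarchical encoding over a code snippet. It is not the authors' CodeBERT-HLS implementation: the line-based statement splitting, the toy `embed_token` function, mean pooling, and the `MAX_TOKENS_PER_STATEMENT` budget are all assumptions made for illustration. A real system would use a learned tokenizer and a pre-trained encoder in place of each.

```python
from typing import List, Tuple

# Hypothetical settings chosen for illustration only.
MAX_TOKENS_PER_STATEMENT = 8  # per-statement token budget
DIM = 4                       # toy embedding width

Vector = List[float]


def split_statements(code: str) -> List[List[str]]:
    """Treat each non-empty line as one statement and split it on whitespace.

    Each statement is truncated on its own, so a long function is not cut off
    by a single global token limit.
    """
    statements = []
    for line in code.splitlines():
        tokens = line.split()
        if tokens:
            statements.append(tokens[:MAX_TOKENS_PER_STATEMENT])
    return statements


def embed_token(token: str) -> Vector:
    """Toy deterministic embedding; a stand-in for a learned token encoder."""
    total = sum(ord(ch) for ch in token)
    return [((total * (i + 1)) % 97) / 97.0 for i in range(DIM)]


def mean_pool(vectors: List[Vector]) -> Vector:
    """Average a list of equal-length vectors component by component."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def encode(code: str) -> Tuple[List[List[Vector]], List[Vector], Vector]:
    """Return token-level, statement-level, and whole-snippet representations."""
    statements = split_statements(code)
    token_level = [[embed_token(tok) for tok in stmt] for stmt in statements]
    statement_level = [mean_pool(vectors) for vectors in token_level]
    snippet_level = mean_pool(statement_level)
    return token_level, statement_level, snippet_level


if __name__ == "__main__":
    snippet = "char buf[8];\nstrcpy(buf, src);\nreturn 0;"
    tokens, statements, whole = encode(snippet)
    print(len(statements), "statement vectors of width", len(whole))
```

In this sketch the statement vectors would feed a fine-grained (per-statement) classifier, while the pooled snippet vector would feed a coarse-grained one. How the paper actually combines the two levels is not specified in this summary.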
Findings
Achieves a 92.26% F1 score in coarse-grained vulnerability detection, a 6.58% improvement.
Attains 65.69% Top-5% accuracy in fine-grained detection, outperforming the state of the art by 75.17%.
Enhances multi-granular vulnerability detection performance with a novel hierarchical encoding approach.
Abstract
The emergence of pre-trained model-based vulnerability detection methods has significantly advanced the field of automated vulnerability detection. However, these methods still face several challenges, such as difficulty in learning effective feature representations of statements for fine-grained predictions and struggling to process overly long code sequences. To address these issues, this study introduces StagedVulBERT, a novel vulnerability detection framework that leverages a pre-trained code language model and employs a coarse-to-fine strategy. The key innovation and contribution of our research lies in the development of the CodeBERT-HLS component within our framework, specialized in hierarchical, layered, and semantic encoding. This component is designed to capture semantics at both the token and statement levels simultaneously, which is crucial for achieving more accurate…
Taxonomy
Topics: Network Security and Intrusion Detection
