Revisiting Pre-trained Language Models for Vulnerability Detection

Youpeng Li; Weiliang Qi; Xuyu Wang; Fuxun Yu; Xinda Wang

arXiv:2507.16887 · cs.CR · November 25, 2025

TL;DR

This paper critically evaluates 18 pre-trained language models (PLMs) for vulnerability detection in code, highlighting their strengths, their limitations, and the challenges they face in real-world scenarios, and emphasizes the need for thorough practical assessments.

Contribution

The paper evaluates PLMs extensively on high-quality datasets, compares fine-tuning with prompt engineering, and analyzes the robustness and generalizability of PLMs in vulnerability detection.

Findings

01. Code-specific pre-training tasks improve PLM performance

02. PLMs struggle with complex dependencies and perturbations

03. Limited context windows cause labeling errors

Abstract

The rapid advancement of pre-trained language models (PLMs) has demonstrated promising results for various code-related tasks. However, their effectiveness in detecting real-world vulnerabilities remains a critical challenge. While existing empirical studies evaluate PLMs for vulnerability detection (VD), they suffer from data leakage, limited scope, and superficial analysis, hindering the accuracy and comprehensiveness of evaluations. This paper begins by revisiting the common issues in existing research on PLMs for VD through the evaluation pipeline. It then proceeds with an accurate and extensive evaluation of 18 PLMs on high-quality datasets that feature accurate labeling, diverse vulnerability types, and various projects. Specifically, we compare the performance of PLMs under both fine-tuning and prompt engineering, assess their effectiveness and generalizability across various…

Taxonomy

Topics: Network Security and Intrusion Detection