When Labels Are Scarce: A Systematic Mapping of Label-Efficient Code Vulnerability Detection

Noor Khalal; Chakib Fettal; Lazhar Labiod; Mohamed Nadif

arXiv:2604.00079·cs.CR·April 2, 2026

TL;DR

This paper systematically reviews label-efficient methods for code vulnerability detection, addressing challenges of unreliable and scarce vulnerability labels across diverse projects and languages.

Contribution

It synthesizes five paradigm families of label-efficient code vulnerability detection (CVD), connects their mechanisms to token, graph, hybrid, and knowledge-based representations, and provides a practical decision guide.

Findings

01 Maps five paradigm families of label-efficient CVD approaches.

02 Connects mechanisms to token, graph, hybrid, and knowledge-based representations.

03 Provides a decision guide for method selection based on trade-offs and failure modes.

Abstract

Machine-learning-based code vulnerability detection (CVD) has progressed rapidly, from deep program representations to pretrained code models and LLM-centered pipelines. Yet dependable vulnerability labeling remains expensive, noisy, and uneven across projects, languages, and CWE types, motivating approaches that reduce reliance on human labeling. This survey maps these approaches, synthesizing five paradigm families and the mechanisms they use. It connects mechanisms to token, graph, hybrid, and knowledge-based representations, and consolidates evaluation and reporting axes that limit comparison (label-budget specification, compute/cost assumptions, leakage, and granularity mismatches). A Design Map and constraint-first Decision Guide distill trade-offs and failure modes for practical method selection.
