Few-shot learning for security bug report identification

Muhammad Laiq

arXiv:2601.02971 · cs.SE · January 7, 2026

Open Access

TL;DR

This paper introduces a few-shot learning approach using SetFit to identify security bug reports effectively with limited labeled data, outperforming traditional methods and reducing annotation effort.

Contribution

The study demonstrates that SetFit-based few-shot learning significantly improves security bug report classification with minimal labeled data, addressing data scarcity issues.

Findings

01. Achieved an AUC of 0.865 in classification tasks

02. Outperformed traditional machine learning techniques

03. Proved effective with small labeled datasets
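For readers unfamiliar with the metric in the first finding, AUC can be read as the probability that a randomly chosen security bug report receives a higher score than a randomly chosen non-security report. The following is a minimal sketch of that definition, not the paper's evaluation code: `roc_auc` is a hypothetical helper name, and the labels and scores are made-up illustration values.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 = security bug report, 0 = non-security bug report.
    scores: classifier scores, higher meaning "more likely security".
    Tied scores count as half a correct ranking.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Illustrative values only (assumed, not taken from the paper).
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]
print(roc_auc(labels, scores))  # 0.75: three of the four pairs are ranked correctly
```

On this reading, an AUC of 0.865 means that roughly 86.5% of such security/non-security pairs would be ranked in the right order.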

Abstract

Security bug reports require prompt identification to minimize the window of vulnerability in software systems. Traditional machine learning (ML) techniques for classifying bug reports to identify security bug reports rely heavily on large amounts of labeled data. However, datasets for security bug reports are often scarce in practice, leading to poor model performance and limited applicability in real-world settings. In this study, we propose a few-shot learning-based technique to effectively identify security bug reports using limited labeled data. We employ SetFit, a state-of-the-art few-shot learning framework that combines sentence transformers with contrastive learning and parameter-efficient fine-tuning. The model is trained on a small labeled dataset of bug reports and is evaluated on its ability to classify these reports as either security-related or non-security-related. Our…
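The contrastive step mentioned in the abstract can be illustrated with a small sketch. SetFit-style training turns a handful of labeled examples into many sentence pairs: pairs with the same label are pulled together in embedding space, and pairs with different labels are pushed apart. The code below shows only this pair-generation idea in plain Python. It is a simplification under stated assumptions: `make_contrastive_pairs` is a hypothetical helper, it enumerates every pair rather than sampling, and the bug report texts are invented examples, not data from the paper.

```python
from itertools import combinations


def make_contrastive_pairs(examples):
    """Build (text_a, text_b, target) triples from few-shot labeled examples.

    examples: list of (text, label) tuples.
    target is 1.0 when both texts share a label (similar pair)
    and 0.0 when their labels differ (dissimilar pair).
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(examples, 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


# Invented few-shot examples (assumption): 1 = security, 0 = non-security.
few_shot = [
    ("Buffer overflow when parsing crafted PNG header", 1),
    ("Session token is not invalidated after logout", 1),
    ("Settings dialog is misaligned on small screens", 0),
    ("Typo in the installation guide", 0),
]

pairs = make_contrastive_pairs(few_shot)
print(len(pairs))  # 6 pairs generated from only 4 labeled reports
```

The point of the sketch is the multiplication effect: n labeled reports yield n(n-1)/2 training pairs, which is one reason a pair-based objective can make use of very small labeled sets. In a full pipeline, these pairs would be used to fine-tune a sentence transformer, and a classification head would then be trained on the resulting embeddings.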

Taxonomy

Topics: Software Engineering Research · Advanced Malware Detection Techniques · Information and Cyber Security