SeBERTis: A Framework for Producing Classifiers of Security-Related Issue Reports
Sogol Masoumzadeh, Yufei Li, Shane McIntosh, Dániel Varró, Lili Wei

TL;DR
SeBERTis introduces a novel framework using fine-tuned transformer models to detect security-related issue reports in software repositories, significantly outperforming existing methods in accuracy and robustness.
Contribution
The paper presents SeBERTis, a new approach that trains DNN classifiers on semantic surrogates to improve detection of unseen security issues, overcoming prior models' reliance on lexical cues.
Findings
Achieved 0.9880 F1-score on a 10,000-issue dataset.
Outperformed state-of-the-art classifiers by up to 97% in detection metrics.
Surpassed LLM baselines with 23-64% higher precision, recall, and F1-score.
Abstract
Monitoring issue tracker submissions is a crucial software maintenance activity. A key goal is the prioritization of high-risk, security-related bugs. If such bugs can be recognized early, the risk of propagation to dependent products and endangerment of stakeholder benefits can be mitigated. To assist triage engineers with this task, several automatic detection techniques, from Machine Learning (ML) models to prompting Large Language Models (LLMs), have been proposed. Although promising to some extent, prior techniques often memorize lexical cues as decision shortcuts, yielding low detection rates, particularly for more complex submissions. As such, these classifiers do not yet meet the practical expectations of a real-time detector of security-related issues. To address these limitations, we propose SeBERTis, a framework to train Deep Neural Networks (DNNs) as classifiers independent…
Taxonomy
Topics: Software Engineering Research · Advanced Malware Detection Techniques · Information and Cyber Security
