Fine-tuning RoBERTa for CVE-to-CWE Classification: A 125M Parameter Model Competitive with LLMs

Nikita Mosievskiy

arXiv:2603.14911 · cs.CR · March 17, 2026

TL;DR

This paper fine-tunes a RoBERTa-base model (125M parameters) to classify CVE descriptions into CWE categories, reaching 87.4% top-1 accuracy with far fewer parameters than large language models, and provides a large dataset and code for the community.

Contribution

The paper introduces a large-scale dataset and a fine-tuned RoBERTa model for CVE-to-CWE classification, demonstrating that a 125M-parameter model can perform competitively with much larger LLMs.

Findings

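01  Achieves 87.4% top-1 accuracy on the held-out test set (27,780 samples, 205 CWE classes).

02  Reaches a Macro F1 score of 60.7%, a 15.5 percentage-point gain over the TF-IDF baseline.

03  Reaches 75.6% strict accuracy on the external CTI-Bench benchmark, statistically indistinguishable from the 8B-parameter Cisco Foundation-Sec-8B-Reasoning (75.3%).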
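
The gap between top-1 accuracy and Macro F1 in the findings above reflects class imbalance: accuracy is dominated by frequent CWE classes, while Macro F1 weights every class equally. The following is a minimal sketch of that effect on made-up toy labels. The class counts, the predictions, and the helper names `top1_accuracy` and `macro_f1` are illustrative assumptions, not the paper's data or evaluation code.

```python
from collections import Counter


def top1_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over the classes present in y_true."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    scores = []
    for label in sorted(set(y_true)):
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy, hypothetical distribution: one frequent class and two rare ones.
y_true = ["CWE-79"] * 90 + ["CWE-89"] * 8 + ["CWE-1236"] * 2
# A classifier that handles the frequent class well but misses rare ones.
y_pred = ["CWE-79"] * 90 + ["CWE-89"] * 6 + ["CWE-79"] * 2 + ["CWE-79"] * 2

acc = top1_accuracy(y_true, y_pred)
mf1 = macro_f1(y_true, y_pred)
print(f"top-1 accuracy: {acc:.3f}")  # 0.960
print(f"macro F1:       {mf1:.3f}")  # 0.612
```

In this toy case the rarest class is never predicted correctly, so it contributes an F1 of 0 and pulls the macro average down even though 96% of samples are classified correctly.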

Abstract

We present a fine-tuned RoBERTa-base classifier (125M parameters) for mapping Common Vulnerabilities and Exposures (CVE) descriptions to Common Weakness Enumeration (CWE) categories. We construct a large-scale training dataset of 234,770 CVE descriptions with AI-refined CWE labels using Claude Sonnet 4.6, and agreement-filtered evaluation sets where NVD and AI labels agree. On our held-out test set (27,780 samples, 205 CWE classes), the model achieves 87.4% top-1 accuracy and 60.7% Macro F1 -- a +15.5 percentage-point Macro F1 gain over a TF-IDF baseline that already reaches 84.9% top-1, demonstrating the model's advantage on rare weakness categories. On the external CTI-Bench benchmark (NeurIPS 2024), the model achieves 75.6% strict accuracy (95% CI: 72.8-78.2%) -- statistically indistinguishable from Cisco Foundation-Sec-8B-Reasoning (75.3%, 8B parameters) at 64x fewer parameters. We…
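
The parameter ratio and the confidence interval quoted in the abstract can be sanity-checked with a few lines of arithmetic. The sketch below uses a Wilson score interval, which is one common choice for a binomial proportion; the abstract as shown here does not state which interval method the paper uses, and the benchmark size `n = 1000` is an assumption made for illustration, not a figure from the text above.

```python
import math


def wilson_interval(p_hat, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    denom = 1 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


# Nominal parameter counts from the abstract: 8B vs. 125M.
param_ratio = 8_000_000_000 // 125_000_000
print(param_ratio)  # 64

# ASSUMPTION: n = 1000 benchmark samples (hypothetical, not stated above).
n = 1000
low, high = wilson_interval(0.756, n)
print(f"95% CI: {low:.3f} to {high:.3f}")  # 0.728 to 0.782

# The 8B model's reported accuracy (75.3%) falls inside this interval.
print(low < 0.753 < high)  # True
```

Under these assumptions the interval matches the reported 72.8-78.2% range, and the 8B model's 75.3% lies inside it. This is only a rough consistency check, not a substitute for a formal significance test between the two models.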

Code & Models

Models

xamxte/cwe-classifier-roberta-base (model, 82 downloads)

Datasets

xamxte/cve-to-cwe (dataset, 66 downloads)

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Information and Cyber Security · Advanced Malware Detection Techniques