VLAI: A RoBERTa-Based Model for Automated Vulnerability Severity Classification

Cédric Bonhomme; Alexandre Dulaunoy

arXiv:2507.03607 · cs.CR · July 8, 2025

TL;DR

VLAI is a RoBERTa-based transformer model that classifies software vulnerability severity from text descriptions, achieving over 82% accuracy and supporting faster triage.

Contribution

This paper introduces VLAI, a RoBERTa-based model fine-tuned on over 600,000 real-world vulnerabilities for automated severity classification.

Findings

01 Over 82% accuracy in predicting severity categories

02 Trained on over 600,000 real-world vulnerabilities

03 Open-source model and dataset available

Abstract

This paper presents VLAI, a transformer-based model that predicts software vulnerability severity levels directly from text descriptions. Built on RoBERTa, VLAI is fine-tuned on over 600,000 real-world vulnerabilities and achieves over 82% accuracy in predicting severity categories, enabling faster and more consistent triage ahead of manual CVSS scoring. The model and dataset are open-source and integrated into the Vulnerability-Lookup service.
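The abstract does not list the severity categories. Assuming they follow the CVSS v3.x qualitative rating scale (Low 0.1-3.9, Medium 4.0-6.9, High 7.0-8.9, Critical 9.0-10.0), the sketch below shows how a numeric CVSS base score might be turned into a class label of the kind such a model predicts. The function name `cvss_to_severity` and the lowercase label strings are illustrative assumptions, not taken from the paper.

```python
from typing import Optional


def cvss_to_severity(score: float) -> Optional[str]:
    """Map a CVSS v3.x base score to its qualitative severity rating.

    Returns None for a score of 0.0, which CVSS rates as "None"; whether
    such entries appear in the training data is not stated in the abstract.
    """
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"CVSS base scores lie in [0.0, 10.0], got {score}")
    if score == 0.0:
        return None
    if score < 4.0:
        return "low"
    if score < 7.0:
        return "medium"
    if score < 9.0:
        return "high"
    return "critical"
```

For example, a base score of 9.8 falls in the top band and a score of 5.3 in the second band, so a text classifier trained on such labels can suggest a band before a full CVSS vector has been worked out by hand.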

Code & Models

Models

CIRCL/vulnerability-severity-classification-roberta-base
model · 1.3k downloads · 8 likes
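A minimal sketch of how the published checkpoint might be used for triage, assuming the Hugging Face `transformers` library and a backend such as PyTorch are installed. The label names in `SEVERITY_ORDER` are an assumption; the checkpoint's own configuration is the authority on the actual labels. The helper names `classify` and `triage` are hypothetical.

```python
from typing import Dict, List

MODEL_ID = "CIRCL/vulnerability-severity-classification-roberta-base"

# Assumed label names, least to most severe. Verify against the
# checkpoint's id2label mapping before relying on this ordering.
SEVERITY_ORDER = ["low", "medium", "high", "critical"]


def classify(descriptions: List[str]) -> List[Dict]:
    """Predict one severity label per description.

    Downloads the model on first use. The import is deferred so the
    rest of this module works without transformers installed.
    """
    from transformers import pipeline

    clf = pipeline("text-classification", model=MODEL_ID)
    # truncation=True keeps long descriptions within the model's input limit.
    preds = clf(descriptions, truncation=True)
    return [
        {"text": text, "label": pred["label"], "score": pred["score"]}
        for text, pred in zip(descriptions, preds)
    ]


def triage(predictions: List[Dict]) -> List[Dict]:
    """Sort predictions most severe first, breaking ties by confidence."""

    def key(pred: Dict):
        label = pred["label"].lower()
        # Unknown labels sort last rather than raising.
        rank = SEVERITY_ORDER.index(label) if label in SEVERITY_ORDER else -1
        return (rank, pred["score"])

    return sorted(predictions, key=key, reverse=True)
```

A call such as `triage(classify(descriptions))` would then return the descriptions ordered for review, with the predicted label and confidence attached to each.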

Datasets

CIRCL/vulnerability-scores
dataset · 223 downloads
