Predicting Known Vulnerabilities from Attack News: A Transformer-Based Approach
Refat Othman, Diaeddin Rimawi, Bruno Rossi, Barbara Russo

TL;DR
This paper presents a transformer-based semantic similarity approach to predict CVEs from cybersecurity news, achieving high precision and relevance in identifying vulnerabilities exploited during cyberattacks.
Contribution
It introduces a novel method that uses the MPNet sentence transformer (multi-qa-mpnet-base-dot-v1) to link attack news descriptions with known vulnerabilities, validated through four complementary accuracy assessments.
Findings
81% precision with threshold filtering
70% manual relevance rate
57% of reports with an exact CVE match
Abstract
Identifying the vulnerabilities exploited during cyberattacks is essential for enabling timely responses and effective mitigation in software security. This paper directly examines the process of predicting software vulnerabilities, specifically Common Vulnerabilities and Exposures (CVEs), from unstructured descriptions of attacks reported in cybersecurity news articles. We propose a semantic similarity-based approach utilizing the multi-qa-mpnet-base-dot-v1 (MPNet) sentence transformer model to generate a ranked list of the most likely CVEs corresponding to each news report. To assess the accuracy of the predicted vulnerabilities, we implement four complementary validation methods: filtering predictions based on similarity thresholds, conducting manual validation, performing semantic comparisons with the first vulnerability explicitly mentioned in each report, and comparing against all…
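
The ranking-and-filtering idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: `toy_embed` is a hypothetical bag-of-words stand-in for the multi-qa-mpnet-base-dot-v1 encoder, the CVE identifiers and descriptions are invented placeholders, and the threshold value is arbitrary. Only the overall shape (embed the report, score every CVE description by dot product, sort, then drop low-similarity predictions) follows the text.

```python
import math
import re
from collections import Counter


def toy_embed(text):
    """Hypothetical stand-in for a sentence encoder (NOT MPNet):
    L2-normalised bag-of-words counts stored as a sparse dict."""
    counts = Counter(re.findall(r"[a-z0-9]+", text.lower()))
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {tok: c / norm for tok, c in counts.items()}


def dot(u, v):
    """Dot product of two sparse vectors."""
    return sum(w * v.get(tok, 0.0) for tok, w in u.items())


def rank_cves(report, cve_descriptions, top_k=3, threshold=0.0):
    """Return up to top_k (cve_id, score) pairs, best first,
    keeping only predictions whose score reaches the threshold."""
    query = toy_embed(report)
    scored = [
        (cve_id, dot(query, toy_embed(desc)))
        for cve_id, desc in cve_descriptions.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [(cve_id, s) for cve_id, s in scored[:top_k] if s >= threshold]


# Placeholder data: these identifiers and descriptions are invented.
cves = {
    "CVE-0000-0001": "SQL injection in login form allows remote attackers to read database",
    "CVE-0000-0002": "Buffer overflow in image parser allows remote code execution",
    "CVE-0000-0003": "Cross-site scripting in comment field",
}
report = (
    "Attackers exploited a buffer overflow in the image parser "
    "to achieve remote code execution"
)

ranked = rank_cves(report, cves)                    # full ranked list
filtered = rank_cves(report, cves, threshold=0.5)   # threshold filtering
print(ranked)
print(filtered)
```

With a real sentence transformer, `toy_embed` and `dot` would be replaced by the model's dense encodings and a dense dot product; the sorting and threshold step would stay the same. Raising the threshold trades fewer predictions for higher precision, which is the effect the threshold-filtering validation measures.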
Taxonomy
Topics: Information and Cyber Security · Web Application Security Vulnerabilities · Cybercrime and Law Enforcement Studies
