Conflicting Scores, Confusing Signals: An Empirical Study of Vulnerability Scoring Systems
Viktoria Koscinski, Mark Nelson, Ahmet Okutan, Robert Falso, Mehdi Mirakhorli

TL;DR
This empirical study compares four vulnerability scoring systems using real-world data, revealing significant disparities and highlighting the need for more consistent and transparent vulnerability risk assessments.
Contribution
It provides the first large-scale, outcome-linked empirical comparison of multiple vulnerability scoring systems using real-world data.
Findings
Significant disparities in vulnerability rankings across scoring systems
Scores often do not align with actual exploitation risks
These disparities carry implications for organizations relying on these metrics for decision-making
Abstract
Accurately assessing software vulnerabilities is essential for effective prioritization and remediation. While various scoring systems exist to support this task, their differing goals, methodologies, and outputs often lead to inconsistent prioritization decisions. This work provides the first large-scale, outcome-linked empirical comparison of four publicly available vulnerability scoring systems: the Common Vulnerability Scoring System (CVSS), the Stakeholder-Specific Vulnerability Categorization (SSVC), the Exploit Prediction Scoring System (EPSS), and the Exploitability Index. We use a dataset of 600 real-world vulnerabilities derived from four months of Microsoft's Patch Tuesday disclosures to investigate the relationships between these scores, evaluate how they support vulnerability management tasks, examine how they categorize vulnerabilities across triage tiers, and assess their…
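One way to make "relationships between these scores" concrete is a rank correlation between two systems' outputs over the same set of vulnerabilities. The sketch below computes Spearman's rho in plain Python. It is an illustration only: the choice of Spearman's rho, the function names, and the sample numbers are all assumptions and are not taken from the paper or its dataset.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ascending ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical scores for five vulnerabilities (made up, NOT from the study).
cvss_base = [9.8, 7.5, 8.8, 5.4, 7.8]      # CVSS base scores, 0-10 scale
epss_prob = [0.02, 0.91, 0.04, 0.10, 0.01]  # EPSS probabilities, 0-1 scale

rho = spearman(cvss_base, epss_prob)
print(f"Spearman rho between CVSS and EPSS rankings: {rho:.2f}")
```

A rho near 1 would mean the two systems order the vulnerabilities almost identically, while a value near 0 or below would indicate the kind of ranking disagreement the study describes. Categorical outputs such as SSVC decisions would need an ordinal encoding first, which is a further assumption.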
