Cleaning the NVD: Comprehensive Quality Assessment, Improvements, and Analyses
Afsah Anwar, Ahmed Abusnaina, Songqing Chen, Frank Li, and David Mohaisen

TL;DR
This paper systematically assesses the quality of the NVD, identifies data inconsistencies, proposes automated correction methods, and demonstrates how data improvements can enhance security analysis accuracy.
Contribution
It uncovers data quality issues in the NVD, introduces automated correction techniques, and evaluates their impact on vulnerability analysis.
Findings
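The NVD contains inconsistent or incomplete data in publication dates, vendor and product names, severity scores, and vulnerability type categorizations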
Automated methods can effectively correct data discrepancies
Improved data quality enhances security analysis accuracy
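To make the vendor-name inconsistency concrete, here is a minimal sketch of how naming variants for the same vendor might be flagged. It is an illustration only, not the paper's correction method: the suffix list, the helper names `normalize_vendor` and `find_vendor_variants`, and the sample names are all hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical list of corporate suffixes to ignore; not taken from the paper.
SUFFIXES = {"inc", "corp", "corporation", "ltd", "llc", "co"}


def normalize_vendor(name):
    """Lowercase, split on non-alphanumerics, and drop trailing corporate suffixes."""
    tokens = [t for t in re.split(r"[^a-z0-9]+", name.lower()) if t]
    while len(tokens) > 1 and tokens[-1] in SUFFIXES:
        tokens.pop()
    return "".join(tokens)


def find_vendor_variants(names):
    """Group raw vendor strings by normalized key; keep keys with several spellings."""
    groups = defaultdict(set)
    for name in names:
        groups[normalize_vendor(name)].add(name)
    return {key: sorted(variants) for key, variants in groups.items() if len(variants) > 1}


# Made-up sample data for illustration, not actual NVD entries.
sample = ["microsoft", "Microsoft_Corp", "micro-soft", "apache", "Apache Inc.", "oracle"]
variants = find_vendor_variants(sample)
for key, spellings in variants.items():
    print(key, "->", spellings)
```

A heuristic like this only surfaces candidates. Distinct vendors can collapse to the same key, so flagged groups would still need manual or secondary verification.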
Abstract
Vulnerability databases are vital sources of information on emergent software security concerns. Security professionals, from system administrators to developers to researchers, heavily depend on these databases to track vulnerabilities and analyze security trends. How reliable and accurate are these databases though? In this paper, we explore this question with the National Vulnerability Database (NVD), the U.S. government's repository of vulnerability information that arguably serves as the industry standard. Through a systematic investigation, we uncover inconsistent or incomplete data in the NVD that can impact its practical uses, affecting information such as the vulnerability publication dates, names of vendors and products affected, vulnerability severity scores, and vulnerability type categorizations. We explore the extent of these discrepancies and identify methods for…
