# A dataset on vulnerabilities affecting dependencies in software package managers

**Authors:** A. Germán Márquez, Ángel Jesús Varela-Vaca, María Teresa Gómez López

PMC · DOI: 10.1016/j.dib.2025.111903 · 2025-07-21

## TL;DR

This paper introduces a dataset mapping vulnerabilities in dependencies across major software package managers to help improve software supply chain security.

## Contribution

The paper provides a comprehensive, structured dataset of vulnerabilities in dependencies for NPM, PyPI, Cargo, and RubyGems.

## Key findings

- 6.93% of NPM versions rely on at least one vulnerable dependency, the second-highest share among the package managers studied, after RubyGems (7.5%).
- The dataset includes 270,430 known vulnerabilities linked to package versions, enabling detailed security risk analysis.
- NPM has 14,858 latest versions affected by vulnerabilities, far more than RubyGems (919), PyPI (329), or Cargo (53).

## Abstract

The increasing reliance on third-party dependencies in software development introduces significant security risks. This study presents a dataset that maps the vulnerabilities affecting dependencies in four major package managers: Node Package Manager (NPM), Python Package Index (PyPI), Cargo Crates, and RubyGems. The dataset comprises information on 4,437,679 unique packages and 60,950,846 package versions, with vulnerability data sourced from Open Source Vulnerabilities (OSV). It includes 270,430 known vulnerabilities linked to package versions, allowing a detailed analysis of security risks in software supply chains. Our methodology involved extracting dependency and version data from official package manager sources, correlating them with vulnerability reports, and storing the results in structured formats, including CSV and database dumps. The resulting dataset enables automated monitoring of vulnerable dependencies, supports analysis and security assessments, and helps define mitigation strategies. This work finds that 0.42% of PyPI, 7.5% of RubyGems, 3.91% of Cargo, and 6.93% of NPM versions rely on at least one vulnerable dependency. Furthermore, PyPI has 329 latest versions affected, RubyGems 919, Cargo 53, and NPM 14,858. This dataset provides valuable information for researchers, developers, and security professionals looking to improve software supply chain security. It provides a foundation for developing security and data analytics tools, enabling early vulnerability detection and improving mitigation controls for dependency-related security risks, thus promoting more secure software ecosystems. The dataset can be extended by incorporating additional packages, introducing new features, and ensuring continuous updates.
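As a rough illustration of how a figure such as "share of versions that rely on at least one vulnerable dependency" could be derived from a CSV export, the sketch below counts distinct package versions per ecosystem and the subset that has at least one dependency row carrying a vulnerability identifier. The column names (`ecosystem`, `package`, `version`, `dependency`, `dependency_version`, `vulnerability_id`), the sample rows, and the `VULN-…` identifiers are all hypothetical placeholders, not the schema or contents of the published dataset.

```python
import csv
import io
from collections import defaultdict

# Hypothetical CSV layout and made-up rows; the real dataset's columns may differ.
SAMPLE_CSV = """ecosystem,package,version,dependency,dependency_version,vulnerability_id
PyPI,app-a,1.0.0,lib-x,2.1.0,VULN-1
PyPI,app-a,1.1.0,lib-x,2.2.0,
PyPI,app-b,0.3.0,lib-y,1.0.0,
NPM,web-a,4.0.0,lib-z,0.9.0,VULN-2
"""


def vulnerable_share(rows):
    """Return {ecosystem: percent of versions with at least one vulnerable dependency}."""
    all_versions = defaultdict(set)
    vulnerable_versions = defaultdict(set)
    for row in rows:
        key = (row["package"], row["version"])
        all_versions[row["ecosystem"]].add(key)
        # A non-empty vulnerability_id marks this dependency edge as vulnerable.
        if row["vulnerability_id"]:
            vulnerable_versions[row["ecosystem"]].add(key)
    return {
        eco: 100.0 * len(vulnerable_versions[eco]) / len(versions)
        for eco, versions in all_versions.items()
    }


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
shares = vulnerable_share(rows)
for ecosystem, percent in sorted(shares.items()):
    print(f"{ecosystem}: {percent:.2f}% of versions have a vulnerable dependency")
```

Using sets keyed by `(package, version)` means a version with several vulnerable dependencies is still counted once, which matches the "at least one" wording of the reported percentages.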

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12329217/full.md

---
Source: https://tomesphere.com/paper/PMC12329217