Characterizing Deep Learning Package Supply Chains in PyPI: Domains, Clusters, and Disengagement
Kai Gao, Runzhi He, Bing Xie, Minghui Zhou

TL;DR
This study analyzes PyPI's deep learning package supply chains, revealing their domain distributions, cluster structures, and reasons for package disengagement, providing insights into dependency management and maintenance practices.
Contribution
It offers a comprehensive characterization of DL package supply chains in PyPI, including domain coverage, cluster shapes, and disengagement reasons, based on large-scale metadata analysis.
Findings
Popular packages cover 34 domains across 8 categories.
Clusters mainly exhibit Arrow and Star shapes; Tree and Forest shapes are more complex.
Reasons for disengagement include dependency issues, functional improvements, and ease of installation.
Abstract
Deep learning (DL) package supply chains (SCs) are critical for DL frameworks to remain competitive. However, vital knowledge on the nature of DL package SCs is still lacking. In this paper, we explore the domains, clusters, and disengagement of packages in two representative PyPI DL package SCs to bridge this knowledge gap. We analyze the metadata of nearly six million PyPI package distributions and construct version-sensitive SCs for two popular DL frameworks: TensorFlow and PyTorch. We find that popular packages (measured by the number of monthly downloads) in the two SCs cover 34 domains belonging to eight categories. The Applications, Infrastructure, and Sciences categories account for over 85% of popular packages in either SC, and the TensorFlow and PyTorch SCs have specialized in Infrastructure and Applications packages, respectively. We employ the Leiden community detection…
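To make the idea of a version-sensitive supply chain concrete, the sketch below builds one from toy metadata. It is an illustration under stated assumptions, not the authors' pipeline: the `RELEASES` data, the package names `layer-a`, `layer-b`, and `unrelated`, and the helper functions are all hypothetical, and a requirement is simplified to a (dependency, minimum version) pair rather than a full PEP 508 requirement string as found in real PyPI metadata.

```python
from collections import deque

# Hypothetical toy metadata: (package, version) -> list of (dependency, minimum version).
# Real PyPI metadata uses PEP 508 requirement strings; the minimum-version pair
# is a simplification made for this illustration.
RELEASES = {
    ("tensorflow", (2, 0)): [],
    ("tensorflow", (2, 1)): [],
    ("layer-a", (1, 0)): [("tensorflow", (2, 1))],
    ("layer-b", (0, 1)): [("layer-a", (1, 0))],
    ("unrelated", (1, 0)): [],
}


def build_reverse_edges(releases):
    """Map each release to the set of releases whose requirements it satisfies."""
    reverse = {node: set() for node in releases}
    for node, requirements in releases.items():
        for dep_name, min_version in requirements:
            for candidate in releases:
                name, version = candidate
                if name == dep_name and version >= min_version:
                    reverse[candidate].add(node)
    return reverse


def supply_chain(releases, framework):
    """Return each downstream release with its shortest distance from the framework."""
    reverse = build_reverse_edges(releases)
    roots = [node for node in releases if node[0] == framework]
    depth = {node: 0 for node in roots}
    queue = deque(roots)
    while queue:
        current = queue.popleft()
        for dependent in reverse[current]:
            if dependent not in depth:
                depth[dependent] = depth[current] + 1
                queue.append(dependent)
    # Drop the framework's own releases; keep only downstream packages.
    return {node: d for node, d in depth.items() if d > 0}


chain = supply_chain(RELEASES, "tensorflow")
for (name, version), d in sorted(chain.items()):
    print(name, ".".join(map(str, version)), "depth", d)
```

Because nodes are individual releases rather than package names, `layer-a` 1.0 is attached only to the `tensorflow` 2.1 release that satisfies its requirement, and `unrelated` never enters the chain. A graph of this kind is also the sort of input a community detection algorithm would take.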
Taxonomy
Topics: Recycling and Waste Management Techniques · Industrial Vision Systems and Defect Detection · Advancements in Photolithography Techniques
