The Grand Software Supply Chain of AI Systems

Carmine Cesarano; Martin Monperrus

arXiv:2604.27781·cs.SE·May 1, 2026

TL;DR

This paper analyzes the AI software supply chain across four architectural layers, identifies four structural gaps (verifiability, versioning, observability, and traceability), and measures the supply chain's scale on a reference stack of open-source projects.

Contribution

It introduces a layered analysis of the AI supply chain and identifies four structural gaps that traditional supply chain mechanisms do not address.

Findings

01

The AI supply chain includes 4,664 direct dependencies and 11,508 transitive packages.

02

Current AI systems lack verifiability, versioning, observability, and traceability.

03

The reference stack comprises roughly 392 million lines of code.
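
The split between direct dependencies and transitive packages in the first finding can be illustrated with a small graph traversal. The following is a minimal sketch, not the paper's measurement method: the package names and the `graph` mapping are hypothetical, and it assumes the transitive set counts every package reachable from the root, direct dependencies included.

```python
from collections import deque


def transitive_dependencies(graph, roots):
    """Return (direct, closure) for the given root projects.

    graph maps a package name to the packages it declares as dependencies.
    direct is the set of packages declared by the roots themselves;
    closure is every package reachable from the roots (assumption: this
    includes the direct dependencies).
    """
    direct = set()
    for root in roots:
        direct.update(graph.get(root, ()))

    closure = set(direct)
    queue = deque(direct)
    while queue:
        pkg = queue.popleft()
        for dep in graph.get(pkg, ()):
            if dep not in closure:
                closure.add(dep)
                queue.append(dep)
    return direct, closure


# Hypothetical toy graph; none of these names come from the paper.
graph = {
    "app": ["trainer", "tokenizer"],
    "trainer": ["tensorlib", "dataloader"],
    "tokenizer": ["regexlib"],
    "dataloader": ["tensorlib", "compresslib"],
}

direct, closure = transitive_dependencies(graph, ["app"])
print(len(direct), len(closure))
```

Even in this toy graph, the reachable set is three times the size of the declared one, which is the kind of amplification the finding reports at scale.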

Abstract

AI systems rest on software with weak integrity mechanisms, leaving them exposed at every stage from data acquisition to final inference. This paper makes the AI supply chain a first-class object of analysis, decomposing it across four architectural layers: data acquisition, model training, model inference, and a cross-cutting substrate. Within these layers, we identify four structural gaps that traditional supply chain mechanisms do not address: verifiability, versioning, observability, and traceability. Current AI systems fall short on all of them: they carry undeclared behavioral couplings that no resolver enforces; they cannot be reverted to known working assemblies; they degrade silently rather than surfacing breaking changes; and their lineage can hardly be approximated. To illustrate the scale of the software supply chain of AI, we measure a reference stack of 48…
