Why Inference in Large Models Becomes Decomposable After Training

Jidong Jin

arXiv:2601.15871 · cs.LG · March 17, 2026

Open Access

TL;DR

The paper argues that inference in large AI models becomes decomposable after training because gradient updates are highly localized, which allows efficient parallel inference without changing what the model computes.

Contribution

It introduces a post-training statistical criterion and a structural annealing procedure that identify stable substructures and use them for decomposable inference.

Findings

01. Inference systems are structurally non-uniform post-training.

02. Gradient updates are highly localized and leave many dependencies unchanged.

03. Decomposable inference enables parallel processing without model modification.

Abstract

Inference in large-scale AI models is typically performed on dense parameter matrices, leading to inference cost and system complexity that scale unsustainably with model size. This limitation does not arise from insufficient model capacity, but from treating post-training inference systems as monolithic operators while ignoring internal structures formed during learning. We show that gradient update events in large models are highly localized and selective, leaving many parameter dependencies statistically indistinguishable from their initialization distribution after training. As a result, post-training inference systems are structurally non-uniform and inherently decomposable. Based on this observation, we introduce a post-training statistical criterion and a structural annealing procedure that removes unsupported dependencies and reveals stable, independent substructures. This work…
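To make the idea in the abstract concrete, the following is a minimal sketch and not the paper's actual criterion or annealing procedure. It assumes a single weight matrix whose entries were initialized from a zero-mean Gaussian with a known standard deviation `init_std`, and it treats any entry within `z * init_std` of zero as "indistinguishable from initialization". The function names, the threshold rule, and the connected-components grouping are all hypothetical choices made for illustration.

```python
import numpy as np

def prune_unsupported(W, init_std, z=3.0):
    # ASSUMPTION: zero-mean Gaussian init with std init_std; an entry is kept
    # only if it lies more than z standard deviations away from zero.
    mask = np.abs(W) > z * init_std
    return W * mask, mask

def find_blocks(mask):
    # Treat surviving entries as edges of a bipartite graph between output
    # rows and input columns, then group nodes with a small union-find.
    n_out, n_in = mask.shape
    parent = list(range(n_out + n_in))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for r, c in zip(*np.nonzero(mask)):
        a, b = find(int(r)), find(n_out + int(c))
        if a != b:
            parent[a] = b

    groups = {}
    for node in range(n_out + n_in):
        groups.setdefault(find(node), []).append(node)

    blocks = []
    for members in groups.values():
        rows = [m for m in members if m < n_out]
        cols = [m - n_out for m in members if m >= n_out]
        if rows and cols:  # skip isolated rows or columns
            blocks.append((rows, cols))
    return blocks

def blockwise_matvec(W, blocks, x):
    # Each block reads only its own input columns and writes only its own
    # output rows, so the loop iterations are independent of one another.
    y = np.zeros(W.shape[0])
    for rows, cols in blocks:
        y[rows] = W[np.ix_(rows, cols)] @ x[cols]
    return y
```

Under these assumptions, `blockwise_matvec` applied to the pruned matrix gives the same result as a dense product with that pruned matrix, and because the blocks share no rows or columns they could in principle be dispatched to separate workers. Whether the pruned matrix preserves the behavior of the original model depends on the statistical criterion, which this sketch does not reproduce.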


Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning