The Collapse of Patches
Wei Guo, Shunqi Mao, Zhuonan Liang, Heng Wang, Weidong Cai

TL;DR
This paper introduces the concept of patch collapse, where observing certain image patches reduces uncertainty in others, and demonstrates how leveraging this phenomenon can improve image modeling and classification efficiency.
Contribution
The paper proposes patch collapse as a new perspective in image modeling and develops methods to identify and utilize critical patches for improved vision tasks.
Findings
Retraining the autoregressive model MAR to respect the collapse order improves image generation.
High-rank patches in the collapse order suffice for accurate image classification, so a Vision Transformer needs to see only a subset of each image.
Patch collapse promotes efficiency in vision models.
Abstract
Observing certain patches in an image reduces the uncertainty of others. Their realization lowers the distribution entropy of each remaining patch feature, analogous to collapsing a particle's wave function in quantum mechanics. This phenomenon can intuitively be called patch collapse. To identify which patches are most relied on during a target region's collapse, we learn an autoencoder that softly selects a subset of patches to reconstruct each target patch. Graphing these learned dependencies for each patch's PageRank score reveals the optimal patch order to realize an image. We show that respecting this order benefits various masked image modeling methods. First, autoregressive image generation can be boosted by retraining the state-of-the-art model MAR. Next, we introduce a new setup for image classification by exposing Vision Transformers only to high-rank patches in the collapse…
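The ranking step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes the learned dependencies are already available as a hypothetical matrix `weights`, where `weights[i][j]` is how strongly target patch i relies on patch j, and it runs a plain power-iteration PageRank over that matrix. The function names, the damping factor of 0.85, and the toy values are all assumptions made for illustration.

```python
def pagerank(weights, damping=0.85, iters=200, tol=1e-10):
    """Power-iteration PageRank over a dense dependency matrix.

    weights[i][j] >= 0 is a hypothetical score for how much target
    patch i relies on patch j (an edge i -> j that "votes" for j).
    """
    n = len(weights)
    rank = [1.0 / n] * n
    row_sums = [sum(row) for row in weights]
    for _ in range(iters):
        # Rank held by patches with no outgoing edges is spread uniformly.
        dangling = sum(rank[i] for i in range(n) if row_sums[i] == 0)
        new = [(1.0 - damping) / n + damping * dangling / n] * n
        for i in range(n):
            if row_sums[i] == 0:
                continue
            for j in range(n):
                new[j] += damping * rank[i] * weights[i][j] / row_sums[i]
        delta = sum(abs(a - b) for a, b in zip(new, rank))
        rank = new
        if delta < tol:
            break
    return rank


def collapse_order(weights):
    """Patch indices sorted from highest to lowest PageRank score."""
    scores = pagerank(weights)
    return sorted(range(len(scores)), key=lambda i: -scores[i])


# Toy example (made-up numbers): patches 1, 2, and 3 rely mostly on patch 0,
# and no patch relies on patch 3.
toy = [
    [0.0, 0.5, 0.5, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.8, 0.1, 0.1, 0.0],
]
order = collapse_order(toy)
print(order)
```

In this toy graph patch 0 comes first because every other patch depends on it, and patch 3 comes last because nothing depends on it. A classification setup along the lines the abstract describes would then keep only the leading entries of `order` as input patches.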
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Advanced Neural Network Applications · Adversarial Robustness in Machine Learning
