On Computational Limits and Provably Efficient Criteria of Visual Autoregressive Models: A Fine-Grained Complexity Analysis

Yekun Ke; Xiaoyu Li; Yingyu Liang; Zhizhou Sha; Zhenmei Shi; Zhao Song

arXiv:2501.04377 · cs.LG · February 4, 2025

TL;DR

This paper analyzes the computational complexity of Visual Autoregressive models, establishing theoretical limits on their efficiency and proposing methods for more scalable image generation within these constraints.

Contribution

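It provides the first fine-grained complexity analysis of VAR models, showing that algorithms running in sub-quartic time (measured in the side length $n$ of the last VQ code map) are unlikely under SETH, and offering low-rank approximation strategies that meet the paper's efficiency criteria.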

Findings

01. Under SETH, a sub-quartic time algorithm for VAR models (in the code-map side length $n$) is unlikely.

02. The paper presents efficient constructions based on low-rank approximation that are consistent with its theoretical efficiency criteria.

03. The work initiates a theoretical study of the computational limits of VAR models.

Abstract

Recently, Visual Autoregressive (VAR) Models introduced a groundbreaking advancement in the field of image generation, offering a scalable approach through a coarse-to-fine "next-scale prediction" paradigm. Let $n$ denote the height and width of the last VQ code map generated by VAR models. The state-of-the-art algorithm in [Tian, Jiang, Yuan, Peng and Wang, NeurIPS 2024] takes $O(n^{4 + o(1)})$ time, which is computationally inefficient. In this work, we analyze the computational limits and efficiency criteria of VAR Models through a fine-grained complexity lens. Our key contribution is identifying the conditions under which VAR computations can achieve sub-quadratic time complexity. We prove that, assuming the Strong Exponential Time Hypothesis (SETH) from fine-grained complexity theory, a sub-quartic time…
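To make the $O(n^{4 + o(1)})$ figure concrete, the following is a minimal sketch of where a quartic dependence on $n$ can come from. It rests on assumptions not stated in this excerpt: a hypothetical geometric scale schedule (side lengths doubling up to $n$), and standard attention that compares every token with every other token across all scales. The function names `scale_sides`, `total_tokens`, and `attention_pair_count` are illustrative only and are not taken from the paper or any VAR codebase.

```python
def scale_sides(n, ratio=2):
    """Side lengths of a hypothetical coarse-to-fine pyramid ending at n x n.

    Assumption: each scale is `ratio` times larger than the previous one.
    The schedule used by actual VAR models may differ.
    """
    sides = []
    s = n
    while s >= 1:
        sides.append(s)
        s //= ratio
    sides.reverse()  # coarsest scale first, finest (n x n) last
    return sides


def total_tokens(n, ratio=2):
    """Total number of tokens across all scales: sum of side^2."""
    return sum(s * s for s in scale_sides(n, ratio))


def attention_pair_count(n, ratio=2):
    """Number of query-key pairs if attention is computed naively
    over the full token sequence (quadratic in the token count)."""
    num_tokens = total_tokens(n, ratio)
    return num_tokens * num_tokens


if __name__ == "__main__":
    for n in (8, 16, 32, 64):
        pairs = attention_pair_count(n)
        # The ratio pairs / n^4 stays bounded, i.e. the cost grows like n^4.
        print(n, total_tokens(n), pairs, pairs / n**4)
```

Under this schedule the total token count is dominated by the final $n \times n$ map, so it grows like $n^2$, and a pairwise attention cost that is quadratic in the token count then grows like $n^4$. The sketch only counts pairs; it does not model the embedding dimension or any other factor that could account for the $o(1)$ term in the exponent.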

Taxonomy

Topics: Neural Networks and Applications

Methods: ALIGN