Circuit Complexity Bounds for Visual Autoregressive Model

Yekun Ke; Xiaoyu Li; Yingyu Liang; Zhenmei Shi; Zhao Song

arXiv:2501.04299·stat.ML·January 9, 2025

Open Access · 3 Reviews

TL;DR

This paper establishes circuit complexity bounds for the Visual AutoRegressive (VAR) model, revealing its limitations in expressive power despite high performance in image generation tasks.

Contribution

The paper provides the first rigorous circuit complexity bounds for VAR models, showing that they can be simulated by uniform TC^0 threshold circuits and highlighting the resulting limits on their expressive power.

Findings

01. VAR models can be simulated by a uniform TC^0 threshold circuit family with hidden dimension d ≤ O(n) and poly(n) precision

02. First rigorous circuit complexity bounds for VAR models

03. Highlights inherent limitations in VAR's expressive power

Abstract

Understanding the expressive ability of a specific model is essential for grasping its capacity limitations. Recently, several studies have established circuit complexity bounds for the Transformer architecture. Meanwhile, the Visual AutoRegressive (VAR) model has become a prominent method in image generation, outperforming previous techniques, such as Diffusion Transformers, in generating high-quality images. In this study, we investigate the circuit complexity of the VAR model and establish a bound. Our primary result demonstrates that the VAR model can be simulated by a uniform $TC^{0}$ threshold circuit family with hidden dimension $d \leq O(n)$ and $\mathrm{poly}(n)$ precision. This is the first study to rigorously highlight the limitations in the expressive power of VAR models despite their impressive performance. We believe our findings will offer valuable…
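For readers unfamiliar with the class named in the abstract, the following is a minimal sketch of what a constant-depth threshold circuit looks like. It is not the paper's construction and has nothing specific to VAR in it: it only shows the textbook fact that parity of n bits can be computed by a depth-2 circuit of threshold gates, where the depth stays fixed while the number of gates grows with n. The function names (`threshold_gate`, `parity_depth2`, `check_all_inputs`) are hypothetical and chosen for illustration.

```python
from itertools import product


def threshold_gate(inputs, weights, theta):
    """Return 1 if the weighted sum of the inputs reaches theta, else 0."""
    return int(sum(w * x for w, x in zip(weights, inputs)) >= theta)


def parity_depth2(bits):
    """Compute the parity of `bits` with a depth-2 threshold circuit."""
    n = len(bits)
    # Layer 1: gate k outputs 1 iff at least k of the inputs are 1 (k = 1..n).
    layer1 = [threshold_gate(bits, [1] * n, k) for k in range(1, n + 1)]
    # Layer 2: with s ones in the input, the first s gates fire, so the
    # alternating sum +1 -1 +1 ... equals 1 when s is odd and 0 when s is even.
    signs = [1 if k % 2 == 1 else -1 for k in range(1, n + 1)]
    return threshold_gate(layer1, signs, 1)


def check_all_inputs(n):
    """Compare the circuit against sum(bits) % 2 on every n-bit input."""
    return all(
        parity_depth2(list(bits)) == sum(bits) % 2
        for bits in product([0, 1], repeat=n)
    )
```

Calling `check_all_inputs(5)` compares the circuit with the direct parity computation on all 32 inputs. The circuit uses n + 1 gates at depth 2 for every n, which is the sense in which TC^0 circuits have constant depth and polynomial size.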

Peer Reviews

Decision·Submitted to ICLR 2026

Reviewer 01 · Rating 4 · Confidence 1

Strengths

Due to significant differences between my research domain and the circuit complexity/visual autoregressive modeling field of this paper, I am unable to provide a substantive assessment of its originality, quality, clarity, and significance from a professional perspective. The paper appears to address an underexplored gap (circuit complexity analysis of VAR models) and presents a structured theoretical framework with detailed definitions and proofs, which suggests careful academic rigor. However,

Weaknesses

See Strengths.

Reviewer 02 · Rating 2 · Confidence 3

Strengths

The paper provides a technical result that extends known TC0 bounds to VAR models.

Weaknesses

1. The paper largely follows existing circuit complexity analysis techniques developed for Transformers and Mamba, offering limited novel methodological or theoretical advancements.
2. While the paper claims theoretical limitations for VAR, it fails to reconcile this with its strong empirical performance, leaving the tension between theory and practice unaddressed.
3. What practical implications does the paper’s conclusion that VAR models lie within TC0 have for real-world modeling or algorithm

Reviewer 03 · Rating 2 · Confidence 4

Strengths

- The paper attempts to formalize the components of VAR models and provides some logical structure.
- The authors reference existing circuit complexity results and apply them to the VAR setting.

Weaknesses

- There is substantial overlap (e.g., Figure 1) with another ICLR submission (submission 2833): both focus on the theoretical complexity of VARs and even use exactly the same figures. This strong similarity raises concerns about originality and about the paper possibly having been written by LLMs.
- The contribution is mainly upper bounds. No lower bounds or separation results are provided, limiting theoretical novelty.
- Some key assumptions (e.g., constant depth/layers) in this paper do not match real-world VAR c

Taxonomy

Topics: Machine Learning in Materials Science · Machine Learning and Data Classification · Industrial Vision Systems and Defect Detection

Methods: Attention Is All You Need · Absolute Position Encodings · Softmax · Linear Layer · Adam · Residual Connection · Dropout · Multi-Head Attention · Position-Wise Feed-Forward Layer · Diffusion