(1D) Ordered Tokens Enable Efficient Test-Time Search

Zhitong Gao; Parham Rezaei; Ali Cy; Mingqiao Ye; Nataša Jovanović; Jesse Allardice; Afshin Dehghan; Amir Zamir; Roman Bachmann; Oğuzhan Fatih Kar

arXiv:2604.15453 · cs.CV · April 20, 2026

TL;DR

This paper investigates how ordered token structures, especially coarse-to-fine sequences, improve the efficiency and scalability of test-time search in autoregressive generative models, particularly for image generation.

Contribution

The paper demonstrates that coarse-to-fine ordered tokens enable more effective test-time search and training-free, search-based image generation, offering practical guidance for scaling inference.

Findings

01 Coarse-to-fine tokens improve test-time scaling in AR models.

02 Ordered tokens enable training-free, search-based image generation.

03 Classical search algorithms interact differently with token structures.

Abstract

Tokenization is a key component of autoregressive (AR) generative models, converting raw data into more manageable units for modeling. Commonly, tokens describe local information, such as regions of pixels in images or word pieces in text, and AR generation predicts these tokens in a fixed order. A worthwhile question is whether token structures affect the ability to steer the generation through test-time search, where multiple candidate generations are explored and evaluated by a verifier. Using image generation as our testbed, we hypothesize that recent 1D ordered tokenizers with coarse-to-fine structure can be more amenable to search than classical 2D grid structures. This is rooted in the fact that the intermediate states in coarse-to-fine sequences carry semantic meaning that verifiers can reliably evaluate, enabling effective steering during generation. Through controlled…
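The idea of steering generation with a verifier that scores intermediate states can be illustrated with a small beam search over token prefixes. This is only a toy sketch under stated assumptions, not the paper's implementation or the repository's API: `propose`, `verify`, `TARGET`, and `VOCAB` are hypothetical stand-ins for an AR model's sampler, a verifier, and a token vocabulary. The verifier weights early positions more heavily, loosely mimicking a coarse-to-fine ordering in which early tokens carry the most global information.

```python
import random
from typing import Callable, List, Tuple

Prefix = Tuple[int, ...]

# Hypothetical toy setup: the "ideal" token sequence the verifier prefers.
TARGET: Prefix = (3, 1, 4, 1, 5, 9)
VOCAB = range(10)
_rng = random.Random(0)


def propose(prefix: Prefix, n: int) -> List[int]:
    """Stand-in for an AR model: sample n distinct candidate next tokens."""
    return _rng.sample(VOCAB, n)


def verify(prefix: Prefix) -> float:
    """Stand-in verifier: scores a partial sequence (higher is better).

    Earlier positions get larger weights, so a mistake in an early
    (coarse) token costs more than one in a late (fine) token.
    """
    return -sum(
        abs(tok - TARGET[i]) / (i + 1) for i, tok in enumerate(prefix)
    )


def beam_search(
    propose: Callable[[Prefix, int], List[int]],
    verify: Callable[[Prefix], float],
    seq_len: int,
    beam_width: int = 4,
    expansions: int = 4,
) -> Prefix:
    """Keep the beam_width prefixes the verifier scores highest at each step."""
    beams: List[Prefix] = [()]
    for _ in range(seq_len):
        # Extend every surviving prefix with each proposed next token.
        candidates = [
            prefix + (tok,)
            for prefix in beams
            for tok in propose(prefix, expansions)
        ]
        # Drop duplicates while preserving order, then rank by verifier score.
        candidates = list(dict.fromkeys(candidates))
        candidates.sort(key=verify, reverse=True)
        beams = candidates[:beam_width]
    return beams[0]


best = beam_search(propose, verify, seq_len=len(TARGET))
print(best, verify(best))
```

The search only helps if `verify` returns meaningful scores for partial sequences, which is the property the abstract attributes to coarse-to-fine token orderings. With a verifier that is only reliable on complete sequences, pruning prefixes early would discard good candidates more or less at random.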

Code & Models

Repositories

epfl-vilab/search-over-tokens (GitHub)