HiFlow: Tokenization-Free Scale-Wise Autoregressive Policy Learning via Flow Matching

Daichi Yashima; Koki Seno; Shuhei Kurita; Yusuke Oda; Komei Sugiura

arXiv:2603.27281 · cs.RO · March 31, 2026

TL;DR

HiFlow introduces a tokenization-free, hierarchical autoregressive policy that directly models continuous robot actions, improving efficiency and performance over existing tokenization-based methods.

Contribution

The paper proposes Hierarchical Flow Policy (HiFlow), a hierarchical flow policy that operates directly on raw continuous actions, removing both the action tokenizer and the multi-stage training pipeline that tokenization requires.
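To make "flow matching on continuous actions" concrete, here is a minimal sketch of the generic conditional flow matching training target with a straight-line path between Gaussian noise and an action chunk. This is the textbook formulation, not HiFlow's hierarchical scheme: the function names, the chunk shape (16 steps, 7 dimensions), and the choice of noise distribution are all assumptions made for illustration.

```python
import numpy as np


def flow_matching_pair(actions, rng):
    """Build one generic flow matching training example (illustrative only).

    actions: array of shape (T, D), a chunk of continuous robot actions.
    Returns the interpolated sample x_t, the time t, and the velocity
    target that a network would be trained to regress.
    """
    noise = rng.standard_normal(actions.shape)  # sample from the source distribution
    t = rng.uniform(0.0, 1.0)                   # random time in [0, 1)
    x_t = (1.0 - t) * noise + t * actions       # point on the straight-line path
    target_velocity = actions - noise           # constant velocity along that path
    return x_t, t, target_velocity


def flow_matching_loss(predicted_velocity, target_velocity):
    """Mean squared error between predicted and target velocities."""
    return float(np.mean((predicted_velocity - target_velocity) ** 2))


rng = np.random.default_rng(0)
actions = rng.uniform(-1.0, 1.0, size=(16, 7))  # hypothetical action chunk
x_t, t, target_velocity = flow_matching_pair(actions, rng)

# Following the target velocity for the remaining time recovers the actions,
# with no codebook lookup anywhere in the process.
recovered = x_t + (1.0 - t) * target_velocity
perfect_loss = flow_matching_loss(target_velocity, target_velocity)
```

Because the regression target is itself a continuous vector, nothing in this objective requires mapping actions to discrete indices first.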

Findings

01  HiFlow outperforms diffusion-based policies in experiments.

02  HiFlow achieves better results than tokenization-based autoregressive policies.

03  The model trains end-to-end in a single stage.

Abstract

Coarse-to-fine autoregressive modeling has recently shown strong promise for visuomotor policy learning, combining the inference efficiency of autoregressive methods with the global trajectory coherence of diffusion-based policies. However, existing approaches rely on discrete action tokenizers that map continuous action sequences to codebook indices, a design inherited from image generation where learned compression is necessary for high-dimensional pixel data. We observe that robot actions are inherently low-dimensional continuous vectors, for which such tokenization introduces unnecessary quantization error and a multi-stage training pipeline. In this work, we propose Hierarchical Flow Policy (HiFlow), a tokenization-free coarse-to-fine autoregressive policy that operates directly on raw continuous actions. HiFlow constructs multi-scale continuous action targets from each action…
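The quantization error mentioned in the abstract can be illustrated with a toy nearest-neighbor codebook. The sketch below is a hypothetical stand-in for a learned action tokenizer, not the tokenizer used by any particular prior method: the random codebook, its size (256 entries), and the action chunk shape (16 steps, 7 dimensions) are assumptions chosen only to show that a round trip through codebook indices does not return the original continuous actions.

```python
import numpy as np


def quantize(actions, codebook):
    """Map each continuous action to its nearest codebook entry.

    actions:  array of shape (T, D)
    codebook: array of shape (K, D)
    Returns the chosen indices, shape (T,), and the reconstructed
    actions, shape (T, D).
    """
    # Pairwise Euclidean distances between actions and codebook entries: (T, K)
    dists = np.linalg.norm(actions[:, None, :] - codebook[None, :, :], axis=-1)
    indices = dists.argmin(axis=1)
    return indices, codebook[indices]


rng = np.random.default_rng(0)
actions = rng.uniform(-1.0, 1.0, size=(16, 7))    # hypothetical action chunk
codebook = rng.uniform(-1.0, 1.0, size=(256, 7))  # hypothetical random codebook

indices, reconstructed = quantize(actions, codebook)

# Mean squared error introduced purely by the discretization step.
quantization_error = float(np.mean((actions - reconstructed) ** 2))
```

A learned codebook would fit the action distribution far better than this random one, but any finite codebook leaves some residual error, and training it adds a separate stage before the policy itself can be trained.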
