AnchorVLA: Anchored Diffusion for Efficient End-to-End Mobile Manipulation

Jia Syuen Lim, Zhizhen Zhang, Peter Bohm, Brendan Tidd, Zi Huang, and Yadan Luo

arXiv:2604.01567 · cs.RO · April 3, 2026

TL;DR

AnchorVLA is a diffusion-based policy for mobile manipulation that generates multimodal actions at low inference cost, improving success and stability in dynamic tasks.

Contribution

It introduces an anchored diffusion approach with a self-correction mechanism to enable reactive, multimodal control in mobile manipulation.

Findings

01. Improves success rates in diverse mobile manipulation tasks.

02. Reduces inference time compared to full diffusion models.

03. Enhances stability under disturbances and distribution shifts.

Abstract

A central challenge in mobile manipulation is preserving multiple plausible action modes while remaining reactive during execution. A bottle in a cluttered scene can often be approached and grasped in multiple valid ways. Robust behavior depends on preserving this action diversity while remaining reactive as the scene evolves. Diffusion policies are appealing because they model multimodal action distributions rather than collapsing to one solution. But in practice, full iterative denoising is costly at control time. Action chunking helps amortize inference, yet it also creates partially open-loop behavior, allowing small mismatches to accumulate into drift. We present AnchorVLA, a diffusion-based VLA policy for mobile manipulation built on the core insight that when sampling begins near a plausible solution manifold, extensive denoising is unnecessary to recover multimodal, valid…
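The core insight in the abstract (start sampling near plausible solutions, then denoise for only a few steps) can be illustrated with a toy sketch. This is not the paper's architecture: `toy_denoiser`, `anchored_sample`, `sigma_start`, `num_steps`, and the two hand-picked modes are all hypothetical, and the learned network is replaced by a nearest-mode lookup. The update is a deterministic Euler step in a variance-exploding parameterization, chosen here only for simplicity.

```python
import numpy as np

def toy_denoiser(x, modes):
    """Hypothetical stand-in for a learned denoiser: predicts the nearest mode."""
    dists = np.linalg.norm(x[:, None, :] - modes[None, :, :], axis=-1)  # (N, M)
    return modes[dists.argmin(axis=1)]

def anchored_sample(anchors, modes, sigma_start=0.1, num_steps=2, rng=None):
    """Denoise from lightly perturbed anchors instead of from pure noise."""
    rng = np.random.default_rng(0) if rng is None else rng
    # Short noise schedule: sigma_start down to 0 in num_steps steps.
    sigmas = np.linspace(sigma_start, 0.0, num_steps + 1)
    # Start near the anchors rather than from a wide Gaussian.
    x = anchors + sigma_start * rng.standard_normal(anchors.shape)
    for s_cur, s_next in zip(sigmas[:-1], sigmas[1:]):
        x0_hat = toy_denoiser(x, modes)
        # Euler step toward the predicted clean action.
        x = x0_hat + (s_next / s_cur) * (x - x0_hat)
    return x

# Two valid ways to approach an object (assumed for illustration).
modes = np.array([[-1.0, 0.0], [1.0, 0.0]])
# Rough anchors, one near each mode.
anchors = np.array([[-0.9, 0.1], [1.1, -0.1]])
samples = anchored_sample(anchors, modes)
print(samples)
```

In this toy setting each anchor stays in the basin of its own mode, so both modes survive after two denoising steps. A real policy would replace `toy_denoiser` with a learned network conditioned on observations and language.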

Code & Models

Repositories

jason-lim26/AnchorVLA (GitHub)
