On-the-Fly VLA Adaptation via Test-Time Reinforcement Learning

Changyu Liu; Yiyang Liu; Taowen Wang; Qiao Zhuang; James Chenhao Liang; Wenhao Yang; Renjing Xu; Qifan Wang; Dongfang Liu; Cheng Han

arXiv:2601.06748 · cs.RO · April 8, 2026

TL;DR

This paper introduces TT-VLA, a test-time reinforcement learning framework that enables vision-language-action models to adapt on-the-fly during inference, improving their robustness in dynamic environments.

Contribution

The paper presents a novel test-time RL method for VLAs that allows real-time policy adaptation without retraining, enhancing deployment flexibility.

Findings

01  Improved adaptability and stability in unseen scenarios.

02  Enhanced task success rates in dynamic environments.

03  Effective in both simulated and real-world settings.

Abstract

Vision-Language-Action (VLA) models have recently emerged as a powerful paradigm for general-purpose robot learning, enabling agents to map visual observations and natural-language instructions into executable robotic actions. Though popular, they are primarily trained via supervised fine-tuning or training-time reinforcement learning, requiring explicit fine-tuning phases, human interventions, or controlled data collection. Consequently, existing methods remain unsuitable for challenging simulated- or physical-world deployments, where robots must respond autonomously and flexibly to evolving environments. To address this limitation, we introduce Test-Time Reinforcement Learning for VLAs (TT-VLA), a framework that enables on-the-fly policy adaptation during inference. TT-VLA formulates a dense reward mechanism that leverages step-by-step task-progress signals to refine action policies…
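
The abstract is truncated here and does not state the reward formula, so the following is only an illustrative sketch of what a dense, progress-based reward could look like. It assumes a per-step task-progress estimate in [0, 1] is available and takes the reward at each step to be the change in that estimate. The function names `dense_progress_rewards` and `discounted_returns`, the progress values, and the discount factor are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def dense_progress_rewards(progress: Sequence[float]) -> List[float]:
    """Assumed reward: the step-to-step change in estimated task progress.

    progress[t] is a hypothetical progress estimate after step t, in [0, 1].
    A positive reward means the last action moved the task forward.
    """
    return [progress[t] - progress[t - 1] for t in range(1, len(progress))]


def discounted_returns(rewards: Sequence[float], gamma: float = 0.99) -> List[float]:
    """Standard discounted return G_t = r_t + gamma * G_{t+1}, computed backwards."""
    returns = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


# Made-up progress estimates for a five-observation episode.
progress = [0.0, 0.25, 0.5, 0.5, 1.0]
rewards = dense_progress_rewards(progress)
returns = discounted_returns(rewards, gamma=1.0)
print(rewards)
print(returns)
```

Unlike a sparse success signal that arrives only at the end of an episode, a per-step signal of this kind gives a policy-gradient update something to learn from at every step, which is the general motivation for dense rewards in an online adaptation setting.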
