On-the-Fly VLA Adaptation via Test-Time Reinforcement Learning
Changyu Liu, Yiyang Liu, Taowen Wang, Qiao Zhuang, James Chenhao Liang, Wenhao Yang, Renjing Xu, Qifan Wang, Dongfang Liu, Cheng Han

TL;DR
This paper introduces TT-VLA, a test-time reinforcement learning framework that enables vision-language-action models to adapt on-the-fly during inference, improving their robustness in dynamic environments.
Contribution
The paper presents a novel test-time RL method for VLAs that allows real-time policy adaptation without retraining, enhancing deployment flexibility.
Findings
Improved adaptability and stability in unseen scenarios.
Enhanced task success rates in dynamic environments.
Effective in both simulated and real-world settings.
Abstract
Vision-Language-Action (VLA) models have recently emerged as a powerful paradigm for general-purpose robot learning, enabling agents to map visual observations and natural-language instructions into executable robotic actions. Though popular, they are primarily trained via supervised fine-tuning or training-time reinforcement learning, requiring explicit fine-tuning phases, human interventions, or controlled data collection. Consequently, existing methods remain unsuitable for challenging simulated- or physical-world deployments, where robots must respond autonomously and flexibly to evolving environments. To address this limitation, we introduce Test-Time Reinforcement Learning for VLAs (TT-VLA), a framework that enables on-the-fly policy adaptation during inference. TT-VLA formulates a dense reward mechanism that leverages step-by-step task-progress signals to refine action policies…
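As a rough illustration of what a dense, progress-based reward could look like, the sketch below turns a sequence of per-step task-progress estimates into per-step rewards and returns. The abstract is truncated and does not give the actual reward formula, so everything here is an assumption: the function names dense_rewards and discounted_returns, the choice of "change in progress" as the reward, and the discount factor are all hypothetical and are not taken from the paper.

```python
from typing import List


def dense_rewards(progress: List[float]) -> List[float]:
    """Assumed reward: the change in estimated task progress between steps.

    `progress[t]` is a hypothetical scalar in [0, 1] estimating how far the
    task has advanced after step t. This is NOT the paper's formula.
    """
    return [after - before for before, after in zip(progress, progress[1:])]


def discounted_returns(rewards: List[float], gamma: float = 0.99) -> List[float]:
    """Standard discounted return G_t = r_t + gamma * G_{t+1}."""
    returns: List[float] = []
    running = 0.0
    # Accumulate from the last step backwards, then restore time order.
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


# Hypothetical progress estimates for a five-observation episode.
progress = [0.0, 0.25, 0.5, 0.5, 1.0]
rewards = dense_rewards(progress)
returns = discounted_returns(rewards, gamma=1.0)
```

Under this assumption every step receives a learning signal, including a step where progress stalls (reward 0.0), rather than only a single success flag at the end of the episode. Returns of this kind are what a policy-gradient style update would typically weight actions by.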
