Towards Long-Lived Robots: Continual Learning VLA Models via Reinforcement Fine-Tuning

Yuan Liu; Haoran Li; Shuai Tian; Yuxing Qin; Yuhui Chen; Yupeng Zheng; Yongzhen Huang; Dongbin Zhao

arXiv:2602.10503·cs.RO·May 19, 2026

TL;DR

This paper introduces LifeLong-RFT, a reinforcement fine-tuning strategy for Vision-Language-Action (VLA) models that improves continual learning and task adaptation without requiring online environmental feedback or pre-trained reward models.

Contribution

The paper proposes a reinforcement fine-tuning method with a multi-dimensional process reward mechanism that improves multi-task and continual learning for VLA models.

Findings

01. Achieves a 22% success rate improvement over supervised fine-tuning on LIBERO.

02. Effectively adapts to new tasks with only 20% of the training data.

03. Demonstrates strong performance across simulated and real-world tasks.

Abstract

Pretrained on large-scale and diverse datasets, Vision-Language-Action (VLA) models demonstrate strong generalization and adaptability as general-purpose robotic policies. However, Supervised Fine-Tuning (SFT), which serves as the primary mechanism for adapting VLAs to downstream domains, requires substantial amounts of task-specific data and is prone to catastrophic forgetting. To address these limitations, we propose LifeLong-RFT, a simple yet effective Reinforcement Fine-Tuning (RFT) strategy for VLA models independent of online environmental feedback and pre-trained reward models. By integrating chunking-level on-policy reinforcement learning with the proposed multi-dimensional process reward mechanism, LifeLong-RFT quantifies the heterogeneous contributions of intermediate action chunks across three dimensions to facilitate policy optimization. Specifically, (1) the Quantized Action Consistency Reward…

Code & Models

Repositories

https://yuan-liu-lifelong-rft.github.io

Taxonomy

Topics: Reinforcement Learning in Robotics · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications