Simple Recipe Works: Vision-Language-Action Models are Natural Continual Learners with Reinforcement Learning

Jiaheng Hu; Jay Shim; Chen Tang; Yoonchang Sung; Bo Liu; Peter Stone; Roberto Martin-Martin

arXiv:2603.11653 · cs.LG · March 13, 2026

TL;DR

This paper demonstrates that simple sequential fine-tuning with low-rank adaptation (LoRA) of large pretrained vision-language-action (VLA) models is an effective approach to continual reinforcement learning, challenging the assumption that complex continual learning strategies are needed.

Contribution

The paper shows that straightforward sequential fine-tuning with LoRA is surprisingly effective for continual RL, frequently outperforming more complex methods, and it offers new insights into lifelong learning with large models.

Findings

01 Sequential fine-tuning (Seq. FT) with LoRA achieves both high plasticity and high stability.

02 The simple method often outperforms more complex continual learning strategies.

03 This robustness stems from a synergy between the pretrained model, LoRA, and on-policy RL.

Abstract

Continual Reinforcement Learning (CRL) for Vision-Language-Action (VLA) models is a promising direction toward self-improving embodied agents that can adapt in open-ended, evolving environments. However, conventional wisdom from continual learning suggests that naive Sequential Fine-Tuning (Seq. FT) leads to catastrophic forgetting, necessitating complex CRL strategies. In this work, we take a step back and conduct a systematic study of CRL for large pretrained VLAs across three models and five challenging lifelong RL benchmarks. We find that, contrary to established belief, simple Seq. FT with low-rank adaptation (LoRA) is remarkably strong: it achieves high plasticity, exhibits little to no forgetting, and retains strong zero-shot generalization, frequently outperforming more sophisticated CRL methods. Through detailed analysis, we show that this robustness arises from a synergy…
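
To make the recipe named in the abstract concrete, here is a minimal sketch of sequential fine-tuning with a LoRA adapter. It is not the authors' implementation: a toy supervised regression loss stands in for on-policy RL, a single linear layer stands in for a VLA, and the names `LoRALinear` and `sequential_finetune`, along with every hyperparameter, are hypothetical choices made for illustration. What it does show is the standard LoRA structure (a frozen weight plus a trainable low-rank update scaled by alpha / rank) and the fact that Seq. FT simply keeps training the same adapter on each new task, with no replay buffer or regularizer.

```python
import numpy as np


class LoRALinear:
    """Frozen weight W plus a trainable low-rank update (alpha / rank) * B @ A."""

    def __init__(self, weight, rank=2, alpha=2.0, seed=0):
        rng = np.random.default_rng(seed)
        out_dim, in_dim = weight.shape
        self.W = weight.copy()                               # pretrained, never updated
        self.A = rng.normal(0.0, 0.1, size=(rank, in_dim))   # trainable
        self.B = np.zeros((out_dim, rank))                   # trainable, zero init
        self.scale = alpha / rank

    def delta(self):
        # Low-rank update added to the frozen weight.
        return self.scale * self.B @ self.A

    def forward(self, x):
        # x has shape (batch, in_dim).
        return x @ (self.W + self.delta()).T

    def sgd_step(self, x, target, lr=0.01):
        # Toy mean-squared-error loss (an assumption; the paper uses RL).
        err = self.forward(x) - target
        grad_w_eff = 2.0 * err.T @ x / len(x)      # gradient w.r.t. W + delta
        grad_B = self.scale * grad_w_eff @ self.A.T
        grad_A = self.scale * self.B.T @ grad_w_eff
        self.B -= lr * grad_B                      # only the adapter changes
        self.A -= lr * grad_A


def sequential_finetune(layer, tasks, steps=200, lr=0.01):
    # Seq. FT: train on each task in turn, reusing the same adapter throughout.
    for x, y in tasks:
        for _ in range(steps):
            layer.sgd_step(x, y, lr)
    return layer


rng = np.random.default_rng(1)
W0 = rng.normal(size=(3, 5))                       # stand-in "pretrained" weight
layer = LoRALinear(W0, rank=2)
x_probe = rng.normal(size=(4, 5))
init_out = layer.forward(x_probe)                  # B is zero, so this equals x_probe @ W0.T

tasks = []
for _ in range(3):                                 # three toy tasks in sequence
    x = rng.normal(size=(16, 5))
    w_task = W0 + 0.5 * rng.normal(size=W0.shape)
    tasks.append((x, x @ w_task.T))
sequential_finetune(layer, tasks)
```

Because `B` starts at zero, the adapted layer initially reproduces the pretrained one exactly, and because only `A` and `B` are updated, the total change to the effective weight is confined to a rank-2 subspace across all tasks. That constraint is one plausible intuition for limited forgetting, offered here as an illustration rather than as the paper's analysis.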

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Reinforcement Learning in Robotics