Reasoning Portability: Guiding Continual Learning for MLLMs in the RLVR Era

Qiuhe Hong; Yuyang Liu; Shuo Yang; Tiantian Peng; Fei Zhu; Yonghong Tian

arXiv:2605.18903·cs.LG·May 20, 2026

TL;DR

This paper introduces Reasoning Portability, a measure of how well previous reasoning strategies transfer to new tasks, and proposes a method to improve continual learning in multimodal models by dynamically balancing exploration and retention based on reasoning transferability.

Contribution

The paper formalizes reasoning portability as a measure for guiding continual learning and develops RDB-CL, a method that adapts the strength of per-sample regularization to reasoning transferability.

Findings

01 RDB-CL outperforms the baselines, with a +12.0% improvement in last accuracy.

02 Reasoning-level signals are more reliable than answer-level signals on out-of-distribution samples.

03 Dynamic balancing based on reasoning portability enhances continual learning performance.

Abstract

Vision-Language Models in Continual Learning (VLM-CL) aim to continuously adapt to new multimodal tasks while retaining prior knowledge. The emerging paradigm that couples Multimodal Large Language Models (MLLMs) with Reinforcement Learning with Verifiable Rewards (RLVR) calls for a new pattern to guide continual adaptation. Advances in reasoning capability now make it feasible to impose constraints at the reasoning level. We formalize portability, a sample-level measure of how reusable the previous policy's behavior is on a new task, and empirically show that reasoning-level signals remain reliable on out-of-distribution samples while answer-level signals do not. We instantiate this as Reasoning Portability (RP) and propose Reasoning-based Dynamic Balance Continual Learning (RDB-CL), which modulates the per-sample Kullback-Leibler regularization in RLVR according to RP: a tight anchor…
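The idea of modulating a per-sample KL penalty by a portability score can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the linear mapping, and the default coefficient range are hypothetical, and the direction of the mapping (higher RP gives a tighter anchor to the previous policy) is an assumption, since the abstract is truncated before it states the rule.

```python
def kl_weight(rp, beta_min=0.01, beta_max=0.1):
    """Map a portability score to a KL coefficient.

    Hypothetical linear rule: rp is clamped to [0, 1], and a higher score
    yields a larger coefficient (assumed direction, not from the paper).
    """
    rp = min(max(rp, 0.0), 1.0)
    return beta_min + (beta_max - beta_min) * rp


def regularized_objective(rewards, kls, rps, beta_min=0.01, beta_max=0.1):
    """Batch mean of reward_i - beta_i * KL_i, with beta_i set per sample."""
    terms = [
        reward - kl_weight(rp, beta_min, beta_max) * kl
        for reward, kl, rp in zip(rewards, kls, rps)
    ]
    return sum(terms) / len(terms)
```

In this sketch, a sample with a score near 1 is held close to the previous policy, while a sample with a score near 0 is penalized only at the `beta_min` level and is therefore freer to explore.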
