Reinforced Curriculum Pre-Alignment for Domain-Adaptive VLMs
Yuming Yan, Shuo Yang, Kai Tang, Sihong Chen, Yang Zhang, Ke Xu, Dan Hu, Qun Yu, Pengfei Hu, Edith C.H. Ngai

TL;DR
This paper introduces RCPA, a curriculum-based post-training method for adapting vision-language models to specialized domains while maintaining their general capabilities, addressing catastrophic forgetting and optimization challenges.
Contribution
The paper proposes Reinforced Curriculum Pre-Alignment (RCPA), a staged adaptation approach that exposes models to a new domain gradually while preserving their general abilities.
Findings
RCPA outperforms existing methods in domain-specific tasks.
It effectively balances domain adaptation with general capability retention.
Experimental results validate RCPA's robustness across multiple benchmarks.
Abstract
Vision-Language Models (VLMs) demonstrate remarkable general-purpose capabilities but often fall short in specialized domains such as medical imaging or geometric problem-solving. Supervised Fine-Tuning (SFT) can enhance performance within a target domain, but it typically causes catastrophic forgetting, limiting its generalization. The central challenge, therefore, is to adapt VLMs to new domains while preserving their general-purpose capabilities. Continual pretraining is effective for expanding knowledge in Large Language Models (LLMs), but it is less feasible for VLMs due to prohibitive computational costs and the unavailability of pretraining data for most open-source models. This necessitates efficient post-training adaptation methods. Reinforcement learning (RL)-based approaches such as Group Relative Policy Optimization (GRPO) have shown promise in preserving general abilities,…
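For readers unfamiliar with GRPO, which the abstract names, the following is a minimal sketch of its group-relative advantage step as commonly described: several responses are sampled for one prompt, and each response's reward is normalized against the mean and standard deviation of its own group. This is not RCPA itself, whose details are not given in the excerpt above. The function name, the `eps` constant, the use of the population standard deviation, and the sample rewards are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its own sampling group.

    Illustrative helper (hypothetical name): rewards is a list of scalar
    rewards for responses sampled from the same prompt.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; implementations vary on this choice
    # eps (assumed value) guards against division by zero when all rewards match
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical binary rewards for four responses to a single prompt
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
```

Under this formula, responses that score above their group's mean receive positive advantages and those below receive negative ones. If every response in a group earns the same reward, all advantages are zero and that group contributes no learning signal.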
Taxonomy
Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Topic Modeling
