PriorVLA: Prior-Preserving Adaptation for Vision-Language-Action Models

Xinyu Guo; Bin Xie; Wei Chai; Xianchi Deng; Tiancai Wang; Zhengxing Wu; Xingyu Chen

arXiv:2605.10925 · cs.RO · May 12, 2026

TL;DR

PriorVLA preserves pretrained priors while adapting vision-language-action models to robot tasks, and it outperforms full fine-tuning, especially in out-of-distribution and few-shot scenarios.

Contribution

The paper proposes a framework that keeps a frozen Prior Expert as a read-only prior source and trains an Adaptation Expert for downstream specialization, updating only 25% of the parameters that full fine-tuning updates.

Findings

01 PriorVLA outperforms full fine-tuning and state-of-the-art baselines.

02 PriorVLA achieves a 99.1% success rate on the LIBERO benchmark.

03 PriorVLA shows significant gains in out-of-distribution and few-shot settings.

Abstract

Large-scale pretraining has made Vision-Language-Action (VLA) models promising foundations for generalist robot manipulation, yet adapting them to downstream tasks remains necessary. However, the common practice of full fine-tuning treats pretraining as initialization and can shift broad priors toward narrow training-distribution patterns. We propose PriorVLA, a novel framework that preserves pretrained priors and learns to leverage them for effective adaptation. PriorVLA keeps a frozen Prior Expert as a read-only prior source and trains an Adaptation Expert for downstream specialization. Expert Queries capture scene priors from the pretrained VLM and motor priors from the Prior Expert, integrating both into the Adaptation Expert to guide adaptation. Together, PriorVLA updates only 25% of the parameters updated by full fine-tuning. Across RoboTwin 2.0, LIBERO, and real-world tasks,…
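The "25% of the parameters updated by full fine-tuning" figure is a ratio of trainable parameter counts, and a small sketch can make the bookkeeping concrete. Everything in the block below is hypothetical: the group names, the parameter counts (in millions), and the choice to freeze the VLM backbone are assumptions made only so the arithmetic works out, not values or design details taken from the paper. The abstract confirms only that the Prior Expert is frozen and that an Adaptation Expert and Expert Queries are involved in adaptation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParamGroup:
    name: str
    num_params: int  # made-up count, in millions (not from the paper)
    trainable: bool


def count_trainable(groups):
    """Sum the parameters of every group marked trainable."""
    return sum(g.num_params for g in groups if g.trainable)


# Hypothetical full fine-tuning baseline: every group is updated.
full_finetune = [
    ParamGroup("vlm_backbone", 1400, True),
    ParamGroup("action_expert", 400, True),
]

# Hypothetical prior-preserving setup.
prior_preserving = [
    ParamGroup("vlm_backbone", 1400, False),      # assumption: kept frozen
    ParamGroup("prior_expert", 400, False),       # frozen, read-only (per the abstract)
    ParamGroup("adaptation_expert", 400, True),   # trained for the downstream task
    ParamGroup("expert_queries", 50, True),       # assumption: trained
]

# Ratio of parameters updated here to parameters updated by full fine-tuning.
ratio = count_trainable(prior_preserving) / count_trainable(full_finetune)
print(f"trainable ratio: {ratio:.2f}")  # prints "trainable ratio: 0.25"
```

With these illustrative counts, 450 million parameters are updated against 1800 million under full fine-tuning, which gives the 0.25 ratio. The real split across components would have to be read from the paper itself.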
