RLRC: Reinforcement Learning-based Recovery for Compressed Vision-Language-Action Models

Yuxuan Chen; Xiao Li

arXiv:2506.17639 · cs.RO · June 24, 2025

TL;DR

This paper introduces RLRC, a three-stage recovery method for compressed vision-language-action (VLA) models. It reduces memory usage by up to 8x and improves inference throughput by 2.3x while maintaining or improving task success rates, which supports deployment on resource-constrained robotic platforms.

Contribution

The paper presents RLRC, a three-stage recovery approach for compressed VLAs that combines structured pruning, performance recovery through SFT and RL, and quantization. The design builds on an empirical study of model compression techniques applied to VLAs.

Findings

01 RLRC achieves up to 8x memory reduction.

02 RLRC improves inference throughput by 2.3x.

03 RLRC outperforms existing compression methods.

Abstract

Vision-Language-Action (VLA) models have demonstrated remarkable capabilities and promising potential in solving complex robotic manipulation tasks. However, their substantial parameter sizes and high inference latency pose significant challenges for real-world deployment, particularly on resource-constrained robotic platforms. To address this issue, we begin by conducting an extensive empirical study to explore the effectiveness of model compression techniques when applied to VLAs. Building on the insights gained from these preliminary experiments, we propose RLRC, a three-stage recovery method for compressed VLAs, including structured pruning, performance recovery based on SFT and RL, and further quantization. RLRC achieves up to an 8x reduction in memory usage and a 2.3x improvement in inference throughput, while maintaining or even surpassing the original VLA's task success rate.…
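As a rough illustration of why pruning and quantization compound, the sketch below estimates weight memory from parameter count and bit width. The function names, the keep fraction, and the bit widths are assumptions chosen for illustration; they are not the configuration reported in the paper, and the estimate ignores activations, caches, and quantization metadata.

```python
def weight_memory_bytes(num_params: int, bits_per_weight: int) -> float:
    """Approximate memory needed to store the weights alone."""
    return num_params * bits_per_weight / 8


def compression_factor(keep_fraction: float, orig_bits: int, quant_bits: int) -> float:
    """Combined reduction in weight memory from pruning plus quantization.

    keep_fraction: share of parameters remaining after structured pruning
                   (hypothetical value, not taken from the paper).
    orig_bits / quant_bits: bit width before and after quantization.
    """
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")
    if orig_bits <= 0 or quant_bits <= 0:
        raise ValueError("bit widths must be positive")
    # Pruning shrinks the parameter count; quantization shrinks bytes per parameter.
    return (orig_bits / quant_bits) / keep_fraction


if __name__ == "__main__":
    # Illustrative only: keep half the parameters and go from 16-bit to 8-bit weights.
    print(compression_factor(keep_fraction=0.5, orig_bits=16, quant_bits=8))
```

The two factors multiply because they act on different terms of the same product: parameter count and bytes per parameter.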

Taxonomy

Methods: Shrink and Fine-Tune