Jump-Start Reinforcement Learning with Vision-Language-Action Regularization
Angelo Moroncelli, Roberto Zanetti, Marco Maccarini, Loris Roveda

TL;DR
This paper introduces VLAJS, a method that combines vision-language-action guidance with reinforcement learning to enhance exploration and efficiency in robotic manipulation tasks, achieving significant sample savings and robust real-world performance.
Contribution
VLAJS is a novel approach that integrates sparse VLA guidance with on-policy RL, improving exploration and credit assignment without requiring demonstrations or continuous queries.
Findings
VLAJS is more sample-efficient than PPO and the other baselines, reducing the number of environment interactions by over 50%.
The method enables zero-shot sim-to-real transfer in robotic tasks.
VLAJS demonstrates robust real-world manipulation under various conditions.
Abstract
Reinforcement learning (RL) enables high-frequency, closed-loop control for robotic manipulation, but scaling to long-horizon tasks with sparse or imperfect rewards remains difficult due to inefficient exploration and poor credit assignment. Vision-Language-Action (VLA) models leverage large-scale multimodal pretraining to provide generalist, task-level reasoning, but current limitations hinder their direct use in fast and precise manipulation. In this paper, we propose Vision-Language-Action Jump-Starting (VLAJS), a method that bridges sparse VLA guidance with on-policy RL to improve exploration and learning efficiency. VLAJS treats VLAs as transient sources of high-level action suggestions that bias early exploration and improve credit assignment, while preserving the high-frequency, state-based control of RL. Our approach augments Proximal Policy Optimization (PPO) with a directional…
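The idea of sparse, transient guidance layered on top of an on-policy loss can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the linear annealing schedule, the fixed query interval, and the squared-error penalty are all hypothetical stand-ins. In particular, the abstract is cut off before it describes the actual term added to PPO, so that term is not reproduced here.

```python
from typing import Optional, Sequence


def should_query(t: int, query_every: int) -> bool:
    """Sparse querying: ask the guide only on every `query_every`-th control step.

    The fixed interval is an assumption made for illustration.
    """
    return t % query_every == 0


def guidance_weight(step: int, anneal_steps: int, w0: float = 1.0) -> float:
    """Linearly decay the guidance weight from `w0` to 0 over `anneal_steps`.

    The linear schedule is an assumption; it only illustrates that the
    guidance is transient and eventually disappears.
    """
    if anneal_steps <= 0:
        return 0.0
    return w0 * max(0.0, 1.0 - step / anneal_steps)


def guidance_penalty(
    policy_action: Sequence[float],
    suggestion: Optional[Sequence[float]],
) -> float:
    """Mean squared distance between the policy action and a suggested action.

    Returns 0 when no suggestion is available for this step. The squared-error
    form is a generic stand-in, not the term used in the paper.
    """
    if suggestion is None:
        return 0.0
    diffs = [(a - s) ** 2 for a, s in zip(policy_action, suggestion)]
    return sum(diffs) / len(policy_action)


def regularized_loss(
    ppo_loss: float,
    policy_action: Sequence[float],
    suggestion: Optional[Sequence[float]],
    step: int,
    anneal_steps: int,
) -> float:
    """Add the annealed guidance penalty to an already computed PPO loss."""
    w = guidance_weight(step, anneal_steps)
    return ppo_loss + w * guidance_penalty(policy_action, suggestion)
```

In this sketch, once `step` reaches `anneal_steps` the weight is zero and the objective reduces to the plain PPO loss, which mirrors the abstract's description of the VLA as a transient source of suggestions that biases early exploration. Steps where `should_query` is false would pass `suggestion=None`, so the guide is consulted only occasionally rather than at every control step.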
