Salience-Invariant Consistent Policy Learning for Generalization in Visual Reinforcement Learning
Jingbo Sun, Songjun Tu, Qichao Zhang, Ke Chen, Dongbin Zhao

TL;DR
This paper introduces SCPL, a framework that improves generalization in visual reinforcement learning by focusing the agent on task-relevant features and keeping its policy consistent across original and perturbed observations, with reported gains of up to 69% in challenging environments.
Contribution
The paper proposes the Salience-Invariant Consistent Policy Learning (SCPL) algorithm, integrating value and dynamics modules with a policy consistency constraint to improve zero-shot generalization in visual RL.
Findings
SCPL outperforms state-of-the-art methods on multiple benchmarks.
Achieves up to 69% performance improvement in challenging environments.
Demonstrates the effectiveness of salience-guided and policy consistency modules.
Abstract
Generalizing policies to unseen scenarios remains a critical challenge in visual reinforcement learning, where agents often overfit to the specific visual observations of the training environment. In unseen environments, distracting pixels may lead agents to extract representations containing task-irrelevant information. As a result, agents may deviate from the optimal behaviors learned during training, thereby hindering visual generalization. To address this issue, we propose the Salience-Invariant Consistent Policy Learning (SCPL) algorithm, an efficient framework for zero-shot generalization. Our approach introduces a novel value consistency module alongside a dynamics module to effectively capture task-relevant representations. The value consistency module, guided by saliency, ensures the agent focuses on task-relevant pixels in both original and perturbed observations, while the…
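The policy consistency idea mentioned above can be illustrated with a small sketch. The summary does not state the exact form of the constraint, so this example assumes a diagonal Gaussian policy and uses a KL divergence between the action distributions produced for an original observation and for its perturbed counterpart. The function names `gaussian_kl` and `policy_consistency_loss` are hypothetical and are not taken from the paper.

```python
import math

def gaussian_kl(mu_p, std_p, mu_q, std_q):
    """KL(p || q) for two diagonal Gaussians, summed over action dimensions."""
    total = 0.0
    for mp, sp, mq, sq in zip(mu_p, std_p, mu_q, std_q):
        total += math.log(sq / sp) + (sp ** 2 + (mp - mq) ** 2) / (2.0 * sq ** 2) - 0.5
    return total

def policy_consistency_loss(batch_orig, batch_pert):
    """Mean KL over a batch of (mean, std) pairs.

    batch_orig: policy outputs on original observations (assumed target).
    batch_pert: policy outputs on perturbed observations.
    This is an illustrative assumption, not the paper's stated loss.
    """
    kls = [
        gaussian_kl(mu_o, std_o, mu_p, std_p)
        for (mu_o, std_o), (mu_p, std_p) in zip(batch_orig, batch_pert)
    ]
    return sum(kls) / len(kls)

# Example: a 2-D action space with one sample per batch.
orig = [([0.0, 0.0], [1.0, 1.0])]
same = [([0.0, 0.0], [1.0, 1.0])]
shifted = [([1.0, 1.0], [1.0, 1.0])]

loss_same = policy_consistency_loss(orig, same)        # identical policies
loss_shifted = policy_consistency_loss(orig, shifted)  # means differ by 1 per dim
```

Under this assumption the loss is zero when the policy responds identically to both observations and grows as the perturbed observation shifts the action distribution. In training, such a term would be added to the usual RL objective with a weighting coefficient.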
Taxonomy
Topics: Reinforcement Learning in Robotics · Visual Attention and Saliency Detection
Methods: Entropy Regularization · Proximal Policy Optimization · CARLA: An Open Urban Driving Simulator
