Salience-Invariant Consistent Policy Learning for Generalization in Visual Reinforcement Learning

Jingbo Sun; Songjun Tu; Qichao Zhang; Ke Chen; Dongbin Zhao

arXiv:2502.08336 · cs.AI · February 25, 2025

Open Access

TL;DR

This paper introduces SCPL, a framework that improves generalization in visual reinforcement learning by focusing the agent on task-relevant features and keeping its policy consistent across original and perturbed observations, with reported gains of up to 69% in challenging environments.

Contribution

The paper proposes the Salience-Invariant Consistent Policy Learning (SCPL) algorithm, integrating value and dynamics modules with a policy consistency constraint to improve zero-shot generalization in visual RL.
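To make the idea of a policy consistency constraint concrete, here is a minimal sketch of one common way to penalize disagreement between the policy's output on an original observation and on a perturbed copy of it. It is an illustration under stated assumptions, not the paper's implementation: the choice of KL divergence, the discrete action space, and the function names (`softmax`, `kl_divergence`, `policy_consistency_penalty`) are all hypothetical and do not come from the source.

```python
import math


def softmax(logits):
    """Convert raw action logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def policy_consistency_penalty(logits_original, logits_perturbed):
    """Assumed form of a consistency term: KL between the action
    distribution on the original observation and the one on the
    perturbed observation. Zero when the two distributions agree."""
    p = softmax(logits_original)
    q = softmax(logits_perturbed)
    return kl_divergence(p, q)


# Hypothetical logits a policy network might emit for the same state,
# once from the clean frame and once from an augmented frame.
same = policy_consistency_penalty([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
different = policy_consistency_penalty([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
```

In a sketch like this, the penalty would be added to the usual RL objective with a weighting coefficient, so that the agent is discouraged from changing its action preferences when only task-irrelevant pixels change.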

Findings

01. SCPL outperforms state-of-the-art methods on multiple benchmarks.

02. Achieves up to 69% performance improvement in challenging environments.

03. Demonstrates the effectiveness of salience-guided and policy consistency modules.

Abstract

Generalizing policies to unseen scenarios remains a critical challenge in visual reinforcement learning, where agents often overfit to the specific visual observations of the training environment. In unseen environments, distracting pixels may lead agents to extract representations containing task-irrelevant information. As a result, agents may deviate from the optimal behaviors learned during training, thereby hindering visual generalization. To address this issue, we propose the Salience-Invariant Consistent Policy Learning (SCPL) algorithm, an efficient framework for zero-shot generalization. Our approach introduces a novel value consistency module alongside a dynamics module to effectively capture task-relevant representations. The value consistency module, guided by saliency, ensures the agent focuses on task-relevant pixels in both original and perturbed observations, while the…

Taxonomy

Topics: Reinforcement Learning in Robotics · Visual Attention and Saliency Detection

Methods: Entropy Regularization · Proximal Policy Optimization · CARLA: An Open Urban Driving Simulator