Local Feature Swapping for Generalization in Reinforcement Learning
David Bertoin (IMT), Emmanuel Rachelson (DMIA)

TL;DR
This paper introduces CLOP, a regularization method based on channel-consistent local permutations of feature maps. It improves the generalization of reinforcement learning agents and their robustness to visual changes, and outperforms other regularization methods on the Procgen Benchmark.
Contribution
The paper proposes CLOP, a novel regularization technique using channel-consistent local permutations to improve generalization in RL and supervised learning.
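To make "channel-consistent local permutation" concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. The function name `clop_sketch`, the swap probability `alpha`, the 4-neighbour choice, and the rule of skipping swaps that would leave the map are all assumptions made for illustration. The summary above only states that the permutations are local and shared across channels.

```python
import random


def clop_sketch(fmap, alpha, rng):
    """Return a locally permuted copy of a feature map.

    fmap  : list of C channels, each a list of H rows of W values.
    alpha : assumed probability that a position triggers a swap.
    rng   : a random.Random instance, passed in for reproducibility.
    """
    channels = len(fmap)
    height = len(fmap[0])
    width = len(fmap[0][0])
    # Work on a copy so the input feature map is left untouched.
    out = [[row[:] for row in channel] for channel in fmap]
    for i in range(height):
        for j in range(width):
            if rng.random() >= alpha:
                continue  # no swap at this position
            # Pick one of the four direct neighbours (an assumed choice).
            di, dj = rng.choice([(-1, 0), (1, 0), (0, -1), (0, 1)])
            ni, nj = i + di, j + dj
            if not (0 <= ni < height and 0 <= nj < width):
                continue  # neighbour lies outside the map: skip the swap
            # Channel consistency: the same two positions are exchanged
            # in every channel, so each feature vector moves as a whole.
            for c in range(channels):
                out[c][i][j], out[c][ni][nj] = out[c][ni][nj], out[c][i][j]
    return out


# Usage: a 3-channel 4x4 map whose values encode (channel, row, column).
rng = random.Random(0)
fmap = [[[c * 100 + i * 4 + j for j in range(4)] for i in range(4)]
        for c in range(3)]
permuted = clop_sketch(fmap, alpha=0.5, rng=rng)
```

Because every swap is applied to all channels at once, the vector of channel values at a given location stays intact and only its spatial position changes. With `alpha=0` the sketch returns an unchanged copy, which is how such a layer would typically be disabled at evaluation time (again an assumption, not something stated in this summary).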
Findings
CLOP improves robustness to visual variations in RL agents.
CLOP outperforms other regularization methods on the Procgen Benchmark.
CLOP is effective in supervised learning scenarios.
Abstract
Over the past few years, the acceleration of computing resources and research in deep learning has led to significant practical successes in a range of tasks, including in particular in computer vision. Building on these advances, reinforcement learning has also seen a leap forward with the emergence of agents capable of making decisions directly from visual observations. Despite these successes, the over-parametrization of neural architectures leads to memorization of the data used during training and thus to a lack of generalization. Reinforcement learning agents based on visual inputs also suffer from this phenomenon by erroneously correlating rewards with unrelated visual features such as background elements. To alleviate this problem, we introduce a new regularization technique consisting of channel-consistent local permutations (CLOP) of the feature maps. The proposed permutations…
Taxonomy
Topics: Reinforcement Learning in Robotics · Domain Adaptation and Few-Shot Learning · Stochastic Gradient Optimization Techniques
