Multimodal Information Bottleneck for Deep Reinforcement Learning with Multiple Sensors
Bang You, Huaping Liu

TL;DR
This paper introduces a multimodal information bottleneck approach for deep reinforcement learning that effectively fuses visual and proprioceptive data, improving sample efficiency and robustness in robotic locomotion tasks.
Contribution
It proposes a novel multimodal information bottleneck model that filters task-irrelevant information, enhancing joint representations from multiple sensory modalities in reinforcement learning.
Findings
Outperforms baseline methods in sample efficiency.
Demonstrates robustness to unseen noise.
Shows that multimodal inputs outperform single-modality inputs.
Abstract
Reinforcement learning has achieved promising results on robotic control tasks but struggles to leverage information effectively from multiple sensory modalities that differ in many characteristics. Recent works construct auxiliary losses based on reconstruction or mutual information to extract joint representations from multiple sensory inputs to improve the sample efficiency and performance of reinforcement learning algorithms. However, the representations learned by these methods could capture information irrelevant to learning a policy and may degrade the performance. We argue that compressing information in the learned joint representations about raw multimodal observations is helpful, and propose a multimodal information bottleneck model to learn task-relevant joint representations from egocentric images and proprioception. Our model compresses and retains the predictive…
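The compression idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the excerpt does not give the actual objective, so the sketch assumes one common variational form of an information bottleneck, in which a diagonal Gaussian latent is penalized by its KL divergence from a standard normal prior. The function names (`fuse_features`, `kl_to_standard_normal`, `bottleneck_loss`), the concatenation-based fusion, and the weight `beta` are all hypothetical choices made for illustration.

```python
import math


def fuse_features(image_feat, proprio_feat):
    """Hypothetical fusion step: concatenate per-modality feature vectors."""
    return list(image_feat) + list(proprio_feat)


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dims.

    In a variational bottleneck this term upper-bounds how much information
    the latent keeps about the raw observation (an assumption of this sketch,
    not a statement about the paper's exact bound).
    """
    return 0.5 * sum(
        math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var)
    )


def bottleneck_loss(task_loss, mu, log_var, beta=1e-3):
    """Task (predictive) loss plus a beta-weighted compression penalty."""
    return task_loss + beta * kl_to_standard_normal(mu, log_var)


# Usage: toy image and proprioception features, and a toy Gaussian latent.
joint = fuse_features([0.2, 0.5, 0.1], [0.9, -0.3])
loss = bottleneck_loss(task_loss=2.0, mu=[1.0, 1.0], log_var=[0.0, 0.0], beta=0.5)
```

A larger `beta` pushes the latent toward the prior and discards more of the observation, while a smaller `beta` favors the predictive term. How the paper balances the two is not stated in this excerpt.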
