SEMI: Self-supervised Exploration via Multisensory Incongruity
Jianren Wang, Ziwen Zhuang, Hang Zhao

TL;DR
SEMI introduces a self-supervised exploration method in reinforcement learning that uses multisensory incongruity as an intrinsic reward, improving exploration efficiency without external rewards.
Contribution
The paper proposes a novel intrinsic reward based on multisensory incongruity, combining perception and action incongruity for enhanced exploration in RL.
Findings
SEMI improves sample efficiency in various benchmarks.
It effectively combines perception and action incongruity signals.
The method enhances exploration in environments with sparse rewards.
Abstract
Efficient exploration is a long-standing problem in reinforcement learning since extrinsic rewards are usually sparse or missing. A popular solution to this issue is to feed an agent with novelty signals as intrinsic rewards. In this work, we introduce SEMI, a self-supervised exploration policy that incentivizes the agent to maximize a new novelty signal, multisensory incongruity, which can be measured in two aspects: perception incongruity and action incongruity. The former represents the misalignment of the multisensory inputs, while the latter represents the variance of an agent's policies under different sensory inputs. Specifically, an alignment predictor is learned to detect whether multiple sensory inputs are aligned, and its error is used to measure perception incongruity. A policy model takes different combinations of the multisensory observations as input and outputs…
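The two signals described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the alignment predictor outputs a probability that a pair of sensory inputs is aligned, that its error is measured as binary cross-entropy, that action incongruity is the mean per-action variance across policy outputs, and that the two terms are combined by a weighted sum. The function names and the weights `w_perception` and `w_action` are hypothetical.

```python
import math


def perception_incongruity(p_aligned: float, eps: float = 1e-8) -> float:
    """Error of a (hypothetical) alignment predictor on a truly aligned pair.

    Assumption: the error is binary cross-entropy with label 1 (aligned).
    """
    p = min(max(p_aligned, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -math.log(p)


def action_incongruity(action_dists: list[list[float]]) -> float:
    """Variance of policy outputs under different sensory-input combinations.

    Each inner list is one action distribution (e.g. vision only, audio only,
    both). Assumption: variance is taken per action and then averaged.
    """
    n = len(action_dists)
    k = len(action_dists[0])
    total = 0.0
    for j in range(k):
        column = [dist[j] for dist in action_dists]
        mean = sum(column) / n
        total += sum((x - mean) ** 2 for x in column) / n
    return total / k


def intrinsic_reward(p_aligned: float,
                     action_dists: list[list[float]],
                     w_perception: float = 1.0,
                     w_action: float = 1.0) -> float:
    """Weighted sum of the two signals (the combination rule is an assumption)."""
    return (w_perception * perception_incongruity(p_aligned)
            + w_action * action_incongruity(action_dists))


if __name__ == "__main__":
    # Predictor is unsure the inputs are aligned, and the policy disagrees
    # with itself depending on which modality it sees.
    print(intrinsic_reward(0.5, [[1.0, 0.0], [0.0, 1.0]]))
```

Under these assumptions the reward is largest when the predictor is wrong about alignment and the policy's outputs differ most across input combinations. When every input combination yields the same action distribution, the action term is zero.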
Taxonomy
Topics: Multisensory perception and integration
Methods: Dropout
