Nuclear Norm Maximization Based Curiosity-Driven Learning

Chao Chen; Zijian Gao; Kele Xu; Sen Yang; Yiying Li; Bo Ding; Dawei Feng; Huaimin Wang

arXiv:2205.10484·cs.LG·May 31, 2022·1 cites

TL;DR

This paper introduces a nuclear norm maximization approach for curiosity-driven reinforcement learning, effectively quantifying exploration novelty with high noise tolerance, leading to state-of-the-art results on benchmark environments.

Contribution

The paper proposes a novel curiosity method based on nuclear norm maximization that improves noise robustness and exploration accuracy in reinforcement learning.

Findings

01. Achieves a human-normalized score of 1.09 on 26 Atari games with intrinsic reward.

02. Outperforms previous curiosity methods in benchmark tests.

03. Demonstrates robustness to environment stochasticity.

Abstract

To handle the sparsity of extrinsic rewards in reinforcement learning, researchers have proposed intrinsic rewards, which enable the agent to learn skills that might come in handy for pursuing rewards in the future, for example by encouraging the agent to visit novel states. However, the intrinsic reward can be noisy because of the environment's stochasticity, and directly using noisy value predictions to supervise the policy is detrimental to learning performance and efficiency. Moreover, many previous studies employ the $\ell^{2}$ norm or variance to measure exploration novelty, which amplifies the noise because of the square operation. In this paper, we address the aforementioned challenges by proposing a novel curiosity leveraging nuclear norm maximization (NNM), which can quantify the novelty of exploring the environment more accurately while providing…
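The abstract's contrast between a squared norm and the nuclear norm can be illustrated with a small numerical sketch. This is not the paper's reward formulation, which is not shown on this page. It only computes the nuclear norm (the sum of a matrix's singular values) and compares how it and a squared $\ell^{2}$ measure react to a single noisy entry. The matrix shape, the random features, and the spike size are all hypothetical choices made for illustration.

```python
import numpy as np


def nuclear_norm(m: np.ndarray) -> float:
    """Nuclear norm: the sum of the singular values of m."""
    return float(np.linalg.svd(m, compute_uv=False).sum())


def frobenius_norm(m: np.ndarray) -> float:
    """Frobenius norm: the l2 norm of all entries of m."""
    return float(np.sqrt(np.sum(m ** 2)))


rng = np.random.default_rng(0)

# Hypothetical batch of 8 feature vectors of dimension 4 (assumed shapes).
repeated = np.tile(rng.normal(size=4), (8, 1))  # identical rows, rank 1
diverse = rng.normal(size=(8, 4))               # rows differ from each other

# Inject one large noisy entry (assumed magnitude) into the diverse batch.
spike_size = 10.0
spike = np.zeros_like(diverse)
spike[0, 0] = spike_size
noisy = diverse + spike

# By the triangle inequality the nuclear norm moves by at most |spike_size|,
# while the squared l2 measure moves by spike_size**2 + 2*spike_size*diverse[0, 0].
nuc_change = abs(nuclear_norm(noisy) - nuclear_norm(diverse))
sq_change = abs(frobenius_norm(noisy) ** 2 - frobenius_norm(diverse) ** 2)

print("rank-1 batch: nuclear =", nuclear_norm(repeated),
      "frobenius =", frobenius_norm(repeated))
print("diverse batch: nuclear =", nuclear_norm(diverse),
      "frobenius =", frobenius_norm(diverse))
print("change from spike: nuclear =", nuc_change, "squared l2 =", sq_change)
```

For a rank-1 batch the two norms coincide, and in general the nuclear norm lies between the Frobenius norm and the Frobenius norm times the square root of the rank, so it grows with how many independent directions the batch spans. A single outlier shifts it at most linearly, whereas the squared measure shifts quadratically.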

Taxonomy

Topics: Reinforcement Learning in Robotics