Uncertainty-aware Low-Rank Q-Matrix Estimation for Deep Reinforcement Learning

Tong Sang; Hongyao Tang; Jianye Hao; Yan Zheng; Zhaopeng Meng

arXiv:2111.10103 · cs.LG · November 22, 2021

TL;DR

This paper introduces UA-LQE, a novel method that leverages low-rank matrix reconstruction and uncertainty quantification to improve value estimation in deep reinforcement learning, especially in continuous control tasks.

Contribution

The paper reveals the low-rank structure of Q-matrices during learning and proposes a new uncertainty-aware low-rank estimation framework to enhance value function approximation.

Findings

1. The rank of the Q-matrix decreases during learning across a range of algorithms and continuous control tasks.
2. The rank of the value matrix correlates positively with value estimation uncertainty.
3. UA-LQE improves learning efficiency on MuJoCo continuous control benchmarks.
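The first finding rests on measuring the rank of a Q-matrix whose rows are states and whose columns are actions. The sketch below shows one common way to do that: an approximate rank based on singular values. It is an illustration only. The function name `approximate_rank` and the 99% energy threshold are assumptions made here, not the rank measure defined in the paper.

```python
import numpy as np


def approximate_rank(q_matrix, energy=0.99):
    """Smallest k whose top-k singular values hold `energy` of their total sum.

    Hypothetical helper: the threshold-based definition is an assumption,
    not necessarily the measure used by the authors.
    """
    singular_values = np.linalg.svd(q_matrix, compute_uv=False)
    total = singular_values.sum()
    if total == 0.0:
        return 0  # an all-zero matrix has rank 0
    cumulative = np.cumsum(singular_values) / total
    # First index where the cumulative share reaches `energy`, as a count.
    return int(np.searchsorted(cumulative, energy) + 1)


# Toy comparison: a structured (rank-2) matrix versus unstructured noise.
rng = np.random.default_rng(0)
structured = rng.standard_normal((50, 2)) @ rng.standard_normal((2, 20))
unstructured = rng.standard_normal((50, 20))
print(approximate_rank(structured), approximate_rank(unstructured))
```

Tracking this quantity over training checkpoints is one way to observe whether a learned Q-matrix moves toward a lower-rank structure.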

Abstract

Value estimation is a key problem in Reinforcement Learning. Although Deep Reinforcement Learning (DRL) has achieved many successes in different fields, the underlying structure and learning dynamics of the value function, especially with complex function approximation, are not fully understood. In this paper, we report that the rank of the $Q$-matrix decreases during the learning process across a series of continuous control tasks and for different popular algorithms. We hypothesize that this low-rank phenomenon reflects common learning dynamics in which the $Q$-matrix moves from a stochastic high-dimensional space to a smooth low-dimensional space. Moreover, we reveal a positive correlation between value matrix rank and value estimation uncertainty. Inspired by the above evidence, we propose a novel Uncertainty-Aware Low-rank Q-matrix Estimation (UA-LQE) algorithm as a general framework to facilitate…
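To make the combination of uncertainty quantification and low-rank reconstruction concrete, here is a minimal sketch. It assumes that a per-entry uncertainty score is already available, erases the most uncertain entries of a Q-matrix, and refills them by iterated truncated SVD (a standard hard-impute style matrix completion). The function name, the `rank`, `erase_fraction`, and `n_iters` parameters, and the choice of completion method are all assumptions for illustration. This is not the authors' implementation.

```python
import numpy as np


def reconstruct_uncertain_entries(q_matrix, uncertainty, rank=2,
                                  erase_fraction=0.2, n_iters=100):
    """Re-estimate the most uncertain entries from a low-rank fit.

    Hypothetical sketch: entries whose uncertainty lies in the top
    `erase_fraction` are treated as missing and filled by iterated
    truncated SVD. All other entries are left unchanged.
    """
    threshold = np.quantile(uncertainty, 1.0 - erase_fraction)
    erased = uncertainty > threshold
    # Start the erased entries at the mean of the trusted entries.
    filled = np.where(erased, q_matrix[~erased].mean(), q_matrix)
    for _ in range(n_iters):
        u, s, vt = np.linalg.svd(filled, full_matrices=False)
        low_rank = (u[:, :rank] * s[:rank]) @ vt[:rank]
        # Keep trusted entries, overwrite only the erased ones.
        filled = np.where(erased, low_rank, q_matrix)
    return filled, erased


# Toy usage: a rank-2 "true" Q-matrix with noisy, high-uncertainty entries.
rng = np.random.default_rng(0)
q_true = rng.standard_normal((30, 2)) @ rng.standard_normal((2, 20))
uncertainty = rng.random(q_true.shape)
corrupted = uncertainty > np.quantile(uncertainty, 0.8)
q_noisy = q_true + 5.0 * corrupted * rng.standard_normal(q_true.shape)
q_fixed, erased = reconstruct_uncertain_entries(q_noisy, uncertainty)
print(np.abs(q_noisy - q_true)[erased].mean(),
      np.abs(q_fixed - q_true)[erased].mean())
```

In the toy data the uncertainty scores are constructed to coincide with the corrupted entries. In practice the quality of the reconstruction depends on how well the uncertainty estimate identifies unreliable values and on how close the underlying matrix is to low rank.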

Taxonomy

Topics: Reinforcement Learning in Robotics · Neural dynamics and brain function · Neural Networks and Reservoir Computing