Uncertainty-aware Low-Rank Q-Matrix Estimation for Deep Reinforcement Learning
Tong Sang, Hongyao Tang, Jianye Hao, Yan Zheng, Zhaopeng Meng

TL;DR
This paper introduces UA-LQE, a novel method that leverages low-rank matrix reconstruction and uncertainty quantification to improve value estimation in deep reinforcement learning, especially in continuous control tasks.
Contribution
The paper reveals the low-rank structure of Q-matrices during learning and proposes a new uncertainty-aware low-rank estimation framework to enhance value function approximation.
Findings
The rank of the Q-matrix decreases during learning across several popular algorithms and continuous control tasks.
Positive correlation between value matrix rank and estimation uncertainty.
UA-LQE improves learning efficiency in MuJoCo continuous control benchmarks.
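One way to make the "rank of the Q-matrix" finding concrete is to measure an approximate rank from singular values. The sketch below is an illustration only, not the authors' code: the function name `approximate_rank`, the 99% threshold on the singular-value sum, and the synthetic matrices are all assumptions, and the paper may define rank differently.

```python
import numpy as np

def approximate_rank(q_matrix, energy=0.99):
    """Smallest k whose top-k singular values hold `energy` of the total sum.

    The 0.99 threshold is an assumed convention, not taken from the paper.
    """
    s = np.linalg.svd(q_matrix, compute_uv=False)  # singular values, descending
    total = s.sum()
    if total == 0.0:
        return 0
    cumulative = np.cumsum(s) / total
    k = int(np.searchsorted(cumulative, energy)) + 1
    return min(k, len(s))

# Synthetic stand-ins for a sampled Q-matrix (rows: states, columns: actions).
rng = np.random.default_rng(0)
n_states, n_actions, true_rank = 64, 32, 3
low_rank_q = rng.normal(size=(n_states, true_rank)) @ rng.normal(size=(true_rank, n_actions))
noisy_q = rng.normal(size=(n_states, n_actions))

print("structured matrix:", approximate_rank(low_rank_q))
print("unstructured matrix:", approximate_rank(noisy_q))
```

For continuous control, building such a matrix requires sampling a finite set of states and actions and evaluating the critic on each pair; how that sampling is done is left open here.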
Abstract
Value estimation is a key problem in Reinforcement Learning. Although Deep Reinforcement Learning (DRL) has achieved many successes in different fields, the underlying structure and learning dynamics of the value function, especially with complex function approximation, are not fully understood. In this paper, we report that a decreasing rank of the Q-matrix is widely observed during the learning process across a series of continuous control tasks and for different popular algorithms. We hypothesize that this low-rank phenomenon reflects common learning dynamics of the Q-matrix, which moves from a stochastic high-dimensional space to a smooth low-dimensional space. Moreover, we reveal a positive correlation between value matrix rank and value estimation uncertainty. Inspired by the above evidence, we propose a novel Uncertainty-Aware Low-rank Q-matrix Estimation (UA-LQE) algorithm as a general framework to facilitate…
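The abstract is truncated before it describes the algorithm, so the following is only one plausible reading of "uncertainty-aware low-rank estimation", not the authors' method: treat the most uncertain Q-matrix entries as missing and refill them from a low-rank fit to the remaining entries. The function `reconstruct_low_rank`, the iterative truncated-SVD imputation, the `mask_fraction` parameter, and the synthetic uncertainty signal are all assumptions made for illustration.

```python
import numpy as np

def reconstruct_low_rank(q_matrix, uncertainty, rank=3, mask_fraction=0.2, n_iters=100):
    """Refill the most uncertain entries using a rank-`rank` fit to the rest.

    Hypothetical sketch: iterative truncated-SVD imputation is a stand-in
    for whatever matrix reconstruction the paper actually uses.
    """
    # Entries at or above the uncertainty quantile are treated as missing.
    threshold = np.quantile(uncertainty, 1.0 - mask_fraction)
    observed = uncertainty < threshold
    # Start by filling missing entries with the mean of the observed ones.
    estimate = np.where(observed, q_matrix, q_matrix[observed].mean())
    for _ in range(n_iters):
        u, s, vt = np.linalg.svd(estimate, full_matrices=False)
        low_rank = (u[:, :rank] * s[:rank]) @ vt[:rank]
        # Keep trusted entries fixed; update only the masked ones.
        estimate = np.where(observed, q_matrix, low_rank)
    return low_rank, observed

# Synthetic example: a rank-3 "true" Q-matrix with some badly estimated entries.
rng = np.random.default_rng(0)
shape = (64, 32)
true_q = rng.normal(size=(shape[0], 3)) @ rng.normal(size=(3, shape[1]))
corrupted = rng.random(shape) < 0.15
noisy_q = true_q + 5.0 * rng.normal(size=shape) * corrupted
# Assumed uncertainty signal: high exactly where the estimate is poor.
uncertainty = corrupted.astype(float) + 0.01 * rng.random(shape)

reconstructed, observed = reconstruct_low_rank(noisy_q, uncertainty)
err_corrupted = np.abs(noisy_q - true_q)[corrupted].mean()
err_reconstructed = np.abs(reconstructed - true_q)[corrupted].mean()
print("mean error before:", err_corrupted, "after:", err_reconstructed)
```

In a real agent the uncertainty would have to come from an estimator such as an ensemble of critics or visitation counts, and it would rarely line up with the true errors as cleanly as it does in this synthetic setup.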
Taxonomy
Topics: Reinforcement Learning in Robotics · Neural dynamics and brain function · Neural Networks and Reservoir Computing
