Uncertainty-Aware Rank-One MIMO Q Network Framework for Accelerated Offline Reinforcement Learning
Thanh Nguyen, Tung Luu, Tri Ton, Sungwoong Kim, Chang D. Yoo

TL;DR
This paper introduces an uncertainty-aware rank-one MIMO Q network framework that improves offline reinforcement learning by effectively leveraging out-of-distribution data while maintaining computational efficiency.
Contribution
It proposes a novel uncertainty quantification method using a rank-one MIMO architecture to enhance offline RL performance at lower computational cost than existing approaches.
Findings
Achieves state-of-the-art results on D4RL benchmarks.
Balances accuracy, speed, and memory efficiency.
Effectively mitigates extrapolation errors in offline RL.
Abstract
Offline reinforcement learning (RL) has garnered significant interest due to its safe and easily scalable paradigm. However, training under this paradigm presents its own challenge: the extrapolation error stemming from out-of-distribution (OOD) data. Existing methodologies have endeavored to address this issue through means like penalizing OOD Q-values or imposing similarity constraints on the learned policy and the behavior policy. Nonetheless, these approaches are often beset by limitations such as being overly conservative in utilizing OOD data, imprecise OOD data characterization, and significant computational overhead. To address these challenges, this paper introduces an Uncertainty-Aware Rank-One Multi-Input Multi-Output (MIMO) Q Network framework. The framework aims to enhance Offline Reinforcement Learning by fully leveraging the potential of OOD data while still ensuring…
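The abstract does not spell out the architecture, so the following is only a minimal NumPy sketch of the general idea behind a rank-one ensemble Q-function, not the authors' implementation. It assumes a BatchEnsemble-style parameterization, where every ensemble member shares one weight matrix `W` and owns only two small vectors `r` and `s`, and it assumes a lower-confidence-bound penalty (mean minus `beta` times the standard deviation across members) as one common way to turn ensemble disagreement into a pessimistic Q-value. All names (`RankOneEnsembleLinear`, `q_ensemble`, `pessimistic_q`, `beta`) are hypothetical.

```python
import numpy as np


class RankOneEnsembleLinear:
    """Linear layer with one shared weight matrix and per-member rank-one factors.

    Member i effectively uses the weight W * outer(r[i], s[i]), but only the
    vectors r[i] and s[i] are stored per member (assumed parameterization).
    """

    def __init__(self, in_dim, out_dim, n_members, rng):
        self.W = rng.normal(0.0, 1.0 / np.sqrt(in_dim), size=(in_dim, out_dim))
        self.r = 1.0 + 0.5 * rng.normal(size=(n_members, in_dim))
        self.s = 1.0 + 0.5 * rng.normal(size=(n_members, out_dim))
        self.b = np.zeros((n_members, out_dim))

    def __call__(self, x):
        # x: (n_members, batch, in_dim) -> (n_members, batch, out_dim)
        h = (x * self.r[:, None, :]) @ self.W
        return h * self.s[:, None, :] + self.b[:, None, :]


def q_ensemble(layers, state, action):
    """Return Q-values of shape (n_members, batch) for a batch of (state, action)."""
    x = np.concatenate([state, action], axis=-1)
    n_members = layers[0].r.shape[0]
    h = np.broadcast_to(x, (n_members,) + x.shape)  # same input for every member
    for layer in layers[:-1]:
        h = np.maximum(layer(h), 0.0)  # ReLU
    return layers[-1](h)[..., 0]


def pessimistic_q(q_values, beta=1.0):
    """Lower confidence bound: member mean minus beta times member std (assumed penalty)."""
    return q_values.mean(axis=0) - beta * q_values.std(axis=0)


rng = np.random.default_rng(0)
layers = [
    RankOneEnsembleLinear(5, 16, n_members=4, rng=rng),
    RankOneEnsembleLinear(16, 1, n_members=4, rng=rng),
]
state = rng.normal(size=(8, 3))   # toy batch of 8 states
action = rng.normal(size=(8, 2))  # toy batch of 8 actions
q = q_ensemble(layers, state, action)  # shape (4, 8)
q_lcb = pessimistic_q(q, beta=1.0)     # shape (8,)
```

In this sketch the per-member cost is two vectors per layer rather than a full weight matrix, which is where the memory saving of a rank-one ensemble comes from. State-action pairs on which the members disagree receive a larger penalty, so they are down-weighted rather than ruled out entirely.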
Taxonomy
Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Advanced Multi-Objective Optimization Algorithms
