Uncertainty-Aware Rank-One MIMO Q Network Framework for Accelerated Offline Reinforcement Learning

Thanh Nguyen; Tung Luu; Tri Ton; Sungwoong Kim; Chang D. Yoo

arXiv:2602.19917 · cs.LG · February 24, 2026

Open Access

TL;DR

This paper introduces an uncertainty-aware rank-one MIMO Q network framework that improves offline reinforcement learning by effectively leveraging out-of-distribution data while maintaining computational efficiency.

Contribution

The paper proposes a novel uncertainty quantification method built on a rank-one MIMO architecture, aiming to improve offline RL performance at lower computational cost.

Findings

01 Achieves state-of-the-art results on D4RL benchmarks.

02 Balances accuracy, speed, and memory efficiency.

03 Effectively mitigates extrapolation errors in offline RL.

Abstract

Offline reinforcement learning (RL) has garnered significant interest due to its safe and easily scalable paradigm. However, training under this paradigm presents its own challenge: the extrapolation error stemming from out-of-distribution (OOD) data. Existing methodologies have endeavored to address this issue through means like penalizing OOD Q-values or imposing similarity constraints on the learned policy and the behavior policy. Nonetheless, these approaches are often beset by limitations such as being overly conservative in utilizing OOD data, imprecise OOD data characterization, and significant computational overhead. To address these challenges, this paper introduces an Uncertainty-Aware Rank-One Multi-Input Multi-Output (MIMO) Q Network framework. The framework aims to enhance Offline Reinforcement Learning by fully leveraging the potential of OOD data while still ensuring…
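
The rank-one ensemble idea named in the abstract can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the abstract is truncated here and gives no architectural details, so the layer below follows the general rank-one (BatchEnsemble-style) parameterization, in which member i reuses a shared weight W scaled elementwise by outer(r_i, s_i), and the uncertainty penalty is a generic "mean minus beta times standard deviation" lower bound. All names (RankOneEnsembleLinear, pessimistic_q, beta) are hypothetical, and the multi-input pathway of a MIMO network is simplified by feeding every member the same state-action batch.

```python
import numpy as np


class RankOneEnsembleLinear:
    """Linear layer with one shared weight and per-member rank-one factors.

    Hypothetical illustration: member i effectively uses W * outer(r_i, s_i),
    so the ensemble costs one full matrix plus two vectors per member.
    """

    def __init__(self, in_dim, out_dim, n_members, rng):
        self.W = rng.normal(0.0, 1.0 / np.sqrt(in_dim), size=(in_dim, out_dim))
        self.r = 1.0 + 0.1 * rng.normal(size=(n_members, in_dim))
        self.s = 1.0 + 0.1 * rng.normal(size=(n_members, out_dim))
        self.b = np.zeros((n_members, out_dim))

    def __call__(self, x):
        # x: (n_members, batch, in_dim) -> (n_members, batch, out_dim)
        h = (x * self.r[:, None, :]) @ self.W
        return h * self.s[:, None, :] + self.b[:, None, :]


def pessimistic_q(state_action, layers, beta=1.0):
    """Return (mean - beta * std, std) of Q over ensemble members.

    Assumed penalty form; the source does not state the exact rule.
    """
    n_members = layers[0].r.shape[0]
    h = np.broadcast_to(state_action, (n_members,) + state_action.shape)
    for layer in layers[:-1]:
        h = np.maximum(layer(h), 0.0)  # ReLU between hidden layers
    q = layers[-1](h)[..., 0]          # (n_members, batch)
    mean, std = q.mean(axis=0), q.std(axis=0)
    return mean - beta * std, std


rng = np.random.default_rng(0)
layers = [
    RankOneEnsembleLinear(6, 32, 4, rng),  # 6 = state dim + action dim (assumed)
    RankOneEnsembleLinear(32, 1, 4, rng),
]
batch = rng.normal(size=(5, 6))
q_lcb, q_std = pessimistic_q(batch, layers, beta=1.0)
```

In this sketch, a larger disagreement among members (q_std) lowers the value estimate for that state-action pair, which is one common way to discount out-of-distribution actions without excluding them outright.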

Taxonomy

Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Advanced Multi-Objective Optimization Algorithms