SUMO: Search-Based Uncertainty Estimation for Model-Based Offline Reinforcement Learning

Zhongjian Qiao; Jiafei Lyu; Kechen Jiao; Qi Liu; Xiu Li

arXiv:2408.12970 · cs.LG · November 13, 2024

TL;DR

SUMO introduces a search-based uncertainty estimation method for model-based offline reinforcement learning, improving the reliability of synthetic samples and enhancing overall algorithm performance.

Contribution

The paper proposes SUMO, a novel search-based uncertainty estimation technique that outperforms ensemble methods in model-based offline RL.

Findings

01 SUMO provides more accurate uncertainty estimates than ensemble methods.

02 Integrating SUMO boosts the performance of algorithms like MOPO and AMOReL.

03 Experimental results on D4RL datasets validate SUMO's effectiveness.

Abstract

The performance of offline reinforcement learning (RL) suffers from the limited size and quality of static datasets. Model-based offline RL addresses this issue by generating synthetic samples through a dynamics model to enhance overall performance. To evaluate the reliability of the generated samples, uncertainty estimation methods are often employed. However, model ensemble, the most commonly used uncertainty estimation method, is not always the best choice. In this paper, we propose a Search-based Uncertainty estimation method for Model-based Offline RL (SUMO) as an alternative. SUMO characterizes the uncertainty of synthetic samples by measuring their cross entropy against the in-distribution dataset samples, and uses an efficient search-based method for implementation. In this way, SUMO can achieve trustworthy uncertainty estimation. We integrate…
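
The idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes that "search-based" means a nearest-neighbor search over the offline dataset, that each sample is a flattened vector (for example a concatenated state, action, and next state), and that the log of one plus the distance to the k-th nearest dataset sample serves as the uncertainty score. The function names knn_uncertainty and penalize_rewards, the log(1 + d) form, and the coefficient lam are all hypothetical choices made for this example. The second function shows the MOPO-style use of such a score, where the reward of a synthetic sample is reduced in proportion to its uncertainty.

```python
import numpy as np


def knn_uncertainty(synthetic, dataset, k=1):
    """Score each synthetic sample by its distance to the offline dataset.

    synthetic: array of shape (n_synthetic, dim)
    dataset:   array of shape (n_dataset, dim)
    k:         which nearest neighbor to use (1-based)

    Returns log(1 + d_k), where d_k is the Euclidean distance to the
    k-th nearest dataset sample. This form is an assumption made here
    so that the score is zero for an exact match and grows with distance.
    """
    synthetic = np.asarray(synthetic, dtype=float)
    dataset = np.asarray(dataset, dtype=float)
    if not 1 <= k <= dataset.shape[0]:
        raise ValueError("k must be between 1 and the number of dataset samples")

    # Brute-force pairwise distances, shape (n_synthetic, n_dataset).
    # A real system would likely use an approximate nearest-neighbor index.
    diff = synthetic[:, None, :] - dataset[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # k-th smallest distance in each row.
    kth = np.partition(dist, k - 1, axis=1)[:, k - 1]
    return np.log1p(kth)


def penalize_rewards(rewards, uncertainty, lam=1.0):
    """MOPO-style penalty: subtract lam times the uncertainty from the reward."""
    return np.asarray(rewards, dtype=float) - lam * np.asarray(uncertainty, dtype=float)


# Toy usage: the first synthetic sample coincides with a dataset point,
# the second lies far from every dataset point.
dataset = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
synthetic = np.array([[0.0, 0.0], [10.0, 10.0]])

u = knn_uncertainty(synthetic, dataset, k=1)
penalized = penalize_rewards(np.array([1.0, 1.0]), u, lam=0.5)
print(u, penalized)
```

In this toy run the sample that matches a dataset point keeps its reward, while the distant sample is penalized. The brute-force distance matrix is only practical for small arrays; it stands in for whatever search structure a full implementation would use.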

Videos

SUMO: Search-Based Uncertainty Estimation for Model-Based Offline Reinforcement Learning · Underline

Taxonomy

Topics: Software Reliability and Analysis Research · Advanced Multi-Objective Optimization Algorithms

Methods: Balanced Selection