Are Expressive Models Truly Necessary for Offline RL?

Guan Wang, Haoyi Niu, Jianxiong Li, Li Jiang, Jianming Hu, Xianyuan Zhan

arXiv:2412.11253 · cs.LG · December 17, 2024

TL;DR

This paper demonstrates that lightweight, simple models combined with recursive planning can outperform complex models in offline RL, achieving state-of-the-art results efficiently on long-horizon tasks.

Contribution

The authors introduce Recursive Skip-Step Planning (RSP), a method that uses simple models with recursive planning to match or surpass the performance of larger models in offline RL.
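The coarse-to-fine idea behind recursive planning with small models can be illustrated with a toy sketch. This is not the authors' implementation: the number of planning levels, the input layout (state concatenated with the previous target), the dimensions, and every function name below are assumptions made for illustration, and the weights are random rather than trained. Each planner is a 2-layer MLP that maps the current state and a farther target to a nearer subgoal; a final goal-conditioned policy maps the state and the nearest subgoal to an action.

```python
import math
import random


def init_mlp(rng, in_dim, hidden_dim, out_dim):
    """Randomly initialise a 2-layer MLP (untrained placeholder weights)."""
    def layer(n_in, n_out):
        scale = 1.0 / math.sqrt(n_in)
        weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [0.0] * n_out
        return weights, biases
    return [layer(in_dim, hidden_dim), layer(hidden_dim, out_dim)]


def mlp_forward(params, x):
    """Forward pass: linear -> ReLU -> linear."""
    (w1, b1), (w2, b2) = params
    hidden = [max(0.0, sum(w * v for w, v in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(w2, b2)]


def recursive_plan(planners, state, goal):
    """Coarse-to-fine planning: each planner refines the previous target
    into a nearer subgoal, conditioned on the current state."""
    targets = [goal]
    for params in planners:
        targets.append(mlp_forward(params, state + targets[-1]))
    return targets


def act(policy, state, subgoal):
    """Goal-conditioned policy: action from the state and nearest subgoal."""
    return mlp_forward(policy, state + subgoal)


# Hypothetical sizes chosen only for this example.
STATE_DIM, ACTION_DIM, HIDDEN_DIM, NUM_LEVELS = 4, 2, 16, 3

rng = random.Random(0)
planners = [init_mlp(rng, 2 * STATE_DIM, HIDDEN_DIM, STATE_DIM)
            for _ in range(NUM_LEVELS)]
policy = init_mlp(rng, 2 * STATE_DIM, HIDDEN_DIM, ACTION_DIM)

state = [0.0] * STATE_DIM
goal = [1.0] * STATE_DIM
targets = recursive_plan(planners, state, goal)
action = act(policy, state, targets[-1])
```

In a trained system, each planner would be fit by supervised learning on subgoals relabeled from offline trajectories; that training step is omitted here.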

Findings

1. Lightweight models can achieve accurate dynamics with recursive planning.
2. RSP outperforms existing methods on D4RL benchmarks.
3. Significant efficiency gains with minimal model complexity.

Abstract

Among the various branches of offline reinforcement learning (RL) methods, goal-conditioned supervised learning (GCSL) has gained increasing popularity because it formulates the offline RL problem as a sequential modeling task, thereby bypassing the notoriously difficult credit assignment challenge of value learning in the conventional RL paradigm. Sequential modeling, however, requires capturing accurate dynamics across long horizons in trajectory data to ensure reasonable policy performance. To meet this requirement, leveraging large, expressive models has become a popular choice in recent literature, but this comes at the cost of significantly increased computation and inference latency. Contrary to this trend, we reveal that lightweight models as simple as shallow 2-layer MLPs can also enjoy accurate dynamics consistency and significantly reduced sequential modeling errors against…

Code & Models

Repositories

imoneoi/RSP_JAX (JAX, official)

Videos

Are Expressive Models Truly Necessary for Offline RL? · underline

Taxonomy

Topics: Multi-Agent Systems and Negotiation