Decision Mamba: A Multi-Grained State Space Model with Self-Evolution Regularization for Offline RL

Qi Lv; Xiang Deng; Gongwei Chen; Michael Yu Wang; Liqiang Nie

arXiv:2406.05427 · cs.LG · January 23, 2025 · 1 cites

TL;DR

Decision Mamba introduces a multi-grained state space model with self-evolutionary regularization, effectively addressing out-of-distribution issues and overfitting in offline RL by leveraging historical information and local relationships.

Contribution

It proposes a novel multi-grained state space model with a self-evolving policy, explicitly modeling temporal and local relationships to improve offline RL performance.

Findings

01. Outperforms baseline methods on various tasks

02. Effectively handles noisy trajectories and overfitting

03. Enhances robustness with self-evolving policy

Abstract

While conditional sequence modeling with the transformer architecture has demonstrated its effectiveness in offline reinforcement learning (RL) tasks, it struggles to handle out-of-distribution states and actions. Existing work attempts to address this issue through data augmentation with the learned policy or by adding extra constraints from value-based RL algorithms. However, these studies still fail to overcome the following challenges: (1) insufficient use of the historical temporal information across steps, (2) overlooking the local intra-step relationships among returns-to-go (RTGs), states, and actions, and (3) overfitting to suboptimal trajectories with noisy labels. To address these challenges, we propose Decision Mamba (DM), a novel multi-grained state space model (SSM) with a self-evolving policy learning strategy. DM explicitly models the historical hidden…
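
For readers unfamiliar with the (RTG, state, action) triples the abstract refers to, the following is a minimal sketch of how conditional sequence models for offline RL commonly prepare their input. It is not taken from the authors' repository: the function names `returns_to_go` and `interleave`, the `gamma` parameter, and the tuple-based token format are all assumptions made for illustration.

```python
from typing import Any, List, Sequence, Tuple


def returns_to_go(rewards: Sequence[float], gamma: float = 1.0) -> List[float]:
    """Return-to-go at step t: (optionally discounted) sum of rewards from t onward.

    gamma=1.0 gives the undiscounted sum; the parameter is an illustrative assumption.
    """
    rtgs = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step reuses the total accumulated after it.
    for t in range(len(rewards) - 1, -1, -1):
        running = rewards[t] + gamma * running
        rtgs[t] = running
    return rtgs


def interleave(
    rtgs: Sequence[float], states: Sequence[Any], actions: Sequence[Any]
) -> List[Tuple[str, Any]]:
    """Flatten a trajectory into tokens ordered RTG, state, action within each step.

    The (kind, value) tuple format is hypothetical and used only for readability.
    """
    if not (len(rtgs) == len(states) == len(actions)):
        raise ValueError("rtgs, states, and actions must have the same length")
    tokens: List[Tuple[str, Any]] = []
    for g, s, a in zip(rtgs, states, actions):
        tokens.extend([("rtg", g), ("state", s), ("action", a)])
    return tokens


if __name__ == "__main__":
    rewards = [1.0, 0.0, 2.0]
    rtgs = returns_to_go(rewards)
    sequence = interleave(rtgs, ["s0", "s1", "s2"], ["a0", "a1", "a2"])
    print(rtgs)
    print(sequence[:3])
```

In this layout, the three tokens within one step are what the abstract calls intra-step relationships, while dependencies between successive triples are the inter-step temporal information.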

Code & Models

Repositories

aopolin-lv/DecisionMamba · jax · Official

Taxonomy

Topics: Neural Networks and Applications · Simulation Techniques and Applications · Reinforcement Learning in Robotics