Decision Mamba: Reinforcement Learning via Sequence Modeling with Selective State Spaces

Toshihiro Ota

arXiv:2403.19925 · cs.LG · April 1, 2024 · 2 citations

TL;DR

This paper integrates the Mamba sequence modeling framework into the Decision Transformer architecture and evaluates whether the change improves reinforcement learning performance across several decision-making environments.

Contribution

It introduces Decision Mamba, a novel model combining Mamba with Decision Transformer, demonstrating potential performance improvements in sequential decision-making tasks.
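To make "combining Mamba with Decision Transformer" more concrete, the following is a minimal sketch of the kind of selective state-space recurrence that Mamba uses as a token mixer in place of causal self-attention. It is not the author's implementation. It assumes a single scalar input channel, a diagonal state matrix, and the simplified discretization (A_bar = exp(delta * A), B_bar = delta * B); the names `selective_scan`, `w_delta`, `w_b`, and `w_c` are hypothetical stand-ins for learned projections.

```python
import math
from typing import List, Sequence


def softplus(x: float) -> float:
    """Numerically stable softplus, used to keep the step size positive."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def selective_scan(
    xs: Sequence[float],
    a: Sequence[float],
    w_delta: float,
    w_b: Sequence[float],
    w_c: Sequence[float],
) -> List[float]:
    """Toy selective SSM over a scalar sequence (illustrative assumption only).

    xs      : inputs x_1..x_T
    a       : diagonal entries of the state matrix A (negative for stability)
    w_delta : hypothetical weight; step size delta_t = softplus(w_delta * x_t)
    w_b,w_c : hypothetical weights; B_t = w_b * x_t and C_t = w_c * x_t
    """
    n = len(a)
    h = [0.0] * n  # hidden state, carried left to right
    ys: List[float] = []
    for x in xs:
        # "Selective": delta, B and C all depend on the current input.
        delta = softplus(w_delta * x)
        for i in range(n):
            a_bar = math.exp(delta * a[i])   # discretized state transition
            b_bar = delta * (w_b[i] * x)     # simplified discretized input map
            h[i] = a_bar * h[i] + b_bar * x
        ys.append(sum(w_c[i] * x * h[i] for i in range(n)))
    return ys


# Output at step t depends only on inputs up to t, so the mixer is causal.
params = dict(a=[-1.0, -0.5], w_delta=0.3, w_b=[0.7, -0.2], w_c=[0.5, 1.1])
ys_a = selective_scan([1.0, 2.0, 3.0], **params)
ys_b = selective_scan([1.0, 2.0, -5.0], **params)
```

Because the state is updated strictly left to right, changing the last input leaves the earlier outputs untouched, which is the same causal property that masked self-attention provides in Decision Transformer. Production Mamba layers add multi-channel projections, a short convolution, gating, and a hardware-aware parallel scan, none of which are shown here.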

Findings

01. Decision Mamba outperforms traditional Decision Transformer in experiments.

02. Mamba integration improves sequence modeling efficiency.

03. Neural network architecture impacts reinforcement learning effectiveness.

Abstract

Decision Transformer, a promising approach that applies Transformer architectures to reinforcement learning, relies on causal self-attention to model sequences of states, actions, and rewards. While this method has shown competitive results, this paper investigates the integration of the Mamba framework, known for its advanced capabilities in efficient and effective sequence modeling, into the Decision Transformer architecture, focusing on the potential performance enhancements in sequential decision-making tasks. Our study systematically evaluates this integration by conducting a series of experiments across various decision-making environments, comparing the modified Decision Transformer, Decision Mamba, with its traditional counterpart. This work contributes to the advancement of sequential decision-making models, suggesting that the architecture and training methodology of neural…
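As a rough illustration of the sequence that such a model consumes, the sketch below builds a Decision Transformer style trajectory. In the original Decision Transformer formulation the reward signal enters as returns-to-go (the sum of future rewards) and each timestep contributes three tokens in the order return-to-go, state, action. This page does not spell out that layout, so treat it as an assumption; the helper names `returns_to_go` and `interleave` are hypothetical.

```python
from typing import List, Sequence, Tuple


def returns_to_go(rewards: Sequence[float]) -> List[float]:
    """Suffix sums: rtg[t] = rewards[t] + rewards[t+1] + ... + rewards[-1]."""
    rtg = [0.0] * len(rewards)
    running = 0.0
    for t in range(len(rewards) - 1, -1, -1):
        running += rewards[t]
        rtg[t] = running
    return rtg


def interleave(
    rtg: Sequence[float],
    states: Sequence[object],
    actions: Sequence[object],
) -> List[Tuple[str, object]]:
    """Flatten a trajectory into (kind, value) tokens: rtg, state, action, ..."""
    if not (len(rtg) == len(states) == len(actions)):
        raise ValueError("rtg, states and actions must have the same length")
    tokens: List[Tuple[str, object]] = []
    for r, s, a in zip(rtg, states, actions):
        tokens.append(("rtg", r))
        tokens.append(("state", s))
        tokens.append(("action", a))
    return tokens


# Toy trajectory with placeholder state and action labels.
rewards = [1.0, 0.0, 2.0]
states = ["s0", "s1", "s2"]
actions = ["a0", "a1", "a2"]
tokens = interleave(returns_to_go(rewards), states, actions)
```

A causal sequence model, whether attention-based or state-space-based, is then trained to predict each action token from the tokens that precede it.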

Code & Models

Repositories

toshihiro-ota/decision-mamba (PyTorch, official)

Taxonomy

Topics: Reinforcement Learning in Robotics

Methods: Attention Is All You Need · Linear Layer · Layer Normalization · Byte Pair Encoding · Multi-Head Attention · Softmax · Dense Connections · Label Smoothing · Adam · Absolute Position Encodings