Decision Mamba: Reinforcement Learning via Sequence Modeling with Selective State Spaces
Toshihiro Ota

TL;DR
This paper explores integrating the Mamba sequence modeling framework into the Decision Transformer architecture to enhance reinforcement learning performance across various environments.
Contribution
It introduces Decision Mamba, a novel model combining Mamba with Decision Transformer, demonstrating potential performance improvements in sequential decision-making tasks.
Findings
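Decision Mamba is reported to outperform the traditional Decision Transformer in the paper's experiments.
Integrating Mamba into the Decision Transformer is presented as improving sequence modeling efficiency.
The results suggest that neural network architecture affects how well sequence-modeling approaches to reinforcement learning perform.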
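The efficiency point in the findings comes from Mamba mixing tokens with a recurrent state update, whose cost grows linearly with sequence length, rather than with pairwise attention. The following is a rough sketch only: a scalar, single-channel toy written for illustration, not the paper's implementation and not the API of any Mamba library. The names `selective_scan`, `a`, `w_delta`, `w_b`, and `w_c` are hypothetical, and the discretization is simplified.

```python
import math


def softplus(z):
    # Numerically stable log(1 + exp(z)); keeps the step size positive.
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def selective_scan(xs, a=-1.0, w_delta=1.0, w_b=1.0, w_c=1.0):
    """Toy scalar selective state space recurrence (illustrative assumption).

    h_t = exp(delta_t * a) * h_{t-1} + delta_t * b_t * x_t
    y_t = c_t * h_t

    delta_t, b_t and c_t all depend on the current input x_t, which is
    what makes the recurrence "selective".
    """
    h = 0.0
    ys = []
    for x in xs:
        delta = softplus(w_delta * x)  # input-dependent step size
        b = w_b * x                    # input-dependent input projection
        c = w_c * x                    # input-dependent readout
        h = math.exp(delta * a) * h + delta * b * x
        ys.append(c * h)
    return ys
```

Each output depends only on the current and earlier inputs, so the recurrence is causal in the same sense as the causal self-attention it stands in for, while doing one constant-cost update per token.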
Abstract
Decision Transformer, a promising approach that applies Transformer architectures to reinforcement learning, relies on causal self-attention to model sequences of states, actions, and rewards. While this method has shown competitive results, this paper investigates the integration of the Mamba framework, known for its advanced capabilities in efficient and effective sequence modeling, into the Decision Transformer architecture, focusing on the potential performance enhancements in sequential decision-making tasks. Our study systematically evaluates this integration by conducting a series of experiments across various decision-making environments, comparing the modified Decision Transformer, Decision Mamba, with its traditional counterpart. This work contributes to the advancement of sequential decision-making models, suggesting that the architecture and training methodology of neural…
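To make the sequence modeling view in the abstract concrete, here is a minimal sketch of how a trajectory can be turned into one token sequence. It assumes the return-to-go formulation used by the original Decision Transformer, in which each timestep contributes a (return-to-go, state, action) triple; this page does not spell that out, so treat it as an assumption. The function names `returns_to_go` and `interleave` and the tuple-based token format are hypothetical.

```python
def returns_to_go(rewards):
    """Sum of rewards from each timestep to the end of the episode."""
    rtg = []
    running = 0.0
    for r in reversed(rewards):
        running += r
        rtg.append(running)
    rtg.reverse()
    return rtg


def interleave(rtg, states, actions):
    """Flatten per-timestep triples into one causal token sequence."""
    tokens = []
    for g, s, a in zip(rtg, states, actions):
        tokens.extend([("rtg", g), ("state", s), ("action", a)])
    return tokens


# Usage: a three-step episode with placeholder state and action labels.
rewards = [1.0, 0.0, 2.0]
tokens = interleave(returns_to_go(rewards), ["s0", "s1", "s2"], ["a0", "a1", "a2"])
```

A sequence model, whether attention-based or Mamba-based, then reads this flattened sequence left to right and is trained to predict each action token from the tokens before it.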
Taxonomy
Topics: Reinforcement Learning in Robotics
Methods: Attention Is All You Need · Linear Layer · Layer Normalization · Byte Pair Encoding · Multi-Head Attention · Softmax · Dense Connections · Label Smoothing · Adam · Absolute Position Encodings
