MATE: Solving Contextual Markov Decision Processes with Memory of Accumulated Transition Embeddings
Himchan Hwang, Hyeokju Jeong, Gene Chung, Seungyeon Kim, Sangwoong Yoon, Frank Chongwoo Park

TL;DR
MATE is a memory architecture for CMDPs that replaces the intractable posterior over contexts with a sum-aggregated memory, offering computational advantages over Transformers and RNNs while achieving comparable performance.
Contribution
The paper presents MATE, a memory architecture for solving CMDPs that replaces the intractable posterior over contexts with a sum-aggregated memory, avoiding the growing per-step rollout cost of Transformers and the gradient issues commonly associated with RNNs.
Findings
MATE achieves comparable performance to sequence-model baselines.
MATE offers computational advantages over Transformers and RNNs.
Extensive benchmarks validate MATE's effectiveness across diverse tasks.
Abstract
We propose MATE, a simple yet effective memory architecture for solving Contextual Markov Decision Processes (CMDPs), a family of MDPs parameterized by an unobserved context. In CMDPs, an optimal agent can adapt online by maintaining the posterior belief over contexts. MATE replaces this intractable posterior with a sum-aggregated memory, leveraging the posterior's permutation invariance to retain provably sufficient expressiveness. Compared to prior memory architectures, MATE avoids the growing per-step rollout cost of Transformers and the gradient issues commonly associated with Recurrent Neural Networks (RNNs). Extensive evaluations across diverse benchmarks demonstrate that MATE provides clear computational advantages while achieving performance comparable to standard sequence-model baselines.
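The idea of a sum-aggregated memory can be illustrated with a minimal sketch. This is not the authors' implementation: the single-layer tanh embedder, the flattened (s, a, r, s') transition layout, the dimensions, and the names make_embedder, update_memory, and aggregate are all assumptions made for illustration. The sketch shows only two properties the abstract relies on: each step costs one embedding plus one addition regardless of history length, and the resulting memory does not depend on the order of the transitions.

```python
import math
import random


def make_embedder(in_dim, mem_dim, seed=0):
    """Return a hypothetical transition embedder: one random linear layer + tanh."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(in_dim)
    weights = [[rng.gauss(0.0, scale) for _ in range(in_dim)]
               for _ in range(mem_dim)]

    def embed(transition):
        # transition is a flat list of floats, assumed here to be (s, a, r, s').
        return [math.tanh(sum(w * x for w, x in zip(row, transition)))
                for row in weights]

    return embed


def update_memory(memory, embedding):
    """One online step: add the new embedding to the running sum.

    The cost is O(mem_dim) and does not grow with the number of past steps.
    """
    return [m + e for m, e in zip(memory, embedding)]


def aggregate(transitions, embed, mem_dim):
    """Accumulate the embeddings of all transitions, starting from zeros."""
    memory = [0.0] * mem_dim
    for transition in transitions:
        memory = update_memory(memory, embed(transition))
    return memory


# Toy data: 20 transitions, each flattened to 6 numbers
# (assumed layout: 2-d state, 1-d action, scalar reward, 2-d next state).
IN_DIM, MEM_DIM = 6, 4
data_rng = random.Random(1)
transitions = [[data_rng.uniform(-1.0, 1.0) for _ in range(IN_DIM)]
               for _ in range(20)]

embed = make_embedder(IN_DIM, MEM_DIM)
memory_in_order = aggregate(transitions, embed, MEM_DIM)

# Shuffle the transitions: the sum is unchanged up to floating-point rounding.
shuffled = transitions[:]
random.Random(2).shuffle(shuffled)
memory_shuffled = aggregate(shuffled, embed, MEM_DIM)

order_invariant = all(
    math.isclose(a, b, abs_tol=1e-9)
    for a, b in zip(memory_in_order, memory_shuffled)
)
print("order invariant:", order_invariant)
```

In a full agent, the policy would presumably be conditioned on the current state together with this memory, and the embedder would be trained rather than randomly initialized; those parts are outside the scope of this sketch.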
