Beyond Sliding Windows: Learning to Manage Memory in Non-Markovian Environments
Geraud Nangue Tasse, Matthew Riemer, Benjamin Rosman, Tim Klinger

TL;DR
This paper introduces an adaptive memory management meta-algorithm for sequence models in non-Markovian environments, reducing computational and memory costs while maintaining performance.
Contribution
The paper proposes Adaptive Stacking, a novel meta-algorithm that adaptively manages a memory stack, with theoretical guarantees and practical efficiency improvements.
Findings
Adaptive Stacking reduces memory and computation in non-Markovian tasks.
The method maintains performance by removing non-predictive memories.
Experiments validate efficiency gains across different model architectures.
Abstract
Recent success in developing increasingly general purpose agents based on sequence models has led to increased focus on the problem of deploying computationally limited agents within the vastly more complex real-world. A key challenge experienced in these more realistic domains is highly non-Markovian dependencies with respect to the agent's observations, which are less common in small controlled domains. The predominant approach for dealing with this in the literature is to stack together a window of the most recent observations (Frame Stacking), but this window size must grow with the degree of non-Markovian dependencies, which results in prohibitive computational and memory requirements for both action inference and learning. In this paper, we are motivated by the insight that in many environments that are highly non-Markovian with respect to time, the environment only causally…
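To make the contrast in the abstract concrete, here is a minimal sketch, not the authors' implementation. `FrameStack` follows the sliding-window description above. `SelectiveStack`, its `evict_index` parameter, and the hand-written eviction rule in `run_demo` are hypothetical names and choices introduced only for illustration. They assume that adaptive memory management amounts to choosing which stored observation to drop, where the paper would use a learned decision instead of a fixed rule.

```python
from collections import deque


class FrameStack:
    """Sliding window: always keeps the k most recent observations."""

    def __init__(self, k):
        self.frames = deque(maxlen=k)

    def push(self, obs):
        # When the deque is full, the oldest observation is dropped automatically.
        self.frames.append(obs)
        return list(self.frames)


class SelectiveStack:
    """Fixed-size memory where the caller chooses which slot to evict.

    Hypothetical illustration: in a learned setting, evict_index would be
    chosen by the agent rather than passed in by hand.
    """

    def __init__(self, k):
        self.k = k
        self.slots = []

    def push(self, obs, evict_index=0):
        # Only evict once the memory is full; otherwise just append.
        if len(self.slots) == self.k:
            self.slots.pop(evict_index)
        self.slots.append(obs)
        return list(self.slots)


def run_demo():
    # An early "cue" followed by distractor observations.
    observations = ["cue", "n1", "n2", "n3", "n4"]
    window = FrameStack(k=2)
    memory = SelectiveStack(k=2)
    for obs in observations:
        w = window.push(obs)
        # Hand-written rule standing in for a learned policy:
        # always evict slot 1, so slot 0 (the cue) is retained.
        m = memory.push(obs, evict_index=1)
    return w, m
```

With the same memory budget of two slots, the sliding window has lost the early cue by the end of the sequence, while the selective memory still holds it. A frame stack would need a window as long as the full dependency to keep that cue.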
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Reinforcement Learning in Robotics · Generative Adversarial Networks and Image Synthesis
