Learning Memory Mechanisms for Decision Making through Demonstrations
William Yue, Bo Liu, Peter Stone

TL;DR
This paper introduces AttentionTuner, a method that incorporates memory dependency pairs into Transformers to improve decision-making in partially observable environments. It reports significant improvements over standard Transformers on several tasks from Memory Gym and the Long-term Memory Benchmark.
Contribution
The paper proposes memory dependency pairs, which annotate demonstrations with the earlier events an expert recalls when making a decision, and AttentionTuner, which leverages these pairs in Transformers.
Findings
Significant improvements over standard Transformers across several tasks in Memory Gym and the Long-term Memory Benchmark
Effective modeling of memory dependencies in decision processes
Open-source code available for reproducibility
Abstract
In Partially Observable Markov Decision Processes, integrating an agent's history into memory poses a significant challenge for decision-making. Traditional imitation learning, relying on observation-action pairs for expert demonstrations, fails to capture the expert's memory mechanisms used in decision-making. To capture memory processes as demonstrations, we introduce the concept of memory dependency pairs (p, q) indicating that events at time p are recalled for decision-making at time q. We introduce AttentionTuner to leverage memory dependency pairs in Transformers and find significant improvements across several tasks compared to standard Transformers when evaluated on Memory Gym and the Long-term Memory Benchmark. Code is available at https://github.com/WilliamYue37/AttentionTuner.
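To make the idea of a memory dependency pair concrete, the following is a minimal sketch and not the authors' implementation. It assumes a pair is written (p, q), with p the recalled time step and q the decision step, stores the pairs in a binary matrix E with E[q][p] = 1, and scores a causal attention matrix against E with a binary cross-entropy. The function names, the indexing convention, the choice of loss, and the decision to leave unannotated steps unconstrained are all assumptions made for illustration.

```python
import math


def dependency_matrix(pairs, horizon):
    """Build a horizon x horizon 0/1 matrix E with E[q][p] = 1 for each pair (p, q)."""
    E = [[0.0] * horizon for _ in range(horizon)]
    for p, q in pairs:
        if not (0 <= p <= q < horizon):
            raise ValueError(f"invalid pair {(p, q)}: need 0 <= p <= q < {horizon}")
        E[q][p] = 1.0
    return E


def causal_softmax(scores):
    """Row-wise softmax in which step q may only attend to steps p <= q."""
    out = []
    for q, row in enumerate(scores):
        visible = row[: q + 1]
        m = max(visible)  # subtract the max for numerical stability
        exps = [math.exp(s - m) for s in visible]
        z = sum(exps)
        out.append([e / z for e in exps] + [0.0] * (len(row) - q - 1))
    return out


def memory_loss(attn, E, eps=1e-8):
    """Mean binary cross-entropy between attention weights and E.

    Assumption of this sketch: only rows q that carry at least one
    annotated dependency contribute; other steps are left unconstrained.
    """
    total, count = 0.0, 0
    for q, row in enumerate(attn):
        if not any(E[q][: q + 1]):
            continue
        for p in range(q + 1):
            a = min(max(row[p], eps), 1.0 - eps)  # clamp to keep log finite
            total -= E[q][p] * math.log(a) + (1.0 - E[q][p]) * math.log(1.0 - a)
            count += 1
    return total / count if count else 0.0


# Hypothetical 4-step episode: the decision at step 3 recalls the event at step 0.
pairs = [(0, 3)]
E = dependency_matrix(pairs, horizon=4)

# Attention that ignores the annotation versus attention focused on step 0.
uniform = causal_softmax([[0.0] * 4 for _ in range(4)])
focused = causal_softmax([[0.0] * 4 for _ in range(3)] + [[5.0, 0.0, 0.0, 0.0]])

print(memory_loss(focused, E) < memory_loss(uniform, E))
```

In this toy episode, attention that concentrates on the recalled step scores a lower loss than uniform attention. Under the assumptions above, that is the kind of signal an auxiliary loss could add alongside the usual imitation objective.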
Taxonomy
Topics: Artificial Intelligence in Games · Topic Modeling · Metaheuristic Optimization Algorithms Research
