Learning from Supervision with Semantic and Episodic Memory: A Reflective Approach to Agent Adaptation
Jackson Hassell, Dan Zhang, Hannah Kim, Tom Mitchell, Estevam Hruschka

TL;DR
This paper presents a memory-augmented framework for LLMs to learn from labeled data without fine-tuning, using critiques stored in episodic and semantic memory to improve accuracy and efficiency.
Contribution
It introduces a memory-based critique mechanism for LLMs and reports an average accuracy gain of 8.1 percentage points over the zero-shot baseline, along with fewer reasoning tokens at inference time, across diverse tasks.
Findings
The best self-critique strategy improves accuracy by an average of 8.1 percentage points over the zero-shot baseline.
Memory critiques reduce reasoning tokens by an average of 31.95%.
A suggestibility metric explains the variability in how effective memory augmentation is.
Abstract
We investigate how agents built on pretrained large language models (LLMs) can learn target classification functions from labeled examples without parameter updates. While conventional approaches like fine-tuning are often costly, inflexible, and opaque, we propose a memory-augmented framework that leverages LLM-generated critiques grounded in labeled data. Our framework uses episodic memory to store instance-level critiques, which capture specific past experiences, and semantic memory to distill these into reusable, task-level guidance. Across a diverse set of tasks and models, our best-performing self-critique strategy (utilizing both memory types) yields an average improvement of 8.1 percentage points over the zero-shot baseline, and 4.6pp over a RAG-based baseline that relies only on labels. However, improvements vary substantially across models and domains. To explain this variation,…
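The two memory types described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class and method names (`CritiqueMemory`, `observe`, `distill`, `retrieve`, `build_prompt`), the prompt wording, and the word-overlap retrieval are all assumptions made for illustration, and `llm` is a hypothetical text-in/text-out callable standing in for any model client.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Episode:
    """One instance-level record: input, gold label, and a critique."""
    text: str
    label: str
    critique: str


@dataclass
class CritiqueMemory:
    # `llm` is a hypothetical callable (prompt -> completion), not a real API.
    llm: Callable[[str], str]
    episodic: List[Episode] = field(default_factory=list)
    semantic: str = ""

    def observe(self, text: str, label: str) -> None:
        """Predict, then critique the prediction against the gold label."""
        prediction = self.llm(f"Classify: {text}")
        critique = self.llm(
            f"Input: {text}\nPredicted: {prediction}\nGold: {label}\n"
            "Critique the prediction."
        )
        self.episodic.append(Episode(text, label, critique))

    def distill(self) -> None:
        """Compress all stored critiques into task-level guidance."""
        joined = "\n".join(e.critique for e in self.episodic)
        self.semantic = self.llm(
            f"Summarize these critiques into task guidance:\n{joined}"
        )

    def retrieve(self, query: str, k: int = 2) -> List[Episode]:
        """Toy retrieval by word overlap (assumed; embeddings are more typical)."""
        q = set(query.lower().split())
        ranked = sorted(
            self.episodic,
            key=lambda e: len(q & set(e.text.lower().split())),
            reverse=True,
        )
        return ranked[:k]

    def build_prompt(self, query: str, k: int = 2) -> str:
        """Assemble semantic guidance plus retrieved episodes ahead of the query."""
        parts = []
        if self.semantic:
            parts.append(f"Task guidance:\n{self.semantic}")
        for e in self.retrieve(query, k):
            parts.append(
                f"Past example: {e.text}\nLabel: {e.label}\nCritique: {e.critique}"
            )
        parts.append(f"Classify: {query}")
        return "\n\n".join(parts)
```

In this sketch no model weights change: everything learned from the labeled examples lives in the `episodic` list and the `semantic` string, and reaches the model only through the prompt returned by `build_prompt`.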
