Trained Persistent Memory for Frozen Encoder-Decoder LLMs: Six Architectural Methods

Hong Jeong

arXiv:2603.16413·cs.LG·March 18, 2026



TL;DR

This paper demonstrates the feasibility of persistent memory in frozen encoder-decoder LLMs through six architectural methods, enabling conversational learning with minimal resources and no additional training of the core model.

Contribution

It introduces six architectural methods for adding persistent memory to a frozen encoder-decoder LLM. Only small adapters are trained; the backbone stays frozen, and the memory bank keeps accumulating at inference time.

Findings

01. Memory capacity is critical for effective recall.

02. All six methods outperform the baseline at higher capacity.

03. Memory can be scaled independently of the backbone model.

Abstract

Frozen encoder-decoder language models are stateless: the latent representation is discarded after every forward pass, so no information persists across sessions. This paper presents a proof-of-concept pilot study showing that persistent memory in the continuous latent space of a frozen LLM is feasible, even under severe resource constraints (a single frozen Flan-T5-XL backbone, small trainable adapters, a single dataset). We implement six architectural methods spanning three injection points and four write mechanisms; unlike text-level memory systems, every write and read is a differentiable operation on dense vectors. After training only the adapter, the memory bank continues to accumulate at inference time without gradients, enabling conversational learning. Under a forgetting-curve evaluation on LoCoMo at two capacity scales (1× and 10×), the…
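The idea of a memory bank that is written and read as dense vectors can be illustrated with a minimal sketch. This is not one of the paper's six methods: the class name `MemoryBank`, the first-in-first-out write rule, and the scaled dot-product read are all assumptions chosen for brevity, and plain Python lists stand in for tensors. In an autograd framework the read would be a differentiable function of the query and the stored slots; here it only shows the arithmetic.

```python
import math
from collections import deque


class MemoryBank:
    """Hypothetical fixed-capacity bank of dense vectors (illustrative only)."""

    def __init__(self, capacity, dim):
        self.dim = dim
        # A deque with maxlen silently drops the oldest slot when full.
        self.slots = deque(maxlen=capacity)

    def write(self, vector):
        """Assumed FIFO write: append a vector, evicting the oldest if full."""
        if len(vector) != self.dim:
            raise ValueError("vector has wrong dimension")
        self.slots.append(list(vector))

    def read(self, query):
        """Soft-attention read: softmax(q . m_i / sqrt(d)) weighted sum of slots."""
        if not self.slots:
            return [0.0] * self.dim
        scale = math.sqrt(self.dim)
        scores = [sum(q * m for q, m in zip(query, slot)) / scale
                  for slot in self.slots]
        top = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        weights = [w / total for w in weights]
        return [sum(w * slot[j] for w, slot in zip(weights, self.slots))
                for j in range(self.dim)]


# Usage: the bank persists across calls, so later reads see earlier writes.
bank = MemoryBank(capacity=2, dim=2)
bank.write([1.0, 0.0])
bank.write([0.0, 1.0])
recalled = bank.read([10.0, 0.0])  # dominated by the first slot
```

In this sketch the `capacity` argument is the only thing that would change between a smaller and a larger memory; the vectors being stored, and whatever model produces them, are untouched.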


Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Topic Modeling · Natural Language Processing Techniques