Evoking User Memory: Personalizing LLM via Recollection-Familiarity Adaptive Retrieval
Yingyi Zhang, Junyi Li, Wenlin Zhang, Pengyue Jia, Xianneng Li, Yichao Wang, Derong Xu, Yi Wen, Huifeng Guo, Yong Liu, Xiangyu Zhao

TL;DR
This paper introduces RF-Mem, a dual-path memory retrieval system inspired by human cognition that adaptively switches between fast familiarity retrieval and deliberate recollection retrieval to personalize LLMs efficiently.
Contribution
It proposes a dual-path retrieval method, guided by familiarity uncertainty, that mimics human memory processes and adaptively balances recall quality against efficiency to improve personalization.
Findings
RF-Mem outperforms one-shot retrieval and full-context reasoning.
It maintains high performance under fixed budget and latency constraints.
Experiments validate the effectiveness across multiple benchmarks.
Abstract
Personalized large language models (LLMs) rely on memory retrieval to incorporate user-specific histories, preferences, and contexts. Existing approaches either overload the LLM by feeding all the user's past memory into the prompt, which is costly and unscalable, or simplify retrieval into a one-shot similarity search, which captures only surface matches. Cognitive science, however, shows that human memory operates through a dual process: Familiarity, offering fast but coarse recognition, and Recollection, enabling deliberate, chain-like reconstruction for deeply recovering episodic content. Current systems lack both the ability to perform recollection retrieval and mechanisms to adaptively switch between the dual retrieval paths, leading to either insufficient recall or the inclusion of noise. To address this, we propose RF-Mem (Recollection-Familiarity Memory Retrieval), a…
Peer Reviews
Decision: ICLR 2026 Poster
1. The paper effectively motivates its design using the Recollection–Familiarity Dual-Process Theory (Lines 61–83) and successfully maps the cognitive analogy into a computational retriever. 2. The familiarity-uncertainty gating (Lines 162–201) provides principled decision logic for choosing between retrieval modes, reducing unnecessary expansion while preserving robustness. 3. The adaptive study (Lines 432–454) shows that RF-Mem complements rather than replaces indexing approaches like Memory
1. The α-mix formula (Lines 246–257) is introduced but lacks theoretical clarity on why linear mixing is optimal for query expansion. 2. τ (Lines 174–201) is fixed globally but may vary significantly across users, domains, or embedding models. 3. Although the paper compares fairly against retrieval-only baselines (Lines 289–303), retrieval today heavily relies on query rewriting, which is absent from the comparisons. 4. The ethical considerations are brief and do not address challenges l
This paper presents a highly original and conceptually grounded contribution by introducing RF-Mem, a dual-process memory retrieval framework inspired by the Recollection–Familiarity theory in cognitive science. The idea of aligning LLM personalization with human memory processes is both novel and intellectually appealing, extending beyond conventional retrieval-augmented generation paradigms. Methodologically, the paper is well executed: the formulation of familiarity uncertainty through mean s
Despite its conceptual elegance, the paper’s empirical validation should be further strengthened. First, the experimental comparisons are restricted to retrieval-only baselines, omitting stronger LLM-based retrieval or reasoning systems such as query rewriting, iterative retrieval (e.g., Search-Mem), or graph-based methods. Including such more advanced baselines would better position RF-Mem within current state-of-the-art approaches and clarify whether its advantages hold beyond standard dense
1. The cognitive grounding is crisp and the gate is concrete (mean score + entropy with a sharpness λ), so the controller is easy to reason about. 2. Recollection is lightweight and model-agnostic since it lives entirely in vector space using simple clustering and linear mixing. 3. Compute stays bounded because the loop exposes explicit knobs for beam width, fanout, and depth instead of open-ended expansion.
1. The many thresholds and weights feel hand-tuned, with little guidance for auto-tuning across users and domains. 2. KMeans centroids can be unstable on anisotropic or overlapping memory clusters, so α-mixing may drift the query off-intention. 3. Entropy over the top-K score simplex depends on λ and retriever scale, which could make the gate brittle when swapping encoders or normalizations. 4. The controller assumes unit-normalized cosine and a sub-Gaussian mean-similarity story, which ma
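The reviews above describe two mechanisms only in words: a gate built from the mean top-K score plus an entropy term with sharpness λ, and a recollection step that linearly mixes the query with a cluster centroid. The following is a minimal Python sketch of how such pieces could look. It is not the authors' implementation. The function names, the threshold values, the decision rule (high mean and low entropy select familiarity), and the convention that α weights the original query are all assumptions made for illustration.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float], lam: float) -> List[float]:
    """Turn top-K similarity scores into a distribution; lam is the sharpness."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(lam * (s - m)) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def normalized_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy divided by log(K), so the result lies in [0, 1]."""
    k = len(probs)
    if k < 2:
        return 0.0
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return h / math.log(k)


def choose_path(
    top_k_scores: Sequence[float],
    lam: float = 10.0,               # hypothetical sharpness value
    mean_threshold: float = 0.5,     # hypothetical threshold
    entropy_threshold: float = 0.9,  # hypothetical threshold
) -> str:
    """Assumed rule: strong, peaked matches take the fast familiarity path;
    weak or flat score profiles fall back to recollection."""
    if not top_k_scores:
        raise ValueError("top_k_scores must not be empty")
    mean_score = sum(top_k_scores) / len(top_k_scores)
    entropy = normalized_entropy(softmax(top_k_scores, lam))
    if mean_score >= mean_threshold and entropy <= entropy_threshold:
        return "familiarity"
    return "recollection"


def alpha_mix(
    query: Sequence[float], centroid: Sequence[float], alpha: float = 0.5
) -> List[float]:
    """Linearly mix a query with a cluster centroid, then re-normalize to unit
    length. Assumption: alpha is the weight on the original query."""
    mixed = [alpha * q + (1.0 - alpha) * c for q, c in zip(query, centroid)]
    norm = math.sqrt(sum(x * x for x in mixed))
    if norm == 0.0:
        return list(mixed)
    return [x / norm for x in mixed]
```

For example, under these assumed defaults `choose_path([0.95, 0.6, 0.55])` selects the familiarity path, because the scores are high on average and one hit dominates, while `choose_path([0.6, 0.6, 0.6])` selects recollection, because a flat profile has maximal entropy. The sketch also makes the reviewers' brittleness concern concrete: both the entropy term and the mean threshold depend on the scale of the retriever's scores, so swapping encoders or normalizations would shift the gate's behavior.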
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Artificial Intelligence in Healthcare and Education
