ImplicitMemBench: Measuring Unconscious Behavioral Adaptation in Large Language Models
Chonghan Qin, Xiachong Feng, Weitao Ma, Xiaocheng Feng, Lingpeng Kong

TL;DR
ImplicitMemBench introduces a novel benchmark to evaluate unconscious, automatic behaviors in large language models, revealing significant limitations and bottlenecks in implicit memory capabilities.
Contribution
This work pioneers a systematic evaluation of implicit memory in LLMs using cognitively grounded constructs, highlighting current models' deficiencies.
Findings
No model exceeds 66% overall performance.
Top models are far below human baseline.
Universal bottlenecks suggest need for architectural innovations.
Abstract
Existing memory benchmarks for LLM agents evaluate explicit recall of facts, yet overlook implicit memory where experience becomes automated behavior without conscious retrieval. This gap is critical: effective assistants must automatically apply learned procedures or avoid failed actions without explicit reminders. We introduce ImplicitMemBench, the first systematic benchmark evaluating implicit memory through three cognitively grounded constructs drawn from standard cognitive-science accounts of non-declarative memory: Procedural Memory (one-shot skill acquisition after interference), Priming (theme-driven bias via paired experimental/control instances), and Classical Conditioning (Conditioned Stimulus--Unconditioned Stimulus (CS--US) associations shaping first decisions). Our 300-item suite employs a unified Learning/Priming-Interfere-Test protocol with first-attempt scoring.…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
