Emerging Human-like Strategies for Semantic Memory Foraging in Large Language Models
Eric Lacosse, Mariana Duarte, Peter M. Todd, Daniel C. McNamee

TL;DR
This paper investigates how large language models exhibit human-like semantic memory foraging strategies during the Semantic Fluency Task, using interpretability techniques to analyze emergent search behaviors across model layers.
Contribution
It introduces a novel application of mechanistic interpretability to study semantic memory foraging in LLMs through the Semantic Fluency Task, revealing human-like strategic patterns.
Findings
Emergence of human-like convergent and divergent search patterns in LLMs.
Identification of these patterns across different model layers.
Insights into aligning LLM behavior with human cognitive strategies.
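To make the convergent/divergent distinction in the findings above concrete, here is a minimal sketch of how a fluency response list can be split into semantic "patches". It is not the authors' method: the subcategory labels, the function names (segment_into_patches, summarize), and the rule that a step is "convergent" when two consecutive items share a subcategory are all assumptions chosen for illustration, loosely following the cluster/switch scoring used in human fluency research.

```python
# Hypothetical hand-made subcategory labels for an "animals" fluency task.
# These are illustrative assumptions, not data or categories from the paper.
SUBCATEGORIES = {
    "dog": {"pets"},
    "cat": {"pets"},
    "hamster": {"pets"},
    "lion": {"african"},
    "zebra": {"african"},
    "giraffe": {"african"},
    "shark": {"water"},
    "whale": {"water"},
}


def segment_into_patches(responses, subcategories=SUBCATEGORIES):
    """Split a response list into runs whose consecutive items share a label."""
    patches = []
    for word in responses:
        labels = subcategories.get(word, set())
        previous = subcategories.get(patches[-1][-1], set()) if patches else set()
        if patches and labels & previous:
            patches[-1].append(word)  # convergent step: stays in the current patch
        else:
            patches.append([word])    # divergent step: opens a new patch
    return patches


def summarize(responses):
    """Report patches, number of switches, and mean patch size."""
    patches = segment_into_patches(responses)
    return {
        "patches": patches,
        "switches": max(len(patches) - 1, 0),
        "mean_patch_size": len(responses) / len(patches) if patches else 0.0,
    }


example = ["dog", "cat", "lion", "zebra", "giraffe", "shark", "whale", "hamster"]
result = summarize(example)
print(result)
```

Under these assumptions, long patches with few switches would indicate mostly convergent search, while many short patches would indicate mostly divergent search. A real analysis would likely replace the hand-made labels with a similarity measure over embeddings or model activations.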
Abstract
Both humans and Large Language Models (LLMs) store a vast repository of semantic memories. In humans, efficient and strategic access to this memory store is a critical foundation for a variety of cognitive functions. Such access has long been a focus of psychology and the computational mechanisms behind it are now well characterized. Much of this understanding has been gleaned from a widely-used neuropsychological and cognitive science assessment called the Semantic Fluency Task (SFT), which requires the generation of as many semantically constrained concepts as possible. Our goal is to apply mechanistic interpretability techniques to bring greater rigor to the study of semantic memory foraging in LLMs. To this end, we present preliminary results examining SFT as a case study. A central focus is on convergent and divergent patterns of generative memory search, which in humans play…
Taxonomy
Topics: Neurobiology of Language and Bilingualism · Multimodal Machine Learning Applications · Explainable Artificial Intelligence (XAI)
