Deep sequence models tend to memorize geometrically; it is unclear why
Shahriar Noroozizadeh, Vaishnavh Nagarajan, Elan Rosenfeld, Sanjiv Kumar

TL;DR
Deep sequence models develop a geometric memory: embeddings that encode global relationships between entities, including ones that never co-occur in training. This turns complex reasoning tasks into easy navigation and challenges the traditional associative-memory view.
Contribution
The paper introduces the concept of geometric memory in deep sequence models, contrasting it with associative memory, and analyzes its origins and implications for neural embedding geometries.
Findings
Models encode global relationships as geometric memory.
Geometric memory simplifies complex reasoning into navigation tasks.
Spectral bias contributes to the emergence of geometric memory.
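To make the contrast in these findings concrete, here is a minimal toy sketch. It is not the paper's setup or code. It assumes a hidden path graph over shuffled node labels, stores the edges as a lookup table (the associative view), and then computes a one-dimensional spectral embedding from an explicit graph Laplacian (standing in for a learned geometry). All function names are hypothetical.

```python
import numpy as np


def path_edges(order):
    """Edges of a path that visits the nodes in `order` (illustrative helper)."""
    return list(zip(order[:-1], order[1:]))


def associative_hops(edges, start, hops):
    """Associative view: follow one stored edge per hop, so a k-hop query costs k lookups."""
    next_node = dict(edges)
    node = start
    for _ in range(hops):
        node = next_node[node]
    return node


def fiedler_embedding(edges, n):
    """Geometric view: 1-D spectral embedding from the graph Laplacian L = D - A.

    Returns the eigenvector of the second-smallest eigenvalue (the Fiedler vector).
    """
    adjacency = np.zeros((n, n))
    for u, v in edges:
        adjacency[u, v] = adjacency[v, u] = 1.0
    laplacian = np.diag(adjacency.sum(axis=1)) - adjacency
    _, eigvecs = np.linalg.eigh(laplacian)  # eigenvalues come back in ascending order
    return eigvecs[:, 1]


n = 8
rng = np.random.default_rng(0)
order = [int(x) for x in rng.permutation(n)]  # hidden path order over shuffled labels
edges = path_edges(order)                     # only local, adjacent pairs are stored

# Associative memory: reach the node 3 hops away by chaining 3 lookups.
three_hops_away = associative_hops(edges, order[0], 3)

# Geometric memory: sorting nodes by their embedding coordinate recovers the
# global order (up to reversal), including pairs that never share an edge.
embedding = fiedler_embedding(edges, n)
recovered = [int(i) for i in np.argsort(embedding)]
```

For a path graph the Fiedler vector is monotone along the path, so `recovered` matches `order` or its reverse. The sketch only shows what a spectral geometry offers over a lookup table; it says nothing about why trained sequence models arrive at one, which is the question the paper raises.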
Abstract
Deep sequence models are said to store atomic facts predominantly in the form of associative memory: a brute-force lookup of co-occurring entities. We identify a dramatically different form of storage of atomic facts that we term geometric memory. Here, the model has synthesized embeddings encoding novel global relationships between all entities, including ones that do not co-occur in training. Such storage is powerful: for instance, we show how it transforms a hard reasoning task involving an ℓ-fold composition into an easy-to-learn 1-step navigation task. From this phenomenon, we extract fundamental aspects of neural embedding geometries that are hard to explain. We argue that the rise of such a geometry, as against a lookup of local associations, cannot be straightforwardly attributed to typical supervisory, architectural, or optimizational pressures. Counterintuitively,…
