Echo: KV-Cache-Free Associative Recall with Spectral Koopman Operators
Anupama Sridhar, Alexander Johansen

TL;DR
Echo introduces Spectral Koopman Attention, a memory-efficient, cache-free associative recall method that significantly improves retrieval accuracy in long-context reasoning tasks without increasing memory usage.
Contribution
The paper presents a novel Spectral Koopman Attention mechanism enabling constant-memory associative recall, overcoming the memory cliff of state-space models and outperforming traditional attention in long-context benchmarks.
Findings
Achieves 100% retrieval accuracy on the Multi-Query Associative Recall benchmark.
Outperforms pure SSM and hybrid models across multiple transfer benchmarks.
Maintains constant inference memory while improving retrieval performance.
Abstract
Long chain-of-thought reasoning and agentic tool-calling produce traces spanning tens of thousands of tokens, yet Transformer KV caches grow linearly with sequence length, creating a memory bottleneck on commodity hardware. State-space models offer constant-memory recurrence but suffer a memory cliff: retrieval accuracy collapses once the gap between a stored fact and its query exceeds the effective horizon of the recurrent state. We introduce Echo, a KV-cache-free associative recall architecture built around Spectral Koopman Attention (SKA), a drop-in replacement for attention layers that augments SSM blocks with a closed-form dynamical operator whose sufficient statistics are accumulated in constant memory with no KV cache. Echo fits a spectral linear system to the key and value history via kernel ridge regression and retrieves through a learned power-iterated filter, all from…
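The idea of accumulating sufficient statistics in constant memory, rather than caching every key and value, can be illustrated with a minimal sketch. This is not the authors' SKA implementation: it uses plain linear ridge regression in place of the kernel ridge regression and power-iterated filter named in the abstract, and the class name `StreamingRidgeMemory`, the dimensions, and the regularizer `lam` are all hypothetical choices made for illustration.

```python
import numpy as np


class StreamingRidgeMemory:
    """Toy associative memory (illustrative only, not the paper's SKA).

    Stores two running sums whose size depends on the feature dimensions,
    not on how many key/value pairs have been written.
    """

    def __init__(self, d_key, d_val, lam=1e-3):
        self.G = np.zeros((d_key, d_key))  # running sum of k k^T
        self.C = np.zeros((d_key, d_val))  # running sum of k v^T
        self.lam = lam                     # ridge regularizer (assumed value)

    def write(self, k, v):
        # Constant-memory update: no per-token cache is kept.
        self.G += np.outer(k, k)
        self.C += np.outer(k, v)

    def read(self, q):
        # Closed-form ridge solution W = (G + lam I)^{-1} C, then project q.
        d = self.G.shape[0]
        W = np.linalg.solve(self.G + self.lam * np.eye(d), self.C)
        return q @ W


rng = np.random.default_rng(0)
d_key, d_val, n_pairs = 64, 16, 32

# Random unit-norm keys and random values to store.
keys = rng.standard_normal((n_pairs, d_key))
keys /= np.linalg.norm(keys, axis=1, keepdims=True)
vals = rng.standard_normal((n_pairs, d_val))

mem = StreamingRidgeMemory(d_key, d_val, lam=1e-6)
for k, v in zip(keys, vals):
    mem.write(k, v)

# Query with the first stored key and compare against its stored value.
recalled = mem.read(keys[0])
err = np.linalg.norm(recalled - vals[0])
print(f"recall error for first pair: {err:.2e}")
```

In this toy setting the state is a `d_key × d_key` matrix plus a `d_key × d_val` matrix regardless of sequence length, and recall is near-exact as long as the number of stored pairs stays below `d_key` and the keys are linearly independent. How Echo extends this kind of closed-form fit to long contexts is described in the paper itself, not in this sketch.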
