Breaking Quadratic Barriers: A Non-Attention LLM for Ultra-Long Context Horizons
Andrew Kiruluta, Preethi Raju, Priscilla Burity

TL;DR
This paper introduces a non-attention architecture for large language models capable of processing extremely long contexts efficiently, overcoming quadratic complexity limitations of traditional Transformers.
Contribution
It proposes a combination of state space blocks, multi-resolution convolution layers, a recurrent supervisor, and retrieval-augmented external memory that handles ultra-long contexts without attention.
Findings
Handles hundreds of thousands to millions of tokens efficiently
Avoids quadratic memory and computation bottlenecks
Maintains global context with lightweight components
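To make the quadratic-versus-linear contrast in these findings concrete, the following back-of-the-envelope sketch counts stored values in two settings. The function names, the chunk size of 2048, and the embedding width of 512 are illustrative assumptions, not figures from the paper.

```python
def attention_score_entries(n_tokens: int) -> int:
    """Entries in one full self-attention score matrix (n x n)."""
    return n_tokens * n_tokens


def chunked_state_entries(n_tokens: int, chunk_size: int, embed_dim: int) -> int:
    """Entries held by a hypothetical chunked model.

    Assumption (not from the paper): the model keeps the activations of
    one chunk at a time plus one stored embedding per chunk.
    """
    n_chunks = -(-n_tokens // chunk_size)  # ceiling division
    return chunk_size * embed_dim + n_chunks * embed_dim


n = 1_000_000
print(attention_score_entries(n))           # grows as n squared
print(chunked_state_entries(n, 2048, 512))  # grows linearly in n
```

Under these assumed sizes, the attention score matrix for one million tokens has 10^12 entries, while the chunked count stays near 1.3 million, which is the kind of gap the findings refer to.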
Abstract
We present a novel non-attention-based architecture for large language models (LLMs) that efficiently handles very long context windows, on the order of hundreds of thousands to potentially millions of tokens. Unlike traditional Transformer designs, whose memory and computation costs grow quadratically with sequence length because of the self-attention mechanism, our model avoids token-to-token attention entirely. Instead, it combines four complementary components: State Space blocks (inspired by S4) that learn continuous-time convolution kernels and scale near-linearly with sequence length, Multi-Resolution Convolution layers that capture local context at different dilation levels, a lightweight Recurrent Supervisor that maintains a global hidden state across sequential chunks, and Retrieval-Augmented External Memory that stores and retrieves high-level chunk embeddings without…
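The way the four components named in the abstract could fit together can be sketched as a toy pipeline in pure Python. This is a minimal illustration under stated assumptions, not the authors' implementation: tokens are plain floats, the state space block is reduced to a scalar linear recurrence that restarts on each chunk, the supervisor is a simple exponential moving average, and memory lookup uses cosine similarity. All function names and default values are hypothetical.

```python
import math


def ssm_scan(x, a=0.9, b=0.1, c=1.0):
    """Scalar linear state space recurrence: h_t = a*h_{t-1} + b*x_t, y_t = c*h_t."""
    h, out = 0.0, []
    for x_t in x:
        h = a * h + b * x_t
        out.append(c * h)
    return out


def dilated_conv(x, kernel, dilation):
    """Causal 1-D convolution with implicit zero padding on the left."""
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:
                acc += w * x[idx]
        out.append(acc)
    return out


def multi_resolution(x, kernel=(0.5, 0.5), dilations=(1, 2, 4)):
    """Average the outputs of dilated convolutions at several dilation levels."""
    branches = [dilated_conv(x, kernel, d) for d in dilations]
    return [sum(vals) / len(branches) for vals in zip(*branches)]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    nu = math.sqrt(sum(p * p for p in u))
    nv = math.sqrt(sum(q * q for q in v))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return sum(p * q for p, q in zip(u, v)) / (nu * nv)


def process(tokens, chunk_size=4, decay=0.5):
    """Process a sequence chunk by chunk, never comparing token pairs."""
    state = [0.0, 0.0]   # global hidden state (assumed EMA supervisor)
    memory = []          # external memory: one embedding per chunk
    retrieved = []       # index of the most similar earlier chunk, or None
    for start in range(0, len(tokens), chunk_size):
        chunk = tokens[start:start + chunk_size]
        feats = multi_resolution(ssm_scan(chunk))
        emb = [sum(feats) / len(feats), max(feats)]  # toy chunk embedding
        if memory:
            best = max(range(len(memory)), key=lambda i: cosine(memory[i], emb))
            retrieved.append(best)
        else:
            retrieved.append(None)
        state = [decay * s + (1 - decay) * e for s, e in zip(state, emb)]
        memory.append(emb)
    return state, memory, retrieved


state, memory, retrieved = process([float(i) for i in range(10)])
print(len(memory), retrieved[0])
```

Each chunk costs a fixed amount of work, and retrieval compares one embedding against one stored vector per earlier chunk rather than against every earlier token, which is where a design like this would avoid the token-level quadratic term.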
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Image and Object Detection Techniques · Advanced Neural Network Applications
