Sessa: Selective State Space Attention

Liubomyr Horbatko

arXiv:2604.18580 · cs.LG · April 22, 2026

TL;DR

Sessa is a decoder architecture that places attention inside a recurrent feedback loop, with the aim of retaining and selectively retrieving long-range information better than standard Transformers and state-space models.

Contribution

The paper presents Sessa, a new model that combines attention and recurrence, achieving power-law memory decay and flexible long-range information retrieval.

Findings

01

Sessa achieves power-law memory tails with decay rate $O(\ell^{-\beta})$

02

Sessa outperforms baselines on long-context benchmarks

03

Sessa maintains competitive performance on short-context tasks
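
To make the first finding concrete, a small numerical comparison of a power-law tail against an exponential tail may help. This is only an illustration of the two functional forms: the helper names `power_law_tail` and `exponential_tail` are hypothetical, and the values of `BETA` and `RATE` are illustrative assumptions, not numbers from the paper.

```python
import math


def power_law_tail(lag: int, beta: float) -> float:
    """Tail of the form lag**(-beta), defined for lag >= 1."""
    return lag ** (-beta)


def exponential_tail(lag: int, rate: float) -> float:
    """Tail of the form exp(-rate * lag)."""
    return math.exp(-rate * lag)


BETA = 1.0   # illustrative assumption, not a value from the paper
RATE = 0.01  # illustrative assumption, not a value from the paper

# Print both tails at increasing lags to compare how fast they shrink.
for lag in (10, 100, 1_000, 10_000):
    p = power_law_tail(lag, BETA)
    e = exponential_tail(lag, RATE)
    print(f"lag={lag:>6}  power-law={p:.3e}  exponential={e:.3e}")
```

With these particular constants the exponential tail is larger at short lags, but it falls below the power-law tail once the lag is large enough, which is the sense in which a power-law tail retains more long-range influence.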

Abstract

Modern sequence modeling is dominated by two families: Transformers, whose self-attention can access arbitrary elements of the visible sequence, and structured state-space models, which propagate information through an explicit recurrent state. These mechanisms face different limitations on long contexts: when attention is diffuse, the influence of individual tokens is diluted across the effective support, while recurrent state propagation can lose long-range sensitivity unless information is actively preserved. As a result, both mechanisms face challenges in preserving and selectively retrieving information over long contexts. We propose Sessa, a decoder that places attention inside a recurrent feedback path. This creates many attention-based paths through which past tokens can influence future states, rather than relying on a single attention read or a single recurrent chain. We prove…
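
The contrast the abstract draws between a single attention read, a single recurrent chain, and attention inside a feedback path can be sketched with a toy scalar model. This is not the paper's formulation: the scalar states, the uniform (maximally diffuse) causal attention weights, the `gain` parameter, and all function names below are assumptions chosen only to keep the example small.

```python
def single_read_response(inputs):
    """One attention read: y_t is the mean of x_0..x_t (uniform causal weights)."""
    outputs, running_sum = [], 0.0
    for t, x in enumerate(inputs):
        running_sum += x
        outputs.append(running_sum / (t + 1))
    return outputs


def single_chain_response(inputs, gain):
    """One recurrent chain: s_t = x_t + gain * s_{t-1}."""
    states, prev = [], 0.0
    for x in inputs:
        prev = x + gain * prev
        states.append(prev)
    return states


def attention_feedback_response(inputs, gain):
    """Attention in the feedback path (toy version):
    s_t = x_t + gain * mean(s_0..s_{t-1}), so each state reads past *states*."""
    states, running_sum = [], 0.0
    for t, x in enumerate(inputs):
        read = running_sum / t if t > 0 else 0.0  # uniform attention over past states
        s = x + gain * read
        states.append(s)
        running_sum += s
    return states


GAIN = 0.5  # illustrative assumption
impulse = [1.0] + [0.0] * 1000  # a single token at position 0, then silence

read = single_read_response(impulse)
chain = single_chain_response(impulse, GAIN)
feedback = attention_feedback_response(impulse, GAIN)

# Influence of the first token on the output at selected lags.
for lag in (1, 10, 100, 1000):
    print(f"lag={lag:>4}  read={read[lag]:.3e}  chain={chain[lag]:.3e}  feedback={feedback[lag]:.3e}")
```

In this toy, the first token reaches a later state in the feedback variant through every intermediate state it has already influenced, whereas the chain variant passes it through exactly one path whose weight shrinks geometrically with the lag.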

Code & Models

Repositories

libratioai/sessa (GitHub)