TCP-SSM: Efficient Vision State Space Models with Token-Conditioned Poles

Sara Shoouri; Morteza Tavakoli Taba; Hun-Seok Kim

arXiv:2605.11563·cs.CV·May 13, 2026

TL;DR

TCP-SSM introduces a structured, interpretable state space model with token-conditioned poles, enhancing efficiency and adaptability in vision tasks while maintaining high accuracy.

Contribution

The paper proposes a Token-Conditioned Poles SSM framework for vision tasks that models recurrence dynamics explicitly, using stable poles with token-dependent adaptation.

Findings

01. Reduces SSM computational complexity by up to 44%.

02. Maintains or surpasses baseline accuracy across vision benchmarks.

03. Provides interpretable recurrence dynamics through stable poles.

Abstract

State Space Models (SSMs) have emerged as a compelling alternative to attention models for long-range vision tasks, offering input-dependent recurrence with linear complexity. However, most efficient SSM variants reduce computation cost by modifying scan routes, resolutions, or traversal patterns, while largely leaving the recurrent dynamics implicit. Consequently, the model's state-dependent memory behavior is difficult to control, particularly in compact backbones where long scan paths can exceed the effective memory horizon. We propose Token-Conditioned Poles SSM (TCP-SSM), a structured selective SSM framework that improves efficiency while making recurrence dynamics explicit and interpretable through stable poles. TCP-SSM builds each scan operator with 1) real poles that model monotone or sign-alternating decay, and 2) complex-conjugate poles that capture damped oscillatory…
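
As a rough illustration of the two pole types the abstract names, the sketch below simulates the impulse response of a scalar recurrence with a real pole and of a second-order recurrence with a complex-conjugate pole pair. It is not the authors' implementation: the function names are hypothetical, the poles are fixed rather than token-conditioned, and `memory_horizon` (with its threshold `eps`) is one assumed way to quantify how long a pole retains information.

```python
import math


def real_pole_response(a, steps):
    """Impulse response of h[t] = a * h[t-1] + x[t] for a real pole a.

    0 < a < 1 gives monotone decay; -1 < a < 0 gives sign-alternating decay.
    """
    if not abs(a) < 1.0:
        raise ValueError("pole must lie strictly inside the unit circle")
    h, out = 0.0, []
    for t in range(steps):
        x = 1.0 if t == 0 else 0.0  # unit impulse at t = 0
        h = a * h + x
        out.append(h)
    return out


def conjugate_pair_response(r, theta, steps):
    """Impulse response of a real second-order recurrence whose poles are
    r * exp(+i * theta) and r * exp(-i * theta):

        y[t] = 2 r cos(theta) y[t-1] - r^2 y[t-2] + x[t]

    For 0 < r < 1 this is a damped oscillation.
    """
    if not 0.0 <= r < 1.0:
        raise ValueError("pole radius must satisfy 0 <= r < 1")
    c1, c2 = 2.0 * r * math.cos(theta), -r * r
    y1 = y2 = 0.0  # y[t-1] and y[t-2]
    out = []
    for t in range(steps):
        x = 1.0 if t == 0 else 0.0
        y = c1 * y1 + c2 * y2 + x
        out.append(y)
        y2, y1 = y1, y
    return out


def memory_horizon(r, eps=1e-3):
    """Smallest t with r**t <= eps (assumed definition, for illustration)."""
    if not 0.0 < r < 1.0:
        raise ValueError("pole magnitude must satisfy 0 < r < 1")
    return math.ceil(math.log(eps) / math.log(r))


if __name__ == "__main__":
    print(real_pole_response(0.5, 4))    # monotone decay
    print(real_pole_response(-0.5, 4))   # sign-alternating decay
    print(conjugate_pair_response(0.9, math.pi / 4, 8))  # damped oscillation
    print(memory_horizon(0.5), memory_horizon(0.99))
```

A pole of 0.5 produces a monotonically decaying response, a pole of -0.5 produces one that flips sign at every step, and a pole magnitude closer to 1 lengthens the horizon, which is the quantity a long scan path can exceed.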
