MambaEye: A Size-Agnostic Visual Encoder with Causal Sequential Processing
Changho Choi, Minho Kim, Jinkyu Kim

TL;DR
MambaEye is a size-agnostic, causal visual encoder that combines strictly unidirectional processing with relative move embedding to process images of arbitrary resolution efficiently, and it reports strong performance on high-resolution image classification.
Contribution
The paper introduces MambaEye, a novel size-agnostic, causal sequential encoder trained with a diffusion-inspired loss, enabling efficient high-resolution image processing.
Findings
Robust performance on high-resolution ImageNet-1K classification.
Maintains linear time and memory complexity with respect to input size.
Effective across a wide range of image resolutions.
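The linear-complexity and causality claims above can be illustrated with a toy recurrence. This is a minimal sketch and not the Mamba2 backbone: the function name `causal_scan`, the scalar state, and the fixed `decay` parameter are all assumptions made for illustration. It only shows the two properties the findings rely on: one constant-cost update per input token (so time grows linearly with sequence length while the state stays fixed-size), and an output available after every token that depends only on the tokens seen so far.

```python
def causal_scan(tokens, decay=0.9):
    """Toy causal linear recurrence (hypothetical stand-in, NOT Mamba2).

    h_t = decay * h_{t-1} + (1 - decay) * x_t
    Each step costs O(1) and touches only a fixed-size state, so a
    sequence of length n costs O(n) time and O(1) state memory.
    """
    state = 0.0
    outputs = []
    for x in tokens:
        # The update reads only the previous state and the current token,
        # so the output at step t never depends on later tokens.
        state = decay * state + (1.0 - decay) * x
        outputs.append(state)  # a readout is available at every step
    return outputs


if __name__ == "__main__":
    full = causal_scan([1.0, 2.0, 3.0, 4.0])
    prefix = causal_scan([1.0, 2.0])
    # Causality: processing a prefix gives the same outputs as the
    # first steps of processing the full sequence.
    print(full[:2] == prefix)
```

Because the outputs for a prefix match the first outputs of the longer sequence, a prediction can be read out at any point in the scan, which is the behavior the unidirectional design is described as preserving.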
Abstract
Despite decades of progress, a truly input-size agnostic visual encoder, a fundamental characteristic of human vision, has remained elusive. We address this limitation by proposing MambaEye, a novel, causal sequential encoder that leverages the low complexity and causal-process-based pure Mamba2 backbone. Unlike previous Mamba-based vision encoders that often employ bidirectional processing, our strictly unidirectional approach preserves the inherent causality of State Space Models, enabling the model to generate a prediction at any point in its input sequence. A core innovation is our use of relative move embedding, which encodes the spatial shift between consecutive patches, providing a strong inductive bias for translation invariance and making the model inherently adaptable to arbitrary image resolutions and scanning patterns. To achieve this, we introduce a novel…
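The idea of encoding the spatial shift between consecutive patches can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: the helper names `relative_moves` and `raster_path` are hypothetical, a raster scan is used only as one example scanning pattern, and how the paper turns each shift into an embedding vector is not described in this summary and is not reproduced here.

```python
def relative_moves(path):
    """Return the (d_row, d_col) shift between consecutive patch positions.

    `path` is a list of (row, col) patch coordinates in scan order.
    Hypothetical helper for illustration only.
    """
    return [(r1 - r0, c1 - c0) for (r0, c0), (r1, c1) in zip(path, path[1:])]


def raster_path(rows, cols, row_offset=0, col_offset=0):
    """Row-major scan over a rows x cols patch grid, optionally translated."""
    return [(r + row_offset, c + col_offset)
            for r in range(rows) for c in range(cols)]


if __name__ == "__main__":
    base = relative_moves(raster_path(2, 3))
    shifted = relative_moves(raster_path(2, 3, row_offset=5, col_offset=7))
    print(base)             # shifts between consecutive patches
    print(base == shifted)  # translating the whole path leaves the moves unchanged
```

Since only differences between positions are used, translating the whole scan path leaves the move sequence unchanged, and no absolute grid size enters the computation. That is one way to read the translation-invariance and resolution-adaptability argument in the abstract.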
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Advanced Neural Network Applications · Advanced Memory and Neural Computing
