Surface Vision Mamba: Leveraging Bidirectional State Space Model for Efficient Spherical Manifold Representation

Rongzhao He; Weihao Zheng; Leilei Zhao; Ying Wang; Dalin Zhu; Dan Wu; Bin Hu

arXiv:2501.14679·cs.CV·February 21, 2025

Open Access

TL;DR

Surface Vision Mamba introduces an efficient, attention-free model for analyzing spherical cortical surface data, significantly reducing inference time and memory usage while outperforming existing attention-based methods in neurodevelopmental tasks.

Contribution

The paper presents Surface Vision Mamba, a novel bidirectional state space model for spherical data that is domain-agnostic and more efficient than attention-based models.
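As a rough illustration of what "bidirectional state space model" means here, the sketch below runs a plain linear state space recurrence over a token sequence once forward and once backward, then sums the two outputs. It is a simplified stand-in, not the paper's implementation: the matrices `A`, `B`, `C` are fixed rather than input-dependent as in Mamba, the function names `ssm_scan` and `bidirectional_scan` are hypothetical, and combining the directions by summation is an assumption.

```python
import numpy as np


def ssm_scan(x, A, B, C):
    """Linear state space recurrence over a sequence.

    x: (L, D) sequence of L tokens with D features.
    A: (N, N) state transition, B: (N, D) input map, C: (D, N) output map.
    Computes h_t = A h_{t-1} + B x_t and y_t = C h_t, returning y of shape (L, D).
    """
    h = np.zeros(A.shape[0])
    out = np.empty_like(x, dtype=float)
    for t in range(x.shape[0]):
        h = A @ h + B @ x[t]
        out[t] = C @ h
    return out


def bidirectional_scan(x, fwd_params, bwd_params):
    """Scan the sequence in both directions and sum the results (assumed merge rule)."""
    y_forward = ssm_scan(x, *fwd_params)
    # Reverse the sequence, scan, then restore the original token order.
    y_backward = ssm_scan(x[::-1], *bwd_params)[::-1]
    return y_forward + y_backward


if __name__ == "__main__":
    # Toy example: a single impulse at the middle token of a length-3 sequence.
    x = np.array([[0.0], [1.0], [0.0]])
    params = (np.array([[0.5]]), np.array([[1.0]]), np.array([[1.0]]))
    print(bidirectional_scan(x, params, params).ravel())
```

Each scan visits every token once, so the cost grows linearly with sequence length, whereas self-attention compares every pair of tokens. The backward pass lets tokens earlier in the sequence receive information from later ones, which a single forward recurrence cannot provide.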

Findings

01

Achieves 4.8x faster inference than attention-based models.

02

Uses 91.7% less memory compared to Surface Vision Transformer.

03

Outperforms existing methods in neurodevelopmental phenotype regression.

Abstract

Attention-based methods have demonstrated exceptional performance in modelling long-range dependencies on spherical cortical surfaces, surpassing traditional Geometric Deep Learning (GDL) models. However, their extensive inference time and high memory demands pose challenges for application to large datasets with limited computing resources. Inspired by the state space model in computer vision, we introduce the attention-free Vision Mamba (Vim) to spherical surfaces, presenting a domain-agnostic architecture for analyzing data on spherical manifolds. Our method achieves surface patching by representing spherical data as a sequence of triangular patches derived from a subdivided icosphere. The proposed Surface Vision Mamba (SiM) is evaluated on multiple neurodevelopmental phenotype regression tasks using cortical surface metrics from neonatal brains. Experimental results demonstrate that…
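The surface patching step mentioned in the abstract can be made concrete with some counting. The sketch below computes how many triangular patches a subdivided icosphere yields and how many vertices each patch covers, using the standard icosphere formulas (20·4^n faces and 10·4^n + 2 vertices after n subdivisions). The function names are hypothetical, and the specific resolution levels and channel count in the usage example are illustrative assumptions, not values taken from this page.

```python
def icosphere_counts(level: int) -> tuple[int, int]:
    """Return (vertices, faces) of an icosahedron subdivided `level` times."""
    if level < 0:
        raise ValueError("level must be non-negative")
    faces = 20 * 4 ** level
    vertices = 10 * 4 ** level + 2
    return vertices, faces


def vertices_per_patch(data_level: int, patch_level: int) -> int:
    """Vertices covered by one triangular patch, border vertices included.

    Each face of the coarse (patch-level) icosphere is subdivided
    `data_level - patch_level` more times to reach the data resolution.
    """
    k = data_level - patch_level
    if k < 0:
        raise ValueError("data_level must be >= patch_level")
    edge = 2 ** k + 1  # vertices along one edge of the patch
    return edge * (edge + 1) // 2  # triangular number


def patch_sequence_shape(data_level: int, patch_level: int, channels: int) -> tuple[int, int]:
    """Shape (sequence length, token dimension) after flattening each patch."""
    _, n_patches = icosphere_counts(patch_level)
    return n_patches, vertices_per_patch(data_level, patch_level) * channels


if __name__ == "__main__":
    # Assumed setup: data on a level-6 icosphere, patches from a level-2
    # icosphere, 4 surface metrics per vertex.
    print(icosphere_counts(6))
    print(patch_sequence_shape(6, 2, 4))
```

Under these assumed settings the sphere becomes a sequence of 320 tokens, each holding 153 vertices times 4 channels. Neighbouring patches share their border vertices, so the per-patch vertex counts sum to more than the total number of vertices on the sphere.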

Taxonomy

Topics: Image Processing and 3D Reconstruction · Image and Object Detection Techniques · 3D Shape Modeling and Analysis

Methods: Attention Is All You Need · Softmax · Residual Connection · Dropout · Absolute Position Encodings · Byte Pair Encoding · Linear Layer · Vision Transformer · Multi-Head Attention · Position-Wise Feed-Forward Layer