Steerable Transformers for Volumetric Data

Soumyabrata Kundu; Risi Kondor

arXiv:2405.15932·cs.CV·October 28, 2025

TL;DR

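This paper presents Steerable Transformers, an extension of the Vision Transformer that maintains equivariance to the special Euclidean group SE(d). Its attention mechanism operates on features extracted by steerable convolutions and uses Fourier space nonlinearities; adding these layers to steerable convolutional networks improves performance in both two and three dimensions.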

Contribution

The paper introduces an SE(d)-equivariant attention mechanism that operates on features extracted by steerable convolutions and applies nonlinearities in Fourier space.

Findings

01. Adding steerable transformer layers to steerable convolutional networks enhances performance on both 2D and 3D datasets.

02. SE(d) equivariance improves model robustness.

03. Fourier space nonlinearities contribute to better feature learning.

Abstract

We introduce Steerable Transformers, an extension of the Vision Transformer mechanism that maintains equivariance to the special Euclidean group $SE(d)$. We propose an equivariant attention mechanism that operates on features extracted by steerable convolutions. Operating in Fourier space, our network utilizes Fourier space non-linearities. Our experiments in both two and three dimensions show that adding steerable transformer layers to steerable convolutional networks enhances performance.
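To give a feel for why attention computed on Fourier-space features can respect rotations, here is a toy sketch for the planar rotation part of $SE(2)$ only. It is not the paper's layer. It assumes, purely for illustration, that every token carries complex features of a single rotation frequency `freq`, so that rotating the input by `theta` multiplies each feature by the phase $e^{i \cdot \text{freq} \cdot \theta}$. The function names `attention_weights` and `rotate` and the sample values are hypothetical. Under that assumption the Hermitian inner product between a query and a key is unchanged by the rotation, so the softmax attention weights are rotation-invariant.

```python
import cmath
import math


def attention_weights(queries, keys):
    """Softmax over the real parts of Hermitian inner products <q_i, k_j>."""
    weights = []
    for q in queries:
        scores = [
            sum(a * b.conjugate() for a, b in zip(q, k)).real for k in keys
        ]
        top = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


def rotate(features, freq, theta):
    """Assumed action of a rotation by theta on frequency-`freq` features."""
    phase = cmath.exp(1j * freq * theta)
    return [[phase * c for c in token] for token in features]


# Hypothetical query/key features: 2 tokens, 2 complex channels each.
queries = [[1 + 2j, 0.5 - 1j], [-0.3 + 0.1j, 2 + 0j]]
keys = [[0.2 - 0.4j, 1 + 1j], [1.5 + 0j, -0.7 + 0.3j]]

before = attention_weights(queries, keys)
after = attention_weights(rotate(queries, 2, 0.7), rotate(keys, 2, 0.7))

# Largest change in any attention weight caused by the rotation.
max_diff = max(
    abs(a - b) for row_a, row_b in zip(before, after) for a, b in zip(row_a, row_b)
)
```

The phase on the query cancels against the conjugated phase on the key, which is why `max_diff` stays at floating-point noise. Handling translations, mixed frequencies, and the three-dimensional case requires the machinery described in the paper itself.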

Taxonomy

Topics: Material Properties and Processing

Methods: Attention Is All You Need · Byte Pair Encoding · Label Smoothing · Adam · Position-Wise Feed-Forward Layer · Dropout · Dense Connections · Absolute Position Encodings · Softmax · Layer Normalization