Steerable Transformers for Volumetric Data
Soumyabrata Kundu, Risi Kondor

TL;DR
This paper presents Steerable Transformers, an extension of Vision Transformers that maintains SE(d) equivariance, improving performance on volumetric data by integrating steerable convolutions and Fourier space nonlinearities.
Contribution
It introduces an SE(d)-equivariant attention mechanism using steerable convolutions and Fourier space nonlinearities, enhancing volumetric data processing.
Findings
Adding steerable transformer layers to steerable convolutional networks enhances performance on both 2D and 3D datasets.
SE(d) equivariance improves model robustness.
Fourier space nonlinearities contribute to better feature learning.
Abstract
We introduce Steerable Transformers, an extension of the Vision Transformer mechanism that maintains equivariance to the special Euclidean group SE(d). We propose an equivariant attention mechanism that operates on features extracted by steerable convolutions. Operating in Fourier space, our network utilizes Fourier space non-linearities. Our experiments in both two and three dimensions show that adding steerable transformer layers to steerable convolutional networks enhances performance.
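To make the equivariance claim concrete, here is a minimal 2D toy sketch. It is not the paper's attention mechanism. It assumes each token carries complex Fourier-space features indexed by an integer frequency k, and that a rotation by angle theta multiplies the frequency-k channel by exp(-i k theta). The names `rotate_features` and `toy_attention` are hypothetical, and positional information is ignored. Under these assumptions the attention scores are rotation invariant, so the output rotates exactly as the input does.

```python
import numpy as np

def rotate_features(x, freqs, theta):
    # Assumed transformation law: the frequency-k channel picks up
    # the phase exp(-i * k * theta) under a rotation by theta.
    return x * np.exp(-1j * freqs * theta)

def toy_attention(q, k, v):
    # Scores: real part of the Hermitian inner product over channels.
    # The rotation phases cancel between q and conj(k), so the scores
    # are invariant to the rotation.
    scores = np.real(q @ k.conj().T)
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    # Real-valued weights mix the values token-wise, leaving the
    # per-channel phases of v untouched.
    return weights @ v

rng = np.random.default_rng(0)
freqs = np.array([0, 1, 2, 3])  # illustrative frequency cutoff
n_tokens = 5
shape = (n_tokens, freqs.size)
q = rng.normal(size=shape) + 1j * rng.normal(size=shape)
k = rng.normal(size=shape) + 1j * rng.normal(size=shape)
v = rng.normal(size=shape) + 1j * rng.normal(size=shape)

theta = 0.7
# Rotate the output of attention ...
out_then_rot = rotate_features(toy_attention(q, k, v), freqs, theta)
# ... versus running attention on rotated inputs.
rot_then_out = toy_attention(
    rotate_features(q, freqs, theta),
    rotate_features(k, freqs, theta),
    rotate_features(v, freqs, theta),
)
equivariance_error = np.max(np.abs(out_then_rot - rot_then_out))
print(equivariance_error)
```

The two paths should agree up to floating-point error. A nonlinearity applied channel-wise to the complex features would in general break this property, which is one reason to design the nonlinearity with the transformation law in mind.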
Taxonomy
Topics: Material Properties and Processing
Methods: Attention Is All You Need · Byte Pair Encoding · Label Smoothing · Adam · Position-Wise Feed-Forward Layer · Dropout · Dense Connections · Absolute Position Encodings · Softmax · Layer Normalization
