ModeT: Learning Deformable Image Registration via Motion Decomposition Transformer
Haiqiao Wang, Dong Ni, and Yi Wang

TL;DR
ModeT introduces a novel Transformer-based approach for medical image registration that explicitly models multiple deformation modes and outperforms existing methods on brain MRI datasets.
Contribution
The paper proposes ModeT, a Transformer architecture that explicitly decomposes and models multiple motion modalities for improved deformable image registration.
Findings
Outperforms state-of-the-art registration networks
Effectively models multiple motion modes
Demonstrates superior performance on brain MRI datasets
Abstract
Transformer structures have been widely used in computer vision and have recently made an impact in medical image registration. However, most registration networks use the Transformer in a straightforward way: they merely use the attention mechanism to boost feature learning, as segmentation networks do, without a design sufficiently adapted to the registration task. In this paper, we propose a novel motion decomposition Transformer (ModeT) to explicitly model multiple motion modalities by fully exploiting the intrinsic capability of the Transformer structure for deformation estimation. The proposed ModeT naturally transforms the multi-head neighborhood attention relationship into the multi-coordinate relationship to model multiple motion modes. Then the competitive weighting module (CWM) fuses multiple deformation sub-fields to generate…
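The two steps named in the abstract (turning per-head neighborhood attention into per-head displacement sub-fields, then fusing them) can be sketched roughly as follows. This is a minimal 2D NumPy illustration under stated assumptions, not the authors' implementation: the function names `motion_subfields` and `fuse_subfields`, the `radius` parameter, the scaled dot-product scoring, and the plain softmax-over-heads fusion (a stand-in for the CWM, whose details the truncated abstract does not give) are all hypothetical.

```python
import numpy as np

def softmax(x, axis):
    # Numerically stable softmax along one axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def neighborhood_offsets(radius):
    # All (dy, dx) offsets in a (2r+1) x (2r+1) window, shape (K, 2).
    r = np.arange(-radius, radius + 1)
    dy, dx = np.meshgrid(r, r, indexing="ij")
    return np.stack([dy.ravel(), dx.ravel()], axis=-1).astype(float)

def motion_subfields(fixed, moving, num_heads, radius=1):
    """Assumed sketch: one displacement sub-field per attention head.

    fixed, moving: feature maps of shape (C, H, W), C divisible by num_heads.
    Returns an array of shape (num_heads, H, W, 2).
    """
    C, H, W = fixed.shape
    d = C // num_heads
    offsets = neighborhood_offsets(radius)
    q = fixed.reshape(num_heads, d, H, W)    # queries from the fixed image
    k = moving.reshape(num_heads, d, H, W)   # keys from the moving image
    k_pad = np.pad(k, ((0, 0), (0, 0), (radius, radius), (radius, radius)))
    scores = []
    for dy, dx in offsets.astype(int):
        # Key features located at offset (dy, dx) from every query position.
        shifted = k_pad[:, :, radius + dy:radius + dy + H,
                        radius + dx:radius + dx + W]
        scores.append((q * shifted).sum(axis=1) / np.sqrt(d))
    attn = softmax(np.stack(scores, axis=-1), axis=-1)  # (heads, H, W, K)
    # Attention-weighted average of the offsets: the "multi-coordinate" view.
    return attn @ offsets                               # (heads, H, W, 2)

def fuse_subfields(subfields, logits):
    """Assumed stand-in for the CWM: per-voxel softmax weights across heads."""
    w = softmax(logits, axis=0)                         # (heads, H, W)
    return (w[..., None] * subfields).sum(axis=0)       # (H, W, 2)

rng = np.random.default_rng(0)
fixed_feat = rng.normal(size=(8, 16, 16))
moving_feat = rng.normal(size=(8, 16, 16))
sub = motion_subfields(fixed_feat, moving_feat, num_heads=4, radius=1)
# Hypothetical fusion logits; in a trained network these would be learned.
flow = fuse_subfields(sub, rng.normal(size=(4, 16, 16)))
print(sub.shape, flow.shape)
```

Because each sub-field is a convex combination of the window offsets, every displacement component is bounded by `radius`. Handling larger motion would need some additional mechanism, which this sketch does not attempt to reproduce.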
Taxonomy
Topics: Medical Imaging and Analysis · Medical Image Segmentation Techniques · Advanced Neuroimaging Techniques and Applications
Methods: Multi-Head Attention · Attention Is All You Need · Dropout · Label Smoothing · Dense Connections · Linear Layer · Position-Wise Feed-Forward Layer · Layer Normalization · Byte Pair Encoding · Absolute Position Encodings
