TL;DR
UTSRMorph is an unsupervised medical image registration network that combines ConvNet and Transformer features in its encoder and uses superresolution modules in its decoder to produce detailed displacement fields. The authors report higher registration accuracy than state-of-the-art methods.
Contribution
The paper introduces a fusion attention block and an overlapping cross-attention method, integrating ConvNets and Transformers for improved feature learning in registration tasks.
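To make the idea of overlapping cross-attention concrete, here is a minimal 1D sketch in plain Python. It is an illustration under assumptions, not the paper's implementation: the summary above does not give the window geometry, so the function names, the `window` and `overlap` parameters, and the choice to widen only the key/value windows are all hypothetical. Queries come from non-overlapping windows, while each key/value window is extended into its neighbours so that information can cross window borders.

```python
def query_windows(n_tokens, window):
    """Non-overlapping query windows: every token belongs to exactly one window."""
    return [list(range(s, min(s + window, n_tokens)))
            for s in range(0, n_tokens, window)]


def overlapping_key_windows(n_tokens, window, overlap):
    """Key/value windows widened by `overlap` tokens per side, clipped at the borders.

    The widening rule is an assumption made for illustration only.
    """
    wins = []
    for s in range(0, n_tokens, window):
        lo = max(0, s - overlap)
        hi = min(n_tokens, s + window + overlap)
        wins.append(list(range(lo, hi)))
    return wins


def attention_pairs(q_wins, k_wins):
    """Number of query-key scores computed when window i attends to key window i."""
    return sum(len(q) * len(k) for q, k in zip(q_wins, k_wins))


n, w = 8, 4
q = query_windows(n, w)
k = overlapping_key_windows(n, w, overlap=2)

global_pairs = n * n                   # every token attends to every token
windowed_pairs = attention_pairs(q, q)  # plain windowed attention, no overlap
overlap_pairs = attention_pairs(q, k)   # queries see a wider key/value window
print(global_pairs, windowed_pairs, overlap_pairs)
```

In this toy setting the overlapping variant computes more scores than plain windowed attention but fewer than global attention, which is the trade-off between cross-window context and redundancy that the contribution is aimed at.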
Findings
Achieves better registration accuracy than state-of-the-art methods.
Effectively captures global dependencies with reduced redundancy.
Produces high-resolution displacement fields with superresolution modules.
Abstract
Complicated image registration is a key issue in medical image analysis, and deep learning-based methods have achieved better results than traditional methods. The methods include ConvNet-based and Transformer-based methods. Although ConvNets can effectively utilize local information to reduce redundancy via small neighborhood convolution, the limited receptive field results in the inability to capture global dependencies. Transformers can establish long-distance dependencies via a self-attention mechanism; however, the intense calculation of the relationships among all tokens leads to high redundancy. We propose a novel unsupervised image registration method named the unified Transformer and superresolution (UTSRMorph) network, which can enhance feature representation learning in the encoder and generate detailed displacement fields in the decoder to overcome these problems. We first…
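For readers new to unsupervised registration, the following sketch shows what a displacement field does and why no ground-truth labels are needed: the moving image is warped by the field, and the training signal is simply how well the warped result matches the fixed image. This is a generic 2D illustration in plain Python, not UTSRMorph itself. The helper names (`bilinear`, `warp`, `mse`) are hypothetical, the mean squared error is an assumed stand-in for whatever similarity loss the paper uses, and the displacement field is written by hand rather than predicted by a network.

```python
def bilinear(img, y, x):
    """Sample a 2D list-of-lists image at real-valued (y, x), clamping to the border."""
    h, w = len(img), len(img[0])
    y = min(max(float(y), 0.0), h - 1.0)
    x = min(max(float(x), 0.0), w - 1.0)
    y0, x0 = int(y), int(x)
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = img[y0][x0] * (1 - dx) + img[y0][x1] * dx
    bot = img[y1][x0] * (1 - dx) + img[y1][x1] * dx
    return top * (1 - dy) + bot * dy


def warp(moving, disp):
    """warped[y][x] = moving sampled at (y + disp[y][x][0], x + disp[y][x][1])."""
    h, w = len(moving), len(moving[0])
    return [[bilinear(moving, y + disp[y][x][0], x + disp[y][x][1])
             for x in range(w)] for y in range(h)]


def mse(a, b):
    """Mean squared intensity difference (an assumed similarity loss)."""
    n = len(a) * len(a[0])
    return sum((pa - pb) ** 2 for ra, rb in zip(a, b) for pa, pb in zip(ra, rb)) / n


size = 4
fixed = [[0.0] * size for _ in range(size)]
moving = [[0.0] * size for _ in range(size)]
fixed[1][1] = 1.0   # bright pixel in the fixed image
moving[1][2] = 1.0  # same structure, shifted one pixel to the right

# Hand-written field: every output pixel samples one pixel to its right.
disp = [[(0.0, 1.0) for _ in range(size)] for _ in range(size)]
warped = warp(moving, disp)

loss_before = mse(moving, fixed)
loss_after = mse(warped, fixed)
print(loss_before, loss_after)
```

In a learned method the field would be the network output, and a smoothness penalty on the field is commonly added to the similarity term; neither is shown here.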
Taxonomy
Methods: Attention Is All You Need · Linear Layer · Dense Connections · Label Smoothing · Byte Pair Encoding · Layer Normalization · Residual Connection · Sigmoid Activation · Average Pooling · Multi-Head Attention
