CiT-Net: Convolutional Neural Networks Hand in Hand with Vision Transformers for Medical Image Segmentation
Tao Lei, Rui Sun, Xuan Wang, Yingbo Wang, Xi He, Asoke Nandi

TL;DR
CiT-Net is a hybrid model that combines CNNs with vision Transformers. It features a dynamic deformable convolution and a shifted-window attention module, and it reports higher medical image segmentation accuracy than prior methods with fewer parameters and lower computational cost.
Contribution
The paper introduces a hybrid CNN-Transformer architecture with dynamic deformable convolution and a shifted-window attention module, enhancing feature learning and segmentation accuracy without pre-training.
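To make the "shifted-window" idea concrete, here is a minimal sketch of the general mechanism as popularized by window-based vision Transformers. It is an illustration, not the authors' module: the helper names `cyclic_shift` and `window_partition`, the 4x4 grid, and the window size of 2 are all assumptions. Attention is computed inside each window; shifting the grid before partitioning lets the next layer's windows straddle the previous layer's window borders.

```python
def cyclic_shift(grid, shift):
    """Roll a 2-D grid up and left by `shift` cells, wrapping around the edges."""
    h, w = len(grid), len(grid[0])
    return [[grid[(r + shift) % h][(c + shift) % w] for c in range(w)]
            for r in range(h)]


def window_partition(grid, window):
    """Split a 2-D grid into non-overlapping window x window blocks (row-major)."""
    h, w = len(grid), len(grid[0])
    assert h % window == 0 and w % window == 0, "grid must tile evenly"
    blocks = []
    for top in range(0, h, window):
        for left in range(0, w, window):
            blocks.append([grid[r][left:left + window]
                           for r in range(top, top + window)])
    return blocks


# Hypothetical 4x4 feature map whose entries are just their flat index.
feature_map = [[r * 4 + c for c in range(4)] for r in range(4)]

regular = window_partition(feature_map, window=2)
shifted = window_partition(cyclic_shift(feature_map, shift=1), window=2)

print(regular[0])  # [[0, 1], [4, 5]]
print(shifted[0])  # [[5, 6], [9, 10]]
```

In this toy case the first shifted window holds cells 5, 6, 9 and 10, one from each of the four regular windows, which is how information crosses window borders between layers. A real implementation would also mask the wrapped-around cells during attention; that step is omitted here.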
Findings
Outperforms state-of-the-art methods in medical image segmentation
Requires fewer parameters and less computational cost
Does not rely on pre-training
Abstract
The hybrid architecture of convolutional neural networks (CNNs) and Transformers is very popular for medical image segmentation. However, it suffers from two challenges. First, although a CNN branch can capture local image features using vanilla convolution, it cannot achieve adaptive feature learning. Second, although a Transformer branch can capture global features, it ignores channel and cross-dimensional self-attention, resulting in low segmentation accuracy on images with complex content. To address these challenges, we propose a novel hybrid architecture of convolutional neural networks hand in hand with vision Transformers (CiT-Net) for medical image segmentation. Our network has two advantages. First, we design a dynamic deformable convolution and apply it to the CNN branch, which overcomes the weak feature extraction ability due to fixed-size convolution kernels and…
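As a rough illustration of why deformable sampling relaxes the fixed kernel grid, the sketch below computes one output value of a 3x3 deformable convolution in plain Python. It covers only the generic deformable part: each kernel tap reads the input at its regular grid position plus a fractional offset, using bilinear interpolation. The offsets are passed in by hand here, whereas a network would predict them, and the "dynamic" aspect of the paper's operator is not modelled. The function names, the 5x5 input, and the all-ones kernel are assumptions made for the example.

```python
import math


def bilinear_sample(image, y, x):
    """Sample a 2-D image at fractional (y, x); positions outside count as 0."""
    h, w = len(image), len(image[0])
    y0, x0 = math.floor(y), math.floor(x)
    value = 0.0
    for yy in (y0, y0 + 1):
        for xx in (x0, x0 + 1):
            if 0 <= yy < h and 0 <= xx < w:
                # Weight falls off linearly with distance to the grid point.
                weight = (1 - abs(y - yy)) * (1 - abs(x - xx))
                value += weight * image[yy][xx]
    return value


def deformable_conv_at(image, kernel, offsets, cy, cx):
    """One output value of a 3x3 deformable convolution centred at (cy, cx).

    offsets[i][j] is a (dy, dx) pair added to the regular position of tap (i, j).
    """
    total = 0.0
    for i in range(3):
        for j in range(3):
            dy, dx = offsets[i][j]
            total += kernel[i][j] * bilinear_sample(
                image, cy + i - 1 + dy, cx + j - 1 + dx)
    return total


# Hypothetical inputs: a 5x5 ramp image and an all-ones 3x3 kernel.
image = [[float(r * 5 + c) for c in range(5)] for r in range(5)]
kernel = [[1.0] * 3 for _ in range(3)]

zero_offsets = [[(0.0, 0.0)] * 3 for _ in range(3)]
down_half = [[(0.5, 0.0)] * 3 for _ in range(3)]  # every tap moved half a row down

vanilla = deformable_conv_at(image, kernel, zero_offsets, 2, 2)
deformed = deformable_conv_at(image, kernel, down_half, 2, 2)
```

With zero offsets the function reduces to an ordinary 3x3 convolution, so `vanilla` is the plain sum of the neighbourhood around the centre pixel. Non-zero offsets move the sampling points off the integer grid, which is what lets the receptive field adapt to the image content.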
Taxonomy
Topics: Advanced Neural Network Applications · COVID-19 diagnosis using AI · Medical Image Segmentation Techniques
Methods: Multi-Head Attention · Attention Is All You Need · Residual Connection · Linear Layer · Dropout · Label Smoothing · Adam · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Layer Normalization
