SwinCross: Cross-modal Swin Transformer for Head-and-Neck Tumor Segmentation in PET/CT Images
Gary Y. Li, Junyu Chen, Se-In Jang, Kuang Gong, and Quanzheng Li

TL;DR
SwinCross is a cross-modal Swin Transformer that combines PET and CT images for head-and-neck tumor segmentation. On the HECKTOR 2021 dataset it outperforms nnU-Net and other transformer-based methods.
Contribution
The paper introduces SwinCross, a cross-modal Swin Transformer with a new attention module for enhanced multi-resolution feature extraction in medical image segmentation.
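The summary above does not spell out how the new attention module is built, so the following is only a generic sketch of cross-modal attention, not the authors' implementation. It assumes each modality has already been turned into a list of token feature vectors, and it omits learned query/key/value projections, multiple heads, and Swin-style windowing. The names `cross_attention`, `pet_tokens`, and `ct_tokens` are hypothetical.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(query_tokens, context_tokens):
    """Scaled dot-product attention with queries from one modality and
    keys/values from the other.

    Assumption: tokens are used directly as queries, keys, and values;
    a real module would apply learned projections first.
    """
    dim = len(query_tokens[0])
    scale = 1.0 / math.sqrt(dim)
    attended, all_weights = [], []
    for q in query_tokens:
        # Similarity of this query token to every token of the other modality.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k))
                  for k in context_tokens]
        weights = softmax(scores)
        # Weighted average of the other modality's tokens.
        out = [sum(w * v[j] for w, v in zip(weights, context_tokens))
               for j in range(dim)]
        attended.append(out)
        all_weights.append(weights)
    return attended, all_weights

# Toy token features (made-up numbers, not real image data).
pet_tokens = [[1.0, 0.0], [0.0, 1.0]]
ct_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Each modality attends to the other.
pet_from_ct, pet_weights = cross_attention(pet_tokens, ct_tokens)
ct_from_pet, ct_weights = cross_attention(ct_tokens, pet_tokens)
```

Running the attention in both directions gives each modality a view of the other, which is one common way to let PET and CT features inform one another before decoding.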
Findings
Outperforms nnU-Net and other transformer-based methods on the HECKTOR 2021 dataset.
Effectively captures inter-modality features between PET and CT.
Demonstrates superior segmentation accuracy for head-and-neck tumors.
Abstract
Radiotherapy (RT) combined with cetuximab is the standard treatment for patients with inoperable head and neck cancers. Segmentation of head and neck (H&N) tumors is a prerequisite for radiotherapy planning but a time-consuming process. In recent years, deep convolutional neural networks (DCNNs) have become the de facto standard for automated image segmentation. However, because enlarging the field of view in DCNNs is computationally expensive, their ability to model long-range dependencies is still limited, and this can result in sub-optimal segmentation performance for objects whose background context spans long distances. On the other hand, Transformer models have demonstrated excellent capabilities in capturing such long-range information in several semantic segmentation tasks performed on medical images. Inspired by the recent success of Vision Transformers and…
Taxonomy
Topics: Head and Neck Cancer Studies · Radiomics and Machine Learning in Medical Imaging · Advanced Radiotherapy Techniques
Methods: Attention Is All You Need · Concatenated Skip Connection · Max Pooling · U-Net · Linear Layer · Layer Normalization · Stochastic Depth · Multi-Head Attention · Position-Wise Feed-Forward Layer
