Investigation of Network Architecture for Multimodal Head-and-Neck Tumor Segmentation
Ye Li, Junyu Chen, Se-in Jang, Kuang Gong, Quanzheng Li

TL;DR
This study compares Transformer-based and convolution-based networks for multimodal head-and-neck tumor segmentation, finding that Transformers excel with large structures while U-Net performs better with small tumors and limited data.
Contribution
The paper provides a comparative analysis of two Transformer-based architectures and the nnU-Net for head-and-neck tumor segmentation, highlighting their respective strengths and limitations.
Findings
Transformers are beneficial for large structures and extensive fields of view.
U-Net performs well on small tumors with limited training data.
Convolutional networks remain competitive in resource-constrained scenarios.
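One rough way to see why long-range modeling matters mainly for large structures is to estimate how far a convolutional encoder can "see". The sketch below computes the receptive field along one axis for a stack of convolution and downsampling layers. The layer configuration is a hypothetical U-Net-style encoder chosen for illustration and is not taken from the paper; `receptive_field` is likewise an illustrative helper, not part of any library.

```python
def receptive_field(layers):
    """Receptive field (in voxels, along one axis) of a stack of layers.

    layers: list of (kernel_size, stride) tuples, applied in order.
    """
    rf = 1    # a single output voxel sees one input voxel before any layer
    jump = 1  # spacing, in input voxels, between adjacent feature-map positions
    for kernel, stride in layers:
        rf += (kernel - 1) * jump  # each layer widens the view by (k - 1) * jump
        jump *= stride             # striding spreads later positions further apart
    return rf


# Hypothetical encoder (assumption, not the paper's network):
# four stages of two 3-wide convolutions followed by a stride-2 downsampling.
encoder = [(3, 1), (3, 1), (2, 2)] * 4

print(receptive_field(encoder))
```

A self-attention layer, by contrast, relates every token to every other token in a single step, so its reach is the whole input. When the target is small relative to the encoder's receptive field, as with many head-and-neck tumors, that extra reach may add little, which is consistent with the findings above.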
Abstract
Inspired by the recent success of Transformers in Natural Language Processing and of the Vision Transformer in Computer Vision, many researchers in the medical imaging community have flocked to Transformer-based networks for various mainstream medical tasks such as classification, segmentation, and estimation. In this study, we analyze two recently published Transformer-based network architectures for the task of multimodal head-and-neck tumor segmentation and compare their performance to the de facto standard 3D segmentation network, the nnU-Net. Our results showed that modeling long-range dependencies may be helpful in cases where large structures are present and/or a large field of view is needed. However, for small structures such as head-and-neck tumors, the convolution-based U-Net architecture seemed to perform well, especially when the training dataset is small and computational resource is…
Taxonomy
Topics: AI in cancer detection · Head and Neck Cancer Studies · Radiomics and Machine Learning in Medical Imaging
Methods: Multi-Head Attention · Attention Is All You Need · Max Pooling · Concatenated Skip Connection · Linear Layer · Convolution · Byte Pair Encoding · U-Net · Dense Connections
