PHTrans: Parallelly Aggregating Global and Local Representations for Medical Image Segmentation
Wentao Liu, Tong Tian, Weijin Xu, Huihua Yang, Xipeng Pan, Songlin Yan, Lemeng Wang

TL;DR
PHTrans introduces a parallel hybrid architecture combining CNNs and Transformers for medical image segmentation, effectively capturing global and local features to improve accuracy over existing methods.
Contribution
The paper proposes a novel parallel hybrid architecture, PHTrans, that simultaneously learns and aggregates global and local features in medical image segmentation.
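To make the "parallel" idea concrete, the toy sketch below runs a local branch (a small 1D convolution) and a global branch (single-head self-attention) on the same token sequence and then merges the two outputs. It is an illustration only, not the authors' implementation: the function names, the fixed smoothing kernel, the identity query/key/value projections, and the plain element-wise sum used for aggregation are all assumptions made here for brevity.

```python
import math


def local_branch(tokens, kernel=(0.25, 0.5, 0.25)):
    """Local features: 1D convolution over neighbouring tokens (zero padding).

    The fixed kernel is an assumption; a real CNN branch would learn its weights.
    """
    n, d = len(tokens), len(tokens[0])
    half = len(kernel) // 2
    out = []
    for i in range(n):
        row = [0.0] * d
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < n:  # positions outside the sequence count as zeros
                for c in range(d):
                    row[c] += w * tokens[j][c]
        out.append(row)
    return out


def global_branch(tokens):
    """Global features: single-head self-attention with identity projections.

    Every token attends to every other token, so each output row mixes
    information from the whole sequence.
    """
    d = len(tokens[0])
    scale = 1.0 / math.sqrt(d)
    out = []
    for q in tokens:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in tokens]
        m = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        out.append(
            [sum(w / total * v[c] for w, v in zip(weights, tokens)) for c in range(d)]
        )
    return out


def parallel_hybrid_block(tokens):
    """Feed the same input to both branches, then aggregate.

    Aggregation here is a plain element-wise sum (an assumption of this sketch).
    """
    local = local_branch(tokens)
    glob = global_branch(tokens)
    return [[l + g for l, g in zip(lr, gr)] for lr, gr in zip(local, glob)]


tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
fused = parallel_hybrid_block(tokens)
```

The point of the sketch is the data flow: both branches see the identical input and neither is nested inside the other, in contrast to designs that embed a Transformer module at one stage of a CNN.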
Findings
Consistently outperforms state-of-the-art methods on multiple datasets.
Effective integration of global and local features improves segmentation accuracy.
Validated on the Multi-Atlas Labeling Beyond the Cranial Vault (BCV) and Automated Cardiac Diagnosis Challenge (ACDC) datasets.
Abstract
The success of Transformer in computer vision has attracted increasing attention in the medical imaging community. Especially for medical image segmentation, many excellent hybrid architectures based on convolutional neural networks (CNNs) and Transformer have been presented and have achieved impressive performance. However, most of these methods, which embed modular Transformer into CNNs, struggle to reach their full potential. In this paper, we propose a novel hybrid architecture for medical image segmentation called PHTrans, which hybridizes Transformer and CNN in parallel in its main building blocks to produce hierarchical representations from global and local features and adaptively aggregate them, aiming to fully exploit their strengths to obtain better segmentation performance. Specifically, PHTrans follows the U-shaped encoder-decoder design and introduces the parallel hybrid module in…
Taxonomy
Topics: Advanced Neural Network Applications · Medical Imaging and Analysis · Radiomics and Machine Learning in Medical Imaging
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Adam · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Label Smoothing · Dropout · Softmax · Byte Pair Encoding
