GPAFormer: Graph-guided Patch Aggregation Transformer for Efficient 3D Medical Image Segmentation
Chung-Ming Lo, I-Yun Liu, Wei-Yang Lin

TL;DR
GPAFormer is a lightweight 3D medical image segmentation network that balances high accuracy with computational efficiency, making it suitable for resource-limited clinical settings.
Contribution
The paper introduces GPAFormer, a novel architecture with multi-scale attention and graph-guided patch aggregation modules for efficient multi-organ segmentation.
Findings
GPAFormer achieved the highest DSC scores on multiple datasets with only 1.81 M parameters.
Inference time per case was less than one second on a consumer GPU.
The model balanced accuracy and efficiency across various clinical scenarios.
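The findings are reported as DSC scores, which in segmentation work presumably refers to the Dice similarity coefficient, DSC = 2|A ∩ B| / (|A| + |B|), where A is the predicted mask and B is the reference mask. The sketch below is a minimal illustration of that metric on flattened binary masks. The function name `dice_coefficient` and the convention of returning 1.0 when both masks are empty are assumptions made for this example, not details taken from the paper.

```python
def dice_coefficient(pred, target):
    """Dice similarity coefficient between two binary masks.

    Both masks are flat sequences of 0/1 values of equal length
    (for example, a flattened 3D volume). Hypothetical helper for
    illustration only.
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")

    # Voxels labeled foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total foreground voxels across both masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)

    if total == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return 2.0 * intersection / total


if __name__ == "__main__":
    prediction = [1, 1, 0, 0]
    reference = [1, 0, 0, 0]
    print(dice_coefficient(prediction, reference))
```

In multi-organ segmentation the metric is usually computed per organ label and then averaged; whether the paper averages per case or per dataset is not stated in this summary.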
Abstract
Deep learning has been widely applied to 3D medical image segmentation tasks. However, due to the diversity of imaging modalities, the high-dimensional nature of the data, and the heterogeneity of anatomical structures, achieving both segmentation accuracy and computational efficiency in multi-organ segmentation remains a challenge. This study proposed GPAFormer, a lightweight network architecture specifically designed for 3D medical image segmentation, emphasizing efficiency while maintaining high accuracy. GPAFormer incorporated two core modules: the multi-scale attention-guided stacked aggregation (MASA) and the mutual-aware patch graph aggregator (MPGA). MASA utilized three parallel paths with different receptive fields, combined through planar aggregation, to enhance the network's capability in handling structures of varying sizes. MPGA employed a graph-guided approach to dynamically…
