TL;DR
HNOSeg-XS introduces a resolution-robust, efficient neural operator for 3D medical image segmentation, outperforming CNNs and transformers in speed, memory, and parameter efficiency across multiple datasets.
Contribution
The paper presents HNOSeg-XS, a novel neural operator model that achieves resolution robustness and efficiency by reformulating segmentation in the frequency domain using Hartley transforms.
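For readers unfamiliar with the Hartley transform mentioned above, the following is a minimal sketch of the discrete Hartley transform (DHT) computed from NumPy's FFT. It is not taken from the paper's code; the function names `dht` and `idht` are illustrative assumptions. The sketch relies only on the standard identity that, for a real input, the DHT equals the real part minus the imaginary part of the discrete Fourier transform, so its output stays real-valued.

```python
import numpy as np

def dht(x):
    """Discrete Hartley transform over all axes of a real array.

    Uses the identity DHT(x) = Re(FFT(x)) - Im(FFT(x)), which follows from
    the Hartley kernel cas(t) = cos(t) + sin(t).
    (Illustrative helper, not the paper's implementation.)
    """
    f = np.fft.fftn(x)
    return f.real - f.imag

def idht(h):
    """Inverse DHT: the transform is its own inverse up to a 1/N scaling."""
    return dht(h) / h.size

# Usage: a small random 3D volume round-trips through the transform.
volume = np.random.default_rng(0).standard_normal((4, 6, 8))
recovered = idht(dht(volume))
```

Because the transform maps real inputs to real outputs, any learned weights applied in this domain can also stay real, which is one plausible reason a Hartley-based formulation can save memory compared with a complex-valued Fourier one.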
Findings
Outperforms CNNs and transformers in inference speed and memory usage.
Uses fewer than 35,000 parameters, demonstrating high efficiency.
Achieves superior resolution robustness across multiple datasets.
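The resolution-robustness finding can be illustrated with a toy one-dimensional example. The sketch below is an assumption-laden illustration of the general neural-operator idea, not the HNOSeg-XS architecture: a hypothetical `spectral_layer` weights only a fixed number of low-frequency Hartley coefficients, so the same weights can be applied to the same signal sampled at different resolutions.

```python
import numpy as np

def dht1(x):
    """1D discrete Hartley transform via the FFT: Re(F) - Im(F)."""
    f = np.fft.fft(x)
    return f.real - f.imag

def spectral_layer(x, weights):
    """Weight the 2K+1 lowest-frequency Hartley modes and discard the rest.

    `weights` holds one value per mode -K..K, so its size does not depend
    on the number of samples in `x`. (Hypothetical layer for illustration.)
    """
    n = x.size
    k = (weights.size - 1) // 2
    h = dht1(x) / n                    # normalized, resolution-independent coefficients
    idx = np.arange(-k, k + 1) % n     # array positions of modes -K..K
    out = np.zeros(n)
    out[idx] = weights * h[idx]
    return dht1(out)                   # DHT is self-inverse up to the 1/n applied above

def signal(n):
    """Band-limited test signal sampled at n points on [0, 1)."""
    t = np.arange(n) / n
    return np.sin(2 * np.pi * t) + 0.5 * np.cos(2 * np.pi * 3 * t)

weights = np.random.default_rng(0).standard_normal(9)  # K = 4, shared across resolutions
y_coarse = spectral_layer(signal(32), weights)
y_fine = spectral_layer(signal(64), weights)
# y_fine[::2] and y_coarse agree at the shared sample locations.
```

The agreement at shared sample points holds here because the test signal is band-limited below the retained modes; real images are not, so this only conveys the intuition behind training at one resolution and testing at another.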
Abstract
In medical image segmentation, convolutional neural networks (CNNs) and transformers are dominant. For CNNs, given the local receptive fields of convolutional layers, long-range spatial correlations are captured through consecutive convolutions and pooling. However, as the computational cost and memory footprint can be prohibitively large, 3D models can only afford fewer layers than 2D models with reduced receptive fields and abstract levels. For transformers, although long-range correlations can be captured by multi-head attention, its quadratic complexity with respect to input size is computationally demanding. Therefore, either model may require input size reduction to allow more filters and layers for better segmentation. Nevertheless, given their discrete nature, models trained with patch-wise training or image downsampling may produce suboptimal results when applied on higher…
