TL;DR
UNETVL is a 3D medical image segmentation architecture that combines Vision-LSTM (ViL) with a Chebyshev KAN to model long-range dependencies at lower computational cost. It reports higher mean Dice scores than recent state-of-the-art methods on the ACDC and AMOS2022 benchmarks.
Contribution
The paper introduces UNETVL, integrating Vision-LSTM and Chebyshev KAN, to enhance 3D medical image segmentation with better scalability and accuracy.
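To make the Chebyshev KAN component more concrete, here is a minimal sketch of a single Chebyshev-polynomial KAN layer in plain Python. It is an illustration under stated assumptions, not the paper's implementation: the function names, the `coeffs[i][j][k]` layout, and the `tanh` squashing of inputs into [-1, 1] (a common choice in Chebyshev KAN implementations) are all assumptions, and the paper's layer may differ.

```python
import math


def chebyshev_basis(x, degree):
    """Return [T_0(x), ..., T_degree(x)] using T_k = 2x*T_{k-1} - T_{k-2}."""
    values = [1.0, x]
    for _ in range(2, degree + 1):
        values.append(2.0 * x * values[-1] - values[-2])
    return values[: degree + 1]


def cheby_kan_layer(inputs, coeffs):
    """Hypothetical Chebyshev KAN layer.

    coeffs[i][j][k] is the learnable weight of T_k on the edge from
    input i to output j. Each edge applies its own 1-D function,
    expressed as a Chebyshev expansion, and outputs sum over inputs.
    """
    n_out = len(coeffs[0])
    degree = len(coeffs[0][0]) - 1
    outputs = [0.0] * n_out
    for i, x in enumerate(inputs):
        # Assumption: squash the input into [-1, 1], where the
        # Chebyshev polynomials are bounded and well behaved.
        basis = chebyshev_basis(math.tanh(x), degree)
        for j in range(n_out):
            outputs[j] += sum(c * t for c, t in zip(coeffs[i][j], basis))
    return outputs


# One input, one output, degree-2 expansion: 1*T_0 + 2*T_1 + 3*T_2.
example_output = cheby_kan_layer([0.0], [[[1.0, 2.0, 3.0]]])
```

Unlike a linear layer, which learns one scalar weight per edge, this layer learns a small set of polynomial coefficients per edge, so each connection can represent a nonlinear function of its input.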
Findings
Mean Dice score 7.3% higher than UNETR on the ACDC dataset.
Mean Dice score 15.6% higher than UNETR on the AMOS2022 dataset.
Ablation studies demonstrate the effectiveness of each component.
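For readers unfamiliar with the metric, the following sketch shows how a per-class Dice score and a mean Dice score are typically computed from predicted and reference label maps. It uses flat lists of integer labels for simplicity; the function names and the convention of returning 1.0 when a class is absent from both maps are assumptions, and the paper's evaluation protocol may differ.

```python
def dice_score(pred, target, label):
    """Dice coefficient 2|A∩B| / (|A| + |B|) for one class label."""
    pred_mask = [p == label for p in pred]
    target_mask = [t == label for t in target]
    intersection = sum(p and t for p, t in zip(pred_mask, target_mask))
    total = sum(pred_mask) + sum(target_mask)
    if total == 0:
        # Assumed convention: class absent in both maps counts as a match.
        return 1.0
    return 2.0 * intersection / total


def mean_dice(pred, target, labels):
    """Average the per-class Dice scores over the given foreground labels."""
    return sum(dice_score(pred, target, c) for c in labels) / len(labels)


# Toy flattened label maps with background 0 and two foreground classes.
pred = [0, 1, 1, 2, 2, 2, 0, 0]
target = [0, 1, 1, 1, 2, 2, 0, 2]
score = mean_dice(pred, target, labels=[1, 2])
```

In this toy case class 1 scores 0.8 and class 2 scores 2/3, so the mean Dice is about 0.733.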
Abstract
3D medical image segmentation has progressed considerably due to Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs), yet these methods struggle to balance long-range dependency acquisition with computational efficiency. To address this challenge, we propose UNETVL (U-Net Vision-LSTM), a novel architecture that leverages recent advancements in temporal information processing. UNETVL incorporates Vision-LSTM (ViL) for improved scalability and memory functions, alongside efficient Chebyshev Kolmogorov-Arnold Networks (KAN) to handle complex and long-range dependency patterns more effectively. We validated our method on the ACDC and AMOS2022 (post-challenge Task 2) benchmark datasets, showing a significant improvement in mean Dice score compared to recent state-of-the-art approaches, especially over its predecessor, UNETR, with increases of 7.3% on ACDC and 15.6% on…
Taxonomy
Methods: Max Pooling · Dense Connections · Batch Normalization · Concatenated Skip Connection · Residual Connection · Softmax · Linear Layer · Attention Is All You Need · U-Net
