SegTransVAE: Hybrid CNN -- Transformer with Regularization for medical image segmentation
Quan-Dung Pham (1), Hai Nguyen-Truong (1, 2, 3), Nam Nguyen Phuong (1), Khoa N. A. Nguyen (1, 2, 3) ((1) VinBrain JSC., Vietnam, (2) University of Science, Ho Chi Minh City, Vietnam, (3) Vietnam National University, Ho Chi Minh City, Vietnam)

TL;DR
SegTransVAE is a hybrid deep learning model that combines a CNN, a transformer, and a VAE for medical image segmentation. It reports better Dice Score and 95% Hausdorff Distance than previous methods, with inference time comparable to a simple CNN-based network.
Contribution
This paper introduces what the authors describe as, to the best of their knowledge, the first method integrating a CNN, a transformer, and a VAE for medical image segmentation, aiming to learn both global semantic and local contextual information.
Findings
Outperforms previous methods in Dice Score
Achieves better 95% Hausdorff Distance
Maintains inference time comparable to CNNs
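The two metrics named in these findings can be sketched as follows. This is a minimal NumPy illustration, not the paper's evaluation code: the function names are hypothetical, distances are in voxel units (no physical spacing), boundaries are extracted with a simple face-neighbor rule, and the 95% Hausdorff Distance is taken as the larger of the two directed 95th-percentile distances, which is one of several conventions in use. A higher Dice Score is better; a lower Hausdorff Distance is better.

```python
import numpy as np


def dice_score(pred, target, eps=1e-8):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)


def surface_voxels(mask):
    """Coordinates of foreground voxels with at least one background face-neighbor."""
    mask = np.asarray(mask, dtype=bool)
    padded = np.pad(mask, 1, constant_values=False)
    crop = tuple(slice(1, -1) for _ in range(mask.ndim))
    interior = np.ones_like(mask)
    for axis in range(mask.ndim):
        for shift in (-1, 1):
            # After cropping, this is the neighbor mask along `axis`.
            interior &= np.roll(padded, shift, axis=axis)[crop]
    return np.argwhere(mask & ~interior).astype(float)


def hausdorff_95(pred, target):
    """Symmetric 95th-percentile Hausdorff distance between mask boundaries."""
    a = surface_voxels(pred)
    b = surface_voxels(target)
    if len(a) == 0 or len(b) == 0:
        return float("inf")  # assumption: undefined case reported as infinity
    # Brute-force pairwise distances; fine for small masks only.
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    a_to_b = d.min(axis=1)
    b_to_a = d.min(axis=0)
    return float(max(np.percentile(a_to_b, 95), np.percentile(b_to_a, 95)))
```

For real volumes, the brute-force distance matrix would be too large; a distance-transform-based implementation is the usual choice there.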
Abstract
Current research on deep learning for medical image segmentation exposes the limitations of existing models in learning either global semantic information or local contextual information. To tackle these issues, this paper proposes a novel network named SegTransVAE. SegTransVAE is built upon an encoder-decoder architecture that uses a transformer and adds a variational autoencoder (VAE) branch to the network to reconstruct the input images jointly with segmentation. To the best of our knowledge, this is the first method combining the success of CNN, transformer, and VAE. Evaluation on various recently introduced datasets shows that SegTransVAE outperforms previous methods in Dice Score and 95%-Hausdorff Distance while having inference time comparable to a simple CNN-based network. The source code is available at: https://github.com/itruonghai/SegTransVAE.
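The abstract says the VAE branch reconstructs the input jointly with segmentation but does not state the training objective. One common way to express such a regularized objective is a segmentation loss plus a weighted reconstruction term and a weighted KL term. The sketch below illustrates that general pattern in NumPy; the function names, the soft Dice and mean-squared-error choices, and the weights `w_rec` and `w_kl` are all assumptions for illustration and are not taken from the paper or its repository.

```python
import numpy as np


def soft_dice_loss(prob, target, eps=1e-8):
    """1 - soft Dice between predicted probabilities and a binary target."""
    prob = np.asarray(prob, dtype=float)
    target = np.asarray(target, dtype=float)
    intersection = (prob * target).sum()
    return 1.0 - (2.0 * intersection + eps) / (prob.sum() + target.sum() + eps)


def kl_to_standard_normal(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ) for a diagonal Gaussian."""
    mu = np.asarray(mu, dtype=float)
    logvar = np.asarray(logvar, dtype=float)
    return 0.5 * float(np.sum(mu ** 2 + np.exp(logvar) - logvar - 1.0))


def joint_loss(prob, target, recon, image, mu, logvar, w_rec=0.1, w_kl=0.1):
    """Segmentation loss plus VAE regularization (weights are assumed values)."""
    seg = soft_dice_loss(prob, target)
    # Reconstruction term: mean squared error between VAE output and input image.
    rec = float(np.mean((np.asarray(recon, dtype=float)
                         - np.asarray(image, dtype=float)) ** 2))
    kl = kl_to_standard_normal(mu, logvar)
    return seg + w_rec * rec + w_kl * kl
```

In this formulation the reconstruction and KL terms act only as a regularizer during training; at inference time only the segmentation output would be needed.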
Taxonomy
Topics: AI in cancer detection · Radiomics and Machine Learning in Medical Imaging · Medical Imaging and Analysis
