SW-ViT: A Spatio-Temporal Vision Transformer Network with Post Denoiser for Sequential Multi-Push Ultrasound Shear Wave Elastography
Ahsan Habib Akash, MD Jahin Alam, Md. Kamrul Hasan

TL;DR
SW-ViT introduces a two-stage deep learning framework combining spatio-temporal transformers and denoising networks to improve ultrasound shear wave elastography accuracy, noise robustness, and segmentation capability.
Contribution
The paper presents a novel two-stage deep learning approach integrating CNN and transformer architectures for SWE, addressing noise, data scarcity, and segmentation simultaneously.
Findings
Achieves high PSNR, CNR, and SSIM on simulated data.
Attains high IoU and low ASSD in phantom segmentation.
Demonstrates robustness to noisy SWE data.
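Two of the metrics named above can be written down compactly. The following is a minimal sketch in plain Python, assuming the standard textbook definitions of PSNR and IoU; this summary does not state the paper's exact conventions (for example, which peak value it uses for PSNR), and the function names `psnr` and `iou` are illustrative only. CNR, SSIM, and ASSD are not covered here.

```python
import math


def psnr(reference, estimate, peak):
    """Peak signal-to-noise ratio (dB) between two equally sized 2D maps.

    `peak` is the maximum possible value of the map; which value the
    paper uses is an assumption left to the caller.
    """
    ref = [v for row in reference for v in row]
    est = [v for row in estimate for v in row]
    mse = sum((r - e) ** 2 for r, e in zip(ref, est)) / len(ref)
    if mse == 0:
        return math.inf  # identical maps
    return 10.0 * math.log10(peak ** 2 / mse)


def iou(mask_a, mask_b):
    """Intersection over union of two binary masks (nested lists of 0/1)."""
    a = [bool(v) for row in mask_a for v in row]
    b = [bool(v) for row in mask_b for v in row]
    intersection = sum(x and y for x, y in zip(a, b))
    union = sum(x or y for x, y in zip(a, b))
    # Two empty masks are treated as a perfect match by convention.
    return intersection / union if union else 1.0
```

Higher PSNR means the reconstructed elasticity map is closer to the reference, and an IoU of 1.0 means the predicted and reference inclusion masks coincide exactly.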
Abstract
Objective: Ultrasound Shear Wave Elastography (SWE) demonstrates great potential in assessing soft-tissue pathology by mapping tissue stiffness, which is linked to malignancy. Traditional SWE methods have shown promise in estimating tissue elasticity, yet their susceptibility to noise interference, reliance on limited training data, and inability to generate segmentation masks concurrently present notable challenges to accuracy and reliability. Approach: In this paper, we propose SW-ViT, a novel two-stage deep learning framework for SWE that integrates a CNN-Spatio-Temporal Vision Transformer-based reconstruction network with an efficient Transformer-based post-denoising network. The first stage uses a 3D ResNet encoder with multi-resolution spatio-temporal Transformer blocks that capture spatial and temporal features, followed by a squeeze-and-excitation attention decoder that…
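The squeeze-and-excitation attention mentioned for the decoder can be illustrated with a generic channel-gating sketch. This is not the authors' implementation: it assumes the standard squeeze-and-excitation formulation (global average pooling, then two linear layers with ReLU and sigmoid, then channel-wise rescaling), works on 2D feature maps for brevity, and omits bias terms. The function name `squeeze_excite` and the weight matrices `w1` and `w2` are hypothetical.

```python
import math


def squeeze_excite(feature_maps, w1, w2):
    """Generic squeeze-and-excitation gating (illustrative, bias-free).

    feature_maps: list of C channels, each a 2D list of floats.
    w1: hidden x C weight matrix (reduction layer, hypothetical values).
    w2: C x hidden weight matrix (expansion layer, hypothetical values).
    Returns the rescaled feature maps and the per-channel gates.
    """
    # Squeeze: global average pooling gives one scalar per channel.
    squeezed = [
        sum(v for row in ch for v in row) / sum(len(row) for row in ch)
        for ch in feature_maps
    ]
    # Excitation: linear -> ReLU -> linear -> sigmoid.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[g * v for v in row] for row in ch]
        for ch, g in zip(feature_maps, gates)
    ]
    return scaled, gates
```

With all-zero weights every gate evaluates to 0.5, so each channel is simply halved; trained weights would instead emphasize informative channels and suppress the rest.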
Taxonomy
Topics: Photoacoustic and Ultrasonic Imaging · Ultrasound Imaging and Elastography · Infrared Thermography in Medicine
Methods: Attention Is All You Need · Average Pooling · Convolution · Global Average Pooling · Linear Layer · Layer Normalization · Byte Pair Encoding · Residual Connection · Kaiming Initialization · Dense Connections
