DXM-TransFuse U-net: Dual Cross-Modal Transformer Fusion U-net for Automated Nerve Identification
Baijun Xie, Gary Milam, Bo Ning, Jaepyeong Cha, Chung Hyuk Park

TL;DR
This paper introduces DXM-TransFuse U-net, a deep learning model that combines multi-modal optical imaging and Transformer-based fusion within a U-Net architecture to improve intraoperative nerve identification accuracy.
Contribution
It proposes a novel dual cross-modal Transformer fusion module integrated into U-Net for enhanced nerve tissue detection from multi-modal imaging data.
Findings
Improved nerve identification accuracy over existing methods
Effective multi-modal feature fusion via Transformer blocks
Potential for noninvasive intraoperative nerve detection
Abstract
Accurate nerve identification is critical during surgical procedures to prevent damage to nerve tissues. Nerve injuries can lead to long-term detrimental effects for patients as well as financial burdens. In this study, we develop a deep-learning framework using the U-Net architecture with a Transformer block-based fusion module at the bottleneck to identify nerve tissues from a multi-modal optical imaging system. By extracting the feature maps of each modality independently and using each modality's information for cross-modal interactions, we aim to provide a solution that further increases the effectiveness of imaging systems in enabling noninvasive intraoperative nerve identification.
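The abstract describes per-modality feature extraction followed by Transformer-based cross-modal interaction at the U-Net bottleneck. The following is a minimal NumPy sketch of that idea, not the authors' implementation: it assumes two modalities, single-head attention, random projection weights, and channel-wise concatenation of the two attended streams. All function and parameter names (`cross_attention`, `dual_cross_modal_fuse`, `a_from_b`, `b_from_a`) are hypothetical, and the layer normalization, multi-head splitting, and feed-forward layers of a full Transformer block are omitted for brevity.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(query_tokens, context_tokens, w_q, w_k, w_v):
    # Queries come from one modality; keys and values come from the other.
    q = query_tokens @ w_q
    k = context_tokens @ w_k
    v = context_tokens @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])  # scaled dot-product
    return softmax(scores, axis=-1) @ v

def dual_cross_modal_fuse(feat_a, feat_b, params):
    # feat_a, feat_b: bottleneck feature maps of shape (C, H, W),
    # one per modality (assumed to share the same shape).
    c, h, w = feat_a.shape
    tok_a = feat_a.reshape(c, h * w).T  # (H*W, C) tokens
    tok_b = feat_b.reshape(c, h * w).T
    # Each modality attends to the other, with a residual connection.
    a_att = tok_a + cross_attention(tok_a, tok_b, *params["a_from_b"])
    b_att = tok_b + cross_attention(tok_b, tok_a, *params["b_from_a"])
    # Assumed fusion rule: concatenate along channels, then restore the grid.
    fused = np.concatenate([a_att, b_att], axis=1)  # (H*W, 2C)
    return fused.T.reshape(2 * c, h, w)

rng = np.random.default_rng(0)
C, H, W = 8, 4, 4
# Hypothetical random projection weights (W_q, W_k, W_v) per direction.
params = {
    name: tuple(rng.normal(scale=C ** -0.5, size=(C, C)) for _ in range(3))
    for name in ("a_from_b", "b_from_a")
}
feat_a = rng.normal(size=(C, H, W))  # stand-in for modality A encoder output
feat_b = rng.normal(size=(C, H, W))  # stand-in for modality B encoder output
fused = dual_cross_modal_fuse(feat_a, feat_b, params)
print(fused.shape)  # (16, 4, 4)
```

In a U-Net, a fused map like this would be passed to the decoder in place of a single-modality bottleneck; how the paper actually merges the two attended streams is not stated in the abstract, so the concatenation here is only one plausible choice.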
Taxonomy
Topics: Medical Imaging and Analysis · Optical Coherence Tomography Applications · Optical Imaging and Spectroscopy Techniques
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Absolute Position Encodings · Residual Connection · Layer Normalization · Convolution
