Beyond Augmentation: Cross-Modal Transformer Fusion with Bi-directional Attention for Low-Data Aneurysm Screening

Antara Titikhsha; Divyanshu Tak

arXiv:2512.22185·cs.CV·April 17, 2026

TL;DR

This paper introduces CMTF-Net, a cross-modal transformer fusion model for low-data aneurysm screening on CTA. It supervises 14 vascular territories independently, which encodes Circle of Willis anatomy and yields interpretable, spatially localized activations.

Contribution

The paper reframes aneurysm screening as anatomically structured reasoning over vascular territories, rather than a single global binary classification, with the aim of improving detection accuracy and interpretability in low-data scenarios.

Findings

01 Achieves near-perfect AUC-ROC in aneurysm screening

02 Maintains high precision despite class imbalance

03 Provides spatially localized activation maps for interpretability

Abstract

Intracranial aneurysm rupture causes subarachnoid hemorrhage with mortality near 50%, making early detection critical. Although CTA enables rapid screening, detecting small aneurysms within the complex three-dimensional branching of the Circle of Willis remains expertise-dependent. Existing automated systems are constrained by class imbalance, skull-base artifacts that mimic vascular contrast, and reliance on global binary classification without structured localization, limiting surgical relevance and interpretability. We propose CMTF-Net, a cross-modal target fusion framework that reframes aneurysm screening as anatomically structured reasoning. By supervising 14 vascular territories independently, the network encodes Circle of Willis geometry while allowing multi-segment activation, aligning model design with clinical workflow. CMTF-Net achieves near-perfect AUC-ROC with narrow…
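
The idea of supervising 14 vascular territories independently while allowing multi-segment activation can be illustrated with a minimal multi-label sketch. This is not the authors' implementation: the function names, the per-territory sigmoid head, the optional `pos_weight` term for class imbalance, and the 0.5 threshold are all assumptions made for illustration. Only the territory count of 14 comes from the abstract.

```python
import math

NUM_TERRITORIES = 14  # from the abstract; everything else is an illustrative assumption


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def territory_bce(logits, targets, pos_weight=None):
    """Mean binary cross-entropy over independently supervised territories.

    logits, targets: sequences of length NUM_TERRITORIES (targets are 0 or 1).
    pos_weight: optional per-territory weight on positive labels, a common
    (assumed, not source-confirmed) way to counter class imbalance.
    """
    if len(logits) != NUM_TERRITORIES or len(targets) != NUM_TERRITORIES:
        raise ValueError("expected one logit and one label per territory")
    if pos_weight is None:
        pos_weight = [1.0] * NUM_TERRITORIES
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for z, y, w in zip(logits, targets, pos_weight):
        p = sigmoid(z)
        total += -(w * y * math.log(p + eps) + (1 - y) * math.log(1.0 - p + eps))
    return total / NUM_TERRITORIES


def active_territories(logits, threshold=0.5):
    """Indices of territories whose probability exceeds the threshold.

    Each territory has its own sigmoid (no softmax across territories),
    so several territories can be active at once.
    """
    return [i for i, z in enumerate(logits) if sigmoid(z) > threshold]


# Hypothetical example: two territories (indices 2 and 9) carry an aneurysm.
logits = [-3.0] * NUM_TERRITORIES
logits[2] = 2.5
logits[9] = 1.8
targets = [0] * NUM_TERRITORIES
targets[2] = 1
targets[9] = 1

flagged = active_territories(logits)
loss = territory_bce(logits, targets)
```

Using one sigmoid per territory instead of a softmax across territories is what lets more than one segment be flagged for the same scan. A softmax would force the territories to compete for a single label.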
