Transformer-Based Classification Outcome Prediction for Multimodal Stroke Treatment
Danqing Ma, Meng Wang, Ao Xiang, Zongqing Qi, Qin Yang

TL;DR
This paper introduces Multitrans, a Transformer-based multi-modal fusion framework that combines CT images and clinical reports to improve the prediction of stroke treatment outcomes, demonstrating the benefits of multi-modal data integration.
Contribution
The study presents a novel multi-modal Transformer architecture that effectively combines imaging and clinical data for stroke outcome prediction, outperforming single-modality models.
Findings
Multi-modal fusion improves prediction accuracy over single modalities.
Text data alone outperforms image data in classification tasks.
Combining imaging with clinical reports enhances predictive performance.
Abstract
This study proposes Multitrans, a multi-modal fusion framework based on the Transformer architecture and the self-attention mechanism. The framework combines non-contrast computed tomography (NCCT) images with the discharge diagnosis reports of patients undergoing stroke treatment, and applies several Transformer-based methods to predict the functional outcomes of that treatment. The results show that single-modal text classification performs significantly better than single-modal image classification, and that the multi-modal combination performs better than either single modality. Although the Transformer model performs worse on imaging data alone, combining the images with clinical meta-diagnostic information lets the two modalities supply complementary information and contributes to more accurate prediction of stroke treatment outcomes.
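The fusion idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' Multitrans implementation: it assumes that image and text inputs have already been embedded as token vectors of the same width, that fusion is done by concatenating the two token sequences and applying one single-head self-attention layer, and that the outcome is binary. All names (`fuse_and_classify`, `params`, the weight matrices) and the random inputs are hypothetical.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def self_attention(tokens, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention over a token list."""
    Q = [matvec(Wq, t) for t in tokens]
    K = [matvec(Wk, t) for t in tokens]
    V = [matvec(Wv, t) for t in tokens]
    d = len(Q[0])
    out = []
    for q in Q:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in K]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, V)) for j in range(d)])
    return out


def fuse_and_classify(image_tokens, text_tokens, params):
    """Concatenate both modalities, attend jointly, mean-pool, classify."""
    tokens = image_tokens + text_tokens  # one joint sequence (assumed fusion scheme)
    attended = self_attention(tokens, params["Wq"], params["Wk"], params["Wv"])
    n, d = len(attended), len(attended[0])
    pooled = [sum(t[j] for t in attended) / n for j in range(d)]
    return softmax(matvec(params["Wc"], pooled))


def random_matrix(rows, cols, rng):
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
dim, n_classes = 8, 2  # hypothetical sizes; binary outcome is an assumption
params = {
    "Wq": random_matrix(dim, dim, rng),
    "Wk": random_matrix(dim, dim, rng),
    "Wv": random_matrix(dim, dim, rng),
    "Wc": random_matrix(n_classes, dim, rng),
}
image_tokens = random_matrix(4, dim, rng)  # stand-in for NCCT patch embeddings
text_tokens = random_matrix(6, dim, rng)   # stand-in for report token embeddings
probs = fuse_and_classify(image_tokens, text_tokens, params)
```

Because every token attends to every other token in the joint sequence, image tokens can draw on text tokens and vice versa, which is one plausible way for the two modalities to exchange complementary information. The weights here are untrained, so `probs` is only a well-formed probability vector, not a meaningful prediction.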
Taxonomy
Topics: Acute Ischemic Stroke Management
Methods: Attention Is All You Need · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Absolute Position Encodings · Dropout · Dense Connections · Label Smoothing · Residual Connection · Softmax · Adam
