Random Token Fusion for Multi-View Medical Diagnosis
Jingyu Guo, Christos Matsoukas, Fredrik Strand, Kevin Smith

TL;DR
This paper introduces Random Token Fusion (RTF), a novel training technique for vision transformers that improves multi-view medical diagnosis accuracy and robustness by reducing overfitting without extra inference cost.
Contribution
The paper proposes RTF, a new method that incorporates randomness into feature fusion during training to enhance multi-view medical image analysis.
Findings
RTF improves diagnostic accuracy on standard mammography and chest X-ray benchmark datasets.
RTF reduces overfitting and makes diagnostic models more robust.
RTF consistently improves the performance of existing fusion methods, at no additional inference cost.
Abstract
In multi-view medical diagnosis, deep learning-based models often fuse information from different imaging perspectives to improve diagnostic performance. However, existing approaches are prone to overfitting and rely heavily on view-specific features, which can lead to trivial solutions. In this work, we introduce Random Token Fusion (RTF), a novel technique designed to enhance multi-view medical image analysis using vision transformers. By integrating randomness into the feature fusion process during training, RTF addresses the issue of overfitting and enhances the robustness and accuracy of diagnostic models without incurring any additional cost at inference. We validate our approach on standard mammography and chest X-ray benchmark datasets. Through extensive experiments, we demonstrate that RTF consistently improves the performance of existing fusion methods, paving the way for a…
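The abstract does not spell out the fusion rule, so the following is only a minimal sketch of one plausible reading of "integrating randomness into the feature fusion process during training". It assumes two views whose vision transformer encoders produce token sequences of equal length. During training, each token position is drawn at random from one view or the other. At inference, a deterministic fusion is used instead, so no randomness or extra computation is added. The function name random_token_fusion, the 50/50 sampling probability, and the element-wise mean used at inference are all illustrative assumptions, not details taken from the paper.

```python
import random


def random_token_fusion(tokens_a, tokens_b, training, rng=None):
    """Fuse two views' token sequences (lists of feature vectors).

    Assumptions (hypothetical, not from the paper):
      * training: each token position is copied from view A or view B
        with equal probability;
      * inference: the two views are fused deterministically by an
        element-wise mean, standing in for whatever fusion the base
        model already uses.
    """
    if len(tokens_a) != len(tokens_b):
        raise ValueError("both views must have the same number of tokens")

    if training:
        if rng is None:
            rng = random.Random()
        # Per-position coin flip: keep the token from A or from B.
        return [
            list(a) if rng.random() < 0.5 else list(b)
            for a, b in zip(tokens_a, tokens_b)
        ]

    # Deterministic path used at inference: no sampling involved.
    return [
        [(x + y) / 2 for x, y in zip(a, b)]
        for a, b in zip(tokens_a, tokens_b)
    ]


if __name__ == "__main__":
    view_a = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
    view_b = [[-1.0, -1.0], [-2.0, -2.0], [-3.0, -3.0]]
    print(random_token_fusion(view_a, view_b, training=True))
    print(random_token_fusion(view_a, view_b, training=False))
```

Because the model never knows in advance which view a given token will come from, it cannot lean on a single view's features, which is the intuition behind the overfitting claim in the abstract. The actual sampling scheme and inference-time fusion used by the authors may differ from this sketch.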
Taxonomy
Topics: Brain Tumor Detection and Classification
