Eye-gaze Guided Multi-modal Alignment for Medical Representation Learning
Chong Ma, Hanqi Jiang, Wenting Chen, Yiwei Li, Zihao Wu, Xiaowei Yu, Zhengliang Liu, Lei Guo, Dajiang Zhu, Tuo Zhang, Dinggang Shen, Tianming Liu, Xiang Li

TL;DR
This paper introduces a novel framework that uses eye-gaze data from radiologists to explicitly improve the alignment of visual and textual features in medical multi-modal learning, leading to better generalization and state-of-the-art results.
Contribution
The paper proposes the EGMA framework that leverages eye-gaze data for explicit multi-modal alignment in medical imaging, enhancing performance and generalization.
Findings
EGMA achieves state-of-the-art results on four medical datasets.
Incorporating eye-gaze data improves model generalization across datasets.
Eye-gaze data benefits model performance even when the amount available varies.
Abstract
In medical multi-modal frameworks, aligning cross-modality features is a significant challenge. Existing works learn features that are only implicitly aligned from the data, without considering the explicit relationships in the medical context. This reliance on data may lead to poor generalization of the learned alignment relationships. In this work, we propose the Eye-gaze Guided Multi-modal Alignment (EGMA) framework, which harnesses eye-gaze data to better align medical visual and textual features. We explore the natural auxiliary role of radiologists' eye-gaze data in aligning medical images and text, and introduce a novel approach that uses eye-gaze data collected synchronously from radiologists during diagnostic evaluations. We conduct downstream tasks of image classification and image-text retrieval on four medical datasets, where EGMA achieved…
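The abstract does not spell out how gaze enters the training objective, so the following is only a minimal sketch of one way "explicit" gaze-guided alignment could be expressed, not the EGMA loss itself. It assumes (hypothetically) that each report sentence comes with a gaze heatmap pooled onto image patches, and it penalizes the mismatch between that gaze distribution and the model's patch-sentence similarity. The function name `gaze_alignment_loss`, the tensor shapes, and the `temperature` parameter are all illustrative assumptions.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def gaze_alignment_loss(patch_feats, sent_feats, gaze_heat,
                        temperature=0.1, eps=1e-8):
    """Hypothetical gaze-guided alignment loss (illustration only).

    patch_feats: (P, D) image patch embeddings.
    sent_feats:  (S, D) sentence embeddings.
    gaze_heat:   (S, P) non-negative gaze dwell per sentence and patch
                 (assumed to be available; not specified in the abstract).
    Returns the mean cross-entropy between the normalized gaze map
    and the softmax over patch-sentence cosine similarities.
    """
    p = patch_feats / (np.linalg.norm(patch_feats, axis=1, keepdims=True) + eps)
    s = sent_feats / (np.linalg.norm(sent_feats, axis=1, keepdims=True) + eps)
    logits = (s @ p.T) / temperature            # (S, P) scaled cosine similarity
    pred = softmax(logits, axis=1)              # model's attention over patches
    target = gaze_heat / (gaze_heat.sum(axis=1, keepdims=True) + eps)
    return float(-(target * np.log(pred + eps)).sum(axis=1).mean())


# Toy usage: 4 patches, 2 sentences, each sentence matching one patch.
patches = np.eye(4)
sentences = np.eye(4)[:2]
gaze_match = np.array([[1.0, 0, 0, 0], [0, 1.0, 0, 0]])
gaze_mismatch = np.array([[0, 0, 0, 1.0], [0, 0, 1.0, 0]])
loss_match = gaze_alignment_loss(patches, sentences, gaze_match)
loss_mismatch = gaze_alignment_loss(patches, sentences, gaze_mismatch)
```

In this toy setup the loss is lower when the gaze map agrees with the feature similarity than when it points at unrelated patches, which is the behavior an explicit alignment signal is meant to encourage.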
Taxonomy
Topics: Multimodal Machine Learning Applications · Radiology practices and education
