Mining Gaze for Contrastive Learning toward Computer-Assisted Diagnosis
Zihao Zhao, Sheng Wang, Qian Wang, Dinggang Shen

TL;DR
This paper introduces McGIP, a contrastive pre-training method that uses passively collected radiologist gaze data, rather than radiology reports, to select positive pairs when pre-training medical image diagnosis models.
Contribution
The paper proposes McGIP, a gaze-based contrastive learning approach that works as a plug-and-play module for medical image analysis and does not rely on large-scale reports.
Findings
McGIP improves diagnostic accuracy in medical imaging tasks.
Gaze-based positive pair selection enhances contrastive pre-training.
The method demonstrates high potential across different medical image types.
Abstract
Obtaining large-scale radiology reports can be difficult for medical images due to various reasons, limiting the effectiveness of contrastive pre-training in the medical image domain and underscoring the need for alternative methods. In this paper, we propose eye-tracking as an alternative to text reports, as it allows for the passive collection of gaze signals without disturbing radiologists' routine diagnosis process. By tracking the gaze of radiologists as they read and diagnose medical images, we can understand their visual attention and clinical reasoning. When a radiologist has similar gazes for two medical images, it may indicate semantic similarity for diagnosis, and these images should be treated as positive pairs when pre-training a computer-assisted diagnosis (CAD) network through contrastive learning. Accordingly, we introduce the Medical contrastive Gaze Image Pre-training…
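The idea of treating images with similar gaze as positive pairs can be illustrated with a minimal sketch. The abstract above is truncated and does not state how McGIP measures gaze similarity or which loss it uses, so everything below is an assumption made for illustration: gaze is represented as a 2D heatmap per image, similarity is cosine similarity between flattened heatmaps, the 0.8 threshold is arbitrary, and the loss is a generic multi-positive (supervised-contrastive-style) objective rather than the authors' formulation. The function names are hypothetical.

```python
import numpy as np


def gaze_similarity(heatmaps):
    """Pairwise cosine similarity between flattened gaze heatmaps of shape (N, H, W).

    Assumption: gaze is summarized as one heatmap per image.
    """
    flat = heatmaps.reshape(len(heatmaps), -1).astype(np.float64)
    flat /= np.linalg.norm(flat, axis=1, keepdims=True) + 1e-12
    return flat @ flat.T


def positive_mask(sim, threshold=0.8):
    """Mark image pairs whose gaze similarity reaches a (hypothetical) threshold."""
    mask = sim >= threshold
    np.fill_diagonal(mask, False)  # an image is not its own positive
    return mask


def multi_positive_contrastive_loss(embeddings, mask, temperature=0.1):
    """Generic contrastive loss that averages over all positives of each anchor.

    This is a stand-in objective, not the loss used in the paper.
    """
    n_pos = mask.sum(axis=1)
    has_pos = n_pos > 0
    if not has_pos.any():
        return 0.0  # no gaze-similar pairs in this batch

    z = embeddings / (np.linalg.norm(embeddings, axis=1, keepdims=True) + 1e-12)
    logits = z @ z.T / temperature
    np.fill_diagonal(logits, -np.inf)  # exclude self-comparison

    # Numerically stable log-softmax over all other samples in the batch.
    row_max = logits.max(axis=1, keepdims=True)
    shifted = logits - row_max
    log_prob = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

    # Mean log-probability of the positives, for anchors that have any.
    pos_log_prob = np.where(mask, log_prob, 0.0).sum(axis=1)
    return float(-(pos_log_prob[has_pos] / n_pos[has_pos]).mean())
```

In such a setup, the mask would be computed once per batch from the recorded gaze, and the loss would be applied to the image encoder's embeddings. Because the positive-pair selection is separate from the loss, it could in principle be combined with different contrastive frameworks, which is consistent with the plug-and-play framing above.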
Taxonomy
Topics: Gaze Tracking and Assistive Technology · Multimodal Machine Learning Applications · Image Retrieval and Classification Techniques
Methods: Contrastive Learning
