Enhancing Human-Computer Interaction in Chest X-ray Analysis using Vision and Language Model with Eye Gaze Patterns
Yunsoo Kim, Jinge Wu, Yusuf Abdulle, Yue Gao, Honghan Wu

TL;DR
This paper introduces a novel human-AI interaction method for chest X-ray analysis by integrating eye gaze data into vision-language models, significantly improving diagnostic accuracy and enhancing AI-radiologist collaboration.
Contribution
It presents a new approach that combines eye gaze heatmaps with vision-language models to improve medical image analysis and radiologist-AI interaction.
Findings
Eye gaze data improves analysis accuracy across tasks.
Models fine-tuned with gaze data outperform other models.
The approach enhances human-AI collaboration in medical diagnosis.
Abstract
Recent advancements in Computer Assisted Diagnosis have shown promising performance in medical imaging tasks, particularly in chest X-ray analysis. However, the interaction between these models and radiologists has been primarily limited to input images. This work proposes a novel approach to enhance human-computer interaction in chest X-ray analysis using Vision-Language Models (VLMs) enhanced with radiologists' attention by incorporating eye gaze data alongside textual prompts. Our approach leverages heatmaps generated from eye gaze data, overlaying them onto medical images to highlight areas of intense radiologist focus during chest X-ray evaluation. We evaluate this methodology in tasks such as visual question answering, chest X-ray report automation, error detection, and differential diagnosis. Our results demonstrate that the inclusion of eye gaze information significantly enhances…
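The heatmap overlay described in the abstract can be illustrated with a minimal sketch. The summary does not say how the authors build or render their heatmaps, so everything here is an assumption made for illustration: the function names gaze_heatmap and overlay_heatmap, the (x, y, duration) fixation format, the duration-weighted Gaussian with width sigma, and the red-channel alpha blend.

```python
import numpy as np


def gaze_heatmap(fixations, height, width, sigma=20.0):
    """Build a heatmap in [0, 1] from (x, y, duration) fixations.

    Assumption: each fixation contributes an isotropic Gaussian whose
    amplitude is proportional to its duration.
    """
    ys, xs = np.mgrid[0:height, 0:width]
    heat = np.zeros((height, width), dtype=np.float64)
    for x, y, duration in fixations:
        dist_sq = (xs - x) ** 2 + (ys - y) ** 2
        heat += duration * np.exp(-dist_sq / (2.0 * sigma ** 2))
    peak = heat.max()
    # Normalize so the most-attended pixel has value 1; leave all-zero maps as-is.
    return heat / peak if peak > 0 else heat


def overlay_heatmap(image, heat, alpha=0.5):
    """Blend a heatmap onto a grayscale image with values in [0, 1].

    Assumption: attention is rendered in red, with per-pixel opacity
    alpha * heat, so unattended regions keep their original intensity.
    """
    rgb = np.stack([image, image, image], axis=-1).astype(np.float64)
    weight = (alpha * heat)[..., None]
    red = np.array([1.0, 0.0, 0.0])
    blended = (1.0 - weight) * rgb + weight * red
    return np.clip(blended, 0.0, 1.0)


if __name__ == "__main__":
    # Synthetic stand-in for a chest X-ray: a uniform mid-gray image.
    image = np.full((128, 128), 0.5)
    # Hypothetical fixations as (x, y, duration in seconds).
    fixations = [(32, 40, 0.3), (90, 80, 1.2)]
    heat = gaze_heatmap(fixations, 128, 128, sigma=10.0)
    overlay = overlay_heatmap(image, heat)
    print(overlay.shape)
```

In a setup like this, the blended image rather than the raw X-ray would be passed to the vision-language model together with the text prompt, which is how gaze information could reach the model without any change to its architecture.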
Taxonomy
Topics: Educational Technology Systems
Methods: Focus
