Enhancing Human-Computer Interaction in Chest X-ray Analysis using Vision and Language Model with Eye Gaze Patterns

Yunsoo Kim; Jinge Wu; Yusuf Abdulle; Yue Gao; Honghan Wu

arXiv:2404.02370·cs.CV·April 4, 2024·1 cites

Open Access

TL;DR

This paper introduces a novel human-AI interaction method for chest X-ray analysis by integrating eye gaze data into vision-language models, significantly improving diagnostic accuracy and enhancing AI-radiologist collaboration.

Contribution

It presents a new approach that combines eye gaze heatmaps with vision-language models to improve medical image analysis and radiologist-AI interaction.

Findings

01 Eye gaze data improves analysis accuracy across tasks.

02 Fine-tuning with gaze data outperforms other models.

03 Enhanced human-AI collaboration in medical diagnosis.

Abstract

Recent advancements in Computer Assisted Diagnosis have shown promising performance in medical imaging tasks, particularly in chest X-ray analysis. However, the interaction between these models and radiologists has been primarily limited to input images. This work proposes a novel approach to enhance human-computer interaction in chest X-ray analysis using Vision-Language Models (VLMs) enhanced with radiologists' attention by incorporating eye gaze data alongside textual prompts. Our approach leverages heatmaps generated from eye gaze data, overlaying them onto medical images to highlight areas of intense radiologist focus during chest X-ray evaluation. We evaluate this methodology in tasks such as visual question answering, chest X-ray report automation, error detection, and differential diagnosis. Our results demonstrate that the inclusion of eye gaze information significantly enhances…
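
The heatmap overlay described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes gaze data arrives as (x, y, duration) fixations, that each fixation is spread with an isotropic Gaussian weighted by its duration, and that the result is blended into the red channel of a grayscale image. The function names `gaze_heatmap` and `overlay_heatmap`, and the `sigma` and `alpha` defaults, are hypothetical choices made for illustration only.

```python
import numpy as np


def gaze_heatmap(fixations, height, width, sigma=25.0):
    """Accumulate duration-weighted Gaussians at each fixation point.

    fixations: iterable of (x, y, duration) in pixel coordinates (assumed format).
    Returns a (height, width) array scaled to [0, 1].
    """
    ys, xs = np.mgrid[0:height, 0:width]
    heat = np.zeros((height, width), dtype=np.float64)
    for x, y, duration in fixations:
        dist_sq = (xs - x) ** 2 + (ys - y) ** 2
        heat += duration * np.exp(-dist_sq / (2.0 * sigma ** 2))
    peak = heat.max()
    # Leave an all-zero map untouched to avoid dividing by zero.
    return heat / peak if peak > 0 else heat


def overlay_heatmap(image, heat, alpha=0.4):
    """Blend a heatmap into a grayscale image as a red tint.

    image: 2D array with values in [0, 1].
    heat:  2D array of the same shape with values in [0, 1].
    Returns an RGB array of shape (height, width, 3) in [0, 1].
    """
    rgb = np.stack([image, image, image], axis=-1)
    red = np.zeros_like(rgb)
    red[..., 0] = 1.0
    # Per-pixel blend weight: strongest where gaze was most concentrated.
    weight = (alpha * heat)[..., None]
    return (1.0 - weight) * rgb + weight * red


if __name__ == "__main__":
    # Synthetic stand-in for a chest X-ray and two made-up fixations.
    xray = np.full((256, 256), 0.5)
    fixations = [(80, 120, 0.6), (170, 140, 1.2)]
    heat = gaze_heatmap(fixations, 256, 256)
    blended = overlay_heatmap(xray, heat)
    print(blended.shape)
```

In this sketch the blended image, rather than the raw X-ray, would be what is passed to the model alongside the text prompt. How the paper actually renders, colors, and scales its heatmaps is not stated in the abstract.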

Taxonomy

Topics: Educational Technology Systems

Methods: Focus