GazeLT: Visual attention-guided long-tailed disease classification in chest radiographs

Moinak Bhattacharya; Gagandeep Singh; Shubham Jain; Prateek Prasanna

arXiv:2508.09478·cs.CV·August 14, 2025

TL;DR

GazeLT uses radiologist eye-gaze data, through a temporal integration-disintegration mechanism, to improve long-tailed disease classification in chest radiographs, capturing both major and minor findings.

Contribution

This work introduces GazeLT, a new deep learning framework that incorporates temporal visual attention patterns from radiologists to enhance long-tailed disease classification.

Findings

01

GazeLT outperforms the best long-tailed loss baseline by 4.1% in average accuracy.

02

It improves average accuracy by 21.7% over the visual attention-based baseline.

03

Validated on two large public datasets, demonstrating robustness.

Abstract

In this work, we present GazeLT, a human visual attention integration-disintegration approach for long-tailed disease classification. A radiologist's eye gaze has distinct patterns that capture both fine-grained and coarser-level disease-related information. A radiologist's attention varies over the course of interpreting an image, and it is critical to incorporate this variation into a deep learning framework to improve automated image interpretation. Another important aspect of visual attention is that, apart from looking at major or obvious disease patterns, experts also look at minor or incidental findings (a few of which constitute long-tailed classes) during image interpretation. GazeLT harnesses the temporal aspect of the visual search process, via an integration and disintegration mechanism, to improve long-tailed disease classification. We show the efficacy of GazeLT on…
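To make the "temporal aspect of the visual search process" concrete, here is a minimal Python sketch of one way gaze fixations could be split into time windows and turned into per-window attention maps. It is an illustration only and not the authors' integration-disintegration mechanism, which the abstract does not detail. The fixation layout `(x, y, start_time_s, duration_s)`, the function names `gaze_heatmap` and `temporal_heatmaps`, and the equal-width windowing are all assumptions made for this example.

```python
from typing import List, Tuple

# Hypothetical fixation record: (x, y, start_time_s, duration_s), with x and y
# normalized to [0, 1]. This layout is an assumption, not a dataset format.
Fixation = Tuple[float, float, float, float]
Heatmap = List[List[float]]


def gaze_heatmap(fixations: List[Fixation], grid: int) -> Heatmap:
    """Accumulate fixation durations on a grid x grid map, normalized to sum to 1."""
    heat = [[0.0] * grid for _ in range(grid)]
    for x, y, _, duration in fixations:
        row = min(int(y * grid), grid - 1)  # clamp so y == 1.0 stays in range
        col = min(int(x * grid), grid - 1)
        heat[row][col] += duration
    total = sum(sum(r) for r in heat)
    if total > 0:
        heat = [[v / total for v in r] for r in heat]
    return heat


def temporal_heatmaps(
    fixations: List[Fixation], n_windows: int, grid: int
) -> List[Heatmap]:
    """Split the viewing session into equal time windows; one heatmap per window."""
    if not fixations:
        return [gaze_heatmap([], grid) for _ in range(n_windows)]
    t0 = min(f[2] for f in fixations)
    t1 = max(f[2] for f in fixations)
    span = (t1 - t0) or 1.0  # avoid division by zero for a single time stamp
    windows: List[List[Fixation]] = [[] for _ in range(n_windows)]
    for f in fixations:
        idx = min(int((f[2] - t0) / span * n_windows), n_windows - 1)
        windows[idx].append(f)
    return [gaze_heatmap(w, grid) for w in windows]


# Toy session: two early fixations in the upper part of the image,
# two late fixations clustered near the bottom.
fixations = [
    (0.10, 0.10, 0.0, 0.5),
    (0.80, 0.20, 1.0, 0.3),
    (0.50, 0.90, 3.0, 0.2),
    (0.55, 0.95, 4.0, 0.6),
]
early_map, late_map = temporal_heatmaps(fixations, n_windows=2, grid=4)
```

In this toy session the early map spreads attention across two cells in the top row, while the late map concentrates on a single cell near the bottom. Calling `gaze_heatmap` with a larger or smaller `grid` gives a finer or coarser view of the same fixations, which loosely mirrors the fine-grained versus coarser-level distinction in the abstract.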
