CXR-CML: Improved zero-shot classification of long-tailed multi-label diseases in Chest X-Rays

Rajesh Madhipati; Sheethal Bhat; Lukas Buess; Andreas Maier

arXiv:2507.19398 · cs.CV · July 28, 2025

Open Access

TL;DR

This paper introduces a class-weighted, clustering-based method to improve zero-shot classification of long-tailed diseases in Chest X-Rays, significantly enhancing recognition of rare classes in imbalanced datasets.

Contribution

The paper proposes a class-weighting mechanism, combined with Gaussian mixture model (GMM) clustering and a metric loss, to improve zero-shot multi-label disease classification in chest X-ray images.
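To make the idea of class weighting concrete, here is a minimal sketch of one generic scheme: inverse-frequency weights applied to a binary cross-entropy loss. This summary does not give the paper's weighting formula, and the sketch leaves out the GMM clustering and metric loss entirely, so the function names `inverse_frequency_weights` and `weighted_bce` and the formula itself are illustrative assumptions, not the authors' method.

```python
import math


def inverse_frequency_weights(labels):
    """Return one weight per class from a multi-label 0/1 matrix.

    labels: list of rows, each row a list of 0/1 flags (one per class).
    Assumption: plain inverse-frequency weighting, not the paper's scheme.
    """
    n_samples = len(labels)
    n_classes = len(labels[0])
    counts = [sum(row[c] for row in labels) for c in range(n_classes)]
    # Rarer classes get larger weights; max(count, 1) avoids division by zero.
    raw = [n_samples / max(count, 1) for count in counts]
    mean_raw = sum(raw) / n_classes
    # Normalise so that the weights average to 1.
    return [w / mean_raw for w in raw]


def weighted_bce(probs, targets, weights, eps=1e-7):
    """Class-weighted binary cross-entropy for a single multi-label sample."""
    total = 0.0
    for p, t, w in zip(probs, targets, weights):
        p = min(max(p, eps), 1.0 - eps)  # clamp to keep log() finite
        total += -w * (t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(probs)


# Toy data: class 0 appears in every sample, class 1 in only one.
toy_labels = [[1, 0], [1, 0], [1, 0], [1, 1]]
toy_weights = inverse_frequency_weights(toy_labels)
```

With this toy label matrix the rare class receives four times the weight of the common one, so errors on it contribute more to the loss.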

Findings

01

Achieved a 7% average increase in zero-shot AUC scores across 40 classes.

02

Effectively improves recognition of long-tailed, rare disease classes.

03

Demonstrates robustness and stability in feature clustering and classification.
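As a point of reference for the AUC figure above, the following sketch shows how an average AUC across classes can be computed in a multi-label setting. It assumes an unweighted (macro) mean over classes, which this summary does not confirm; `binary_auc` and `macro_auc` are hypothetical helper names, and the toy scores are made up for illustration.

```python
def binary_auc(scores, targets):
    """AUC for one class: the fraction of (positive, negative) pairs
    in which the positive sample scores higher (ties count as half)."""
    pos = [s for s, t in zip(scores, targets) if t == 1]
    neg = [s for s, t in zip(scores, targets) if t == 0]
    if not pos or not neg:
        return None  # AUC is undefined without both classes present
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def macro_auc(score_rows, target_rows):
    """Unweighted mean of per-class AUCs, skipping undefined classes."""
    n_classes = len(score_rows[0])
    aucs = []
    for c in range(n_classes):
        auc = binary_auc([row[c] for row in score_rows],
                         [row[c] for row in target_rows])
        if auc is not None:
            aucs.append(auc)
    return sum(aucs) / len(aucs)


# Toy example: four images, two classes.
toy_scores = [[0.9, 0.2], [0.8, 0.7], [0.1, 0.6], [0.3, 0.1]]
toy_targets = [[1, 0], [1, 1], [0, 0], [0, 1]]
toy_macro_auc = macro_auc(toy_scores, toy_targets)
```

Because every class contributes equally to a macro average, gains on rare classes move the reported number as much as gains on common ones.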

Abstract

Chest radiography (CXR) plays a crucial role in the diagnosis of various diseases. However, the inherent class imbalance in the distribution of clinical findings presents a significant challenge for current self-supervised deep learning models, which often fail to classify long-tailed classes accurately. Current vision-language models such as Contrastive Language-Image Pre-training (CLIP) models effectively model the manifold distribution of the latent space, enabling high zero-shot classification accuracies. Although CLIP performs well on most of the primary classes in the dataset, our work reveals that its effectiveness decreases significantly for classes with a long-tailed distribution. Our approach employs a class-weighting mechanism that directly aligns with the distribution of classes within the latent space. This method ensures a substantial improvement in overall…
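For readers unfamiliar with CLIP-style zero-shot classification, the sketch below shows one common way to score a single finding: compare the image embedding with the embeddings of a positive and a negative text prompt and take a two-way softmax. The abstract does not say how the authors form prompts or scores, so the prompt pairing, the `temperature` value, and the function names are all assumptions, and the embeddings here are toy vectors rather than outputs of a real encoder.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def zero_shot_probability(image_emb, pos_text_emb, neg_text_emb,
                          temperature=0.07):
    """Probability that a finding is present, from a two-way softmax over
    the similarities to a positive and a negative prompt embedding.

    Assumption: prompts such as "<finding>" vs. "no <finding>" were encoded
    elsewhere; temperature=0.07 is an illustrative value.
    """
    pos_logit = cosine(image_emb, pos_text_emb) / temperature
    neg_logit = cosine(image_emb, neg_text_emb) / temperature
    # A softmax over two logits equals a sigmoid of their difference.
    return 1.0 / (1.0 + math.exp(neg_logit - pos_logit))


# Toy embeddings: the image lies close to the positive prompt.
toy_image = [1.0, 0.1]
toy_positive = [1.0, 0.0]
toy_negative = [0.0, 1.0]
toy_prob = zero_shot_probability(toy_image, toy_positive, toy_negative)
```

In a multi-label setting this score is computed independently for each finding, so no class competes with another for probability mass.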

Taxonomy

Topics: COVID-19 diagnosis using AI · AI in cancer detection · Domain Adaptation and Few-Shot Learning