# Joint Imbalance Adaptation for Radiology Report Generation

**Authors:** Wang Li, Guangzeng Han, Yuexin Wu, I.-Chan Huang, Xiaolei Huang

PMC · DOI: 10.1007/s41666-025-00205-9 · Journal of Healthcare Informatics Research · 2025-06-20

## TL;DR

This paper introduces a new method to improve radiology report generation by addressing data imbalance issues in medical tokens and labels.

## Contribution

The novel JIMA model uses a hard-to-easy learning strategy to reduce overfitting on frequent patterns and improve performance on infrequent medical terms.

## Key findings

- JIMA improves evaluation metrics by 16.75–50.50% on average on the IU X-ray and MIMIC-CXR datasets.
- The model enhances performance on infrequent tokens and abnormal radiological entries.
- Ablation analysis and human evaluations indicate that these gains can also lead to more clinically accurate reports.

## Abstract

Radiology report generation, which translates radiological images into precise and clinically relevant descriptions, may face a data imbalance challenge: medical tokens appear less frequently than regular tokens, and normal entries significantly outnumber abnormal ones. However, very few studies consider these imbalance issues, let alone the joint effect of multiple imbalance factors. In this study, we propose a Joint Imbalance Adaptation (JIMA) model to promote task robustness by leveraging token and label imbalance. We employ a hard-to-easy learning strategy that mitigates overfitting to frequent labels and tokens, thereby encouraging the model to focus more on infrequent labels and clinical tokens. JIMA presents notable improvements (16.75–50.50% on average) across evaluation metrics on the IU X-ray and MIMIC-CXR datasets. Our ablation analysis and human evaluations show that the improvements mainly come from enhancing performance on infrequent tokens and abnormal radiological entries, which can also lead to more clinically accurate reports. While data imbalance (e.g., infrequent tokens and abnormal labels) can lead to the underperformance of radiology report generation, our imbalance learning strategy opens promising directions on how to counter data imbalance by reducing overfitting on frequent patterns and underfitting on infrequent patterns.
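
To make the hard-to-easy idea concrete, the following is a minimal sketch and not the JIMA implementation. This summary does not say how the paper measures difficulty, so the sketch assumes a simple static proxy: a report counts as harder when its tokens are rarer in the corpus. The function names (`token_weights`, `difficulty`, `hard_to_easy`) and the toy reports are hypothetical, and reports are assumed to be non-empty, whitespace-tokenized strings.

```python
from collections import Counter
import math


def token_weights(reports):
    """Log inverse-frequency weight per token: rarer tokens get larger weights."""
    counts = Counter(tok for r in reports for tok in r.split())
    total = sum(counts.values())
    return {tok: math.log(total / c) for tok, c in counts.items()}


def difficulty(report, weights):
    """Mean token weight; reports dominated by rare tokens score as harder.

    Assumption: token rarity stands in for difficulty. The paper may use a
    different, model-based signal.
    """
    toks = report.split()
    return sum(weights[t] for t in toks) / len(toks)


def hard_to_easy(reports):
    """Order reports from hardest (rarest tokens) to easiest."""
    weights = token_weights(reports)
    return sorted(reports, key=lambda r: difficulty(r, weights), reverse=True)


# Toy corpus: normal findings dominate, abnormal findings are rare.
reports = [
    "no acute disease",
    "no acute disease",
    "no pneumothorax",
    "small left pleural effusion",
]
ordered = hard_to_easy(reports)
```

In this toy corpus the abnormal report, made up entirely of tokens that occur once, is ordered first, while the repeated normal reports come last. A training loop could present or weight examples in that order so that frequent patterns do not dominate early updates.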

## Full-text entities

- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12602799/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12602799/full.md

## References

13 references — full list in the complete paper: https://tomesphere.com/paper/PMC12602799/full.md

---
Source: https://tomesphere.com/paper/PMC12602799