Weighting What Matters: Boosting Sample Efficiency in Medical Report Generation via Token Reweighting
Alexander Weers, Daniel Rueckert, Martin J. Menten

TL;DR
This paper proposes a token reweighting technique for medical report generation: the training loss emphasizes clinically important tokens, which reduces the amount of training data needed.
Contribution
It introduces a simple weighted loss function that emphasizes salient tokens, enhancing sample efficiency in vision-language models for medical reports.
Findings
Achieves similar report quality with up to ten times less training data.
Improves data efficiency across multiple scales in ophthalmological report generation.
Shifts the training focus to clinically salient tokens, instead of treating all token prediction errors equally as standard cross-entropy does.
Abstract
Training vision-language models (VLMs) for medical report generation is often hindered by the scarcity of high-quality annotated data. This work evaluates the use of a weighted loss function to improve data efficiency. Compared to standard cross-entropy loss, which treats all token prediction errors equally, the reweighted loss shifts the focus to semantically salient tokens with outsized clinical importance. In experiments on ophthalmological report generation, we show that this simple method improves efficiency across multiple data scales, achieving similar report quality with up to ten times less training data.
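The contrast between standard and reweighted cross-entropy described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the example weights, and the choice to normalize by the sum of weights are all assumptions made for illustration, and the summary does not say how the paper assigns weights to tokens.

```python
import math

def token_log_probs(logits, targets):
    """Log-probability of each target token under a softmax over its logits."""
    out = []
    for row, t in zip(logits, targets):
        m = max(row)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        out.append(row[t] - log_z)
    return out

def weighted_cross_entropy(logits, targets, weights):
    """Weighted mean of per-token negative log-likelihoods.

    Uniform weights recover the standard (mean) cross-entropy.
    Normalizing by the weight sum is an assumption of this sketch.
    """
    log_probs = token_log_probs(logits, targets)
    total = sum(w * -lp for w, lp in zip(weights, log_probs))
    return total / sum(weights)

# Toy example: two tokens, vocabulary of size 3 (hypothetical values).
logits = [[2.0, 0.0, 0.0],   # token 0: the model favors the correct class
          [0.0, 0.0, 0.0]]   # token 1: the model is maximally uncertain
targets = [0, 1]

standard = weighted_cross_entropy(logits, targets, [1.0, 1.0])
# Treat token 1 as clinically salient and weight it five times higher.
reweighted = weighted_cross_entropy(logits, targets, [1.0, 5.0])
```

In this toy case the reweighted loss is larger than the standard one, because the poorly predicted token now dominates the average, so its error contributes more to the gradient.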
