TL;DR
SemEnrich introduces a self-supervised semantic clustering method for enriching radiology reports. It improves vision-language model performance by adding positive/neutral findings to training reports and by integrating cluster information into training rewards.
Contribution
The paper presents a novel semantic clustering approach for enriching medical reports and enhancing vision-language learning, with demonstrated performance improvements.
Findings
Supervised fine-tuning achieved average gains of 5.63% on COMET score and 7.47% on RadGraph-F1.
Semantic clustering outperforms random augmentation in improving model performance.
Incorporating cluster information into reward design further boosts scores by up to 12.80%.
Abstract
Medical vision-language datasets are often limited in size and biased toward negative findings: clinicians mostly report abnormalities and may omit positive/neutral findings that they consider irrelevant to the patient's condition. We propose a self-supervised data enrichment method that leverages semantic clustering of report sentences. We then enrich the findings in the training-set reports by adding positive/neutral observations from different clusters. Our approach yields consistent gains in supervised fine-tuning, with average gains of 5.63% on COMET, 3.04% on BERTScore, 7.40% on Sentence-BLEU, 5.30% on CheXbert-F1, and 7.47% on RadGraph-F1. Ablation studies confirm that the improvements stem from semantic clustering rather than random augmentation. Furthermore, we introduce a way to incorporate semantic…
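The enrichment step described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not say which sentence representation or clustering algorithm is used, so the code substitutes word-overlap (Jaccard) similarity and a greedy threshold clustering, and it uses a hypothetical keyword filter (`NORMAL_CUES`) to decide which sentences count as positive/neutral. All function names, the threshold, and the example sentences are assumptions made for illustration.

```python
import random
import re

# Hypothetical cue words marking a positive/neutral (normal) observation.
NORMAL_CUES = {"no", "normal", "unremarkable", "clear"}

def tokens(sentence):
    """Lowercased word set; a crude stand-in for a sentence embedding."""
    return set(re.findall(r"[a-z]+", sentence.lower()))

def jaccard(a, b):
    """Word-overlap similarity between two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster_sentences(sentences, threshold=0.5):
    """Greedy clustering: a sentence joins the first cluster whose seed
    (first member) is similar enough, otherwise it starts a new cluster."""
    clusters = []
    for s in sentences:
        for c in clusters:
            if jaccard(tokens(s), tokens(c[0])) >= threshold:
                c.append(s)
                break
        else:
            clusters.append([s])
    return clusters

def assign(sentence, clusters, threshold=0.5):
    """Index of the most similar cluster seed, or None if none is close enough."""
    best, best_sim = None, threshold
    for i, c in enumerate(clusters):
        sim = jaccard(tokens(sentence), tokens(c[0]))
        if sim >= best_sim:
            best, best_sim = i, sim
    return best

def is_normal(sentence):
    """Assumed filter: the sentence contains at least one cue word."""
    return bool(tokens(sentence) & NORMAL_CUES)

def enrich(report, clusters, max_added=2, seed=0):
    """Append positive/neutral sentences from clusters the report does not cover."""
    rng = random.Random(seed)
    covered = {assign(s, clusters) for s in report}
    added = []
    for i, c in enumerate(clusters):
        if i in covered or len(added) >= max_added:
            continue
        candidates = [s for s in c if is_normal(s)]
        if candidates:
            added.append(rng.choice(candidates))
    return report + added

# Made-up training sentences, for illustration only.
corpus = [
    "No pleural effusion is seen.",
    "No pleural effusion.",
    "The heart size is normal.",
    "Heart size is normal.",
    "There is a right lower lobe consolidation.",
    "There is no pneumothorax.",
]
clusters = cluster_sentences(corpus)
report = ["There is a right lower lobe consolidation."]
enriched = enrich(report, clusters)
print(enriched)
```

The original report sentences are kept first and unchanged. Added sentences come only from clusters the report does not already cover, which is the property the ablation against random augmentation would isolate.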
