Evaluating Vision Language Model Adaptations for Radiology Report Generation in Low-Resource Languages
Marco Salmè, Rosa Sicilia, Paolo Soda, Valerio Guarrasi

TL;DR
This paper evaluates the performance of instruction-tuned Vision-Language Models in generating radiology reports across low-resource languages, emphasizing the importance of linguistic and domain-specific adaptations for improved accuracy.
Contribution
It introduces a comprehensive benchmark for assessing VLMs in low-resource radiology report generation and analyzes effective adaptation strategies for multilingual healthcare applications.
Findings
Language-specific models outperform general and domain-specific models.
Fine-tuning with medical terminology improves report quality.
Model temperature influences report coherence.
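The temperature finding can be illustrated with a minimal sketch of temperature-scaled sampling probabilities. This is not the paper's code: the function name and the logit values are hypothetical, chosen only to show how a lower temperature concentrates probability on the top-scoring token while a higher one spreads it across alternatives, which is one plausible reason temperature would affect report coherence.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw token scores into probabilities after dividing by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate next tokens (illustrative only).
logits = [2.0, 1.0, 0.1]

low = softmax_with_temperature(logits, 0.2)   # sharper distribution
high = softmax_with_temperature(logits, 2.0)  # flatter distribution

print("T=0.2:", [round(p, 3) for p in low])
print("T=2.0:", [round(p, 3) for p in high])
```

At the lower temperature nearly all probability mass falls on the highest-scoring token, so generation is close to deterministic; at the higher temperature less likely tokens are sampled more often.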
Abstract
The integration of artificial intelligence in healthcare has opened new horizons for improving medical diagnostics and patient care. However, challenges persist in developing systems capable of generating accurate and contextually relevant radiology reports, particularly in low-resource languages. In this study, we present a comprehensive benchmark to evaluate the performance of instruction-tuned Vision-Language Models (VLMs) in the specialized task of radiology report generation across three low-resource languages: Italian, German, and Spanish. Employing the LLaVA architectural framework, we conducted a systematic evaluation of pre-trained models utilizing general datasets, domain-specific datasets, and low-resource language-specific datasets. In light of the unavailability of models that possess prior knowledge of both the medical domain and low-resource languages, we analyzed various…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Biomedical Text Mining and Ontologies
