CSEval: A Framework for Evaluating Clinical Semantics in Text-to-Image Generation
Robert Cronshaw, Konstantinos Vilouras, Junyu Yan, Yuning Du, Feng Chen, Steven McDonagh, Sotirios A. Tsaftaris

TL;DR
CSEval is a novel framework that uses language models to evaluate the clinical semantic accuracy of images generated from medical text prompts, addressing a gap in existing evaluation metrics.
Contribution
The paper introduces CSEval, a framework that uses language models to assess whether images generated from medical text prompts reflect the intended clinical semantics, such as anatomical location and pathology.
Findings
CSEval detects semantic inconsistencies missed by traditional metrics.
CSEval correlates well with expert clinical judgment.
CSEval provides a scalable, clinically meaningful complement to existing evaluation methods.
Abstract
Text-to-image generation has been increasingly applied in medical domains for various purposes such as data augmentation and education. Evaluating the quality and clinical reliability of these generated images is essential. However, existing methods mainly assess image realism or diversity, while failing to capture whether the generated images reflect the intended clinical semantics, such as anatomical location and pathology. In this study, we propose the Clinical Semantics Evaluator (CSEval), a framework that leverages language models to assess clinical semantic alignment between the generated images and their conditioning prompts. Our experiments show that CSEval identifies semantic inconsistencies overlooked by other metrics and correlates with expert judgment. CSEval provides a scalable and clinically meaningful complement to existing evaluation methods, supporting the safe adoption…
Taxonomy
Topics: Multimodal Machine Learning Applications · Generative Adversarial Networks and Image Synthesis · Artificial Intelligence in Healthcare and Education
