Recognizing Pneumonia in Real-World Chest X-rays with a Classifier Trained with Images Synthetically Generated by Nano Banana
Jiachuan Peng, Kyle Lam, Jianing Qiu

TL;DR
This study shows that a pneumonia classifier trained solely on synthetic chest X-ray images generated by Nano Banana can reach AUROCs of 0.923 and 0.824 on two real-world datasets, highlighting the potential of synthetic data in medical AI.
Contribution
The paper trains a pneumonia classifier entirely on synthetic chest X-ray images generated by Nano Banana and applies it directly to real-world CXRs, achieving strong performance without any real training data.
Findings
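AUROC of 0.923 (95% CI: 0.919 - 0.927) and AUPR of 0.900 (95% CI: 0.894 - 0.907) on the 2018 RSNA Pneumonia Detection dataset (14,863 CXRs)
AUROC of 0.824 (95% CI: 0.810 - 0.836) and AUPR of 0.913 (95% CI: 0.904 - 0.922) on the Chest X-Ray dataset (5,856 CXRs)
External validation on real-world data demonstrates the feasibility of training on synthetic data for medical AI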
Abstract
We trained a classifier with synthetic chest X-ray (CXR) images generated by Nano Banana, the latest AI model for image generation and editing, released by Google. When applied directly to real-world CXRs after being trained only on synthetic data, the classifier achieved an AUROC of 0.923 (95% CI: 0.919 - 0.927) and an AUPR of 0.900 (95% CI: 0.894 - 0.907) in recognizing pneumonia in the 2018 RSNA Pneumonia Detection dataset (14,863 CXRs), and an AUROC of 0.824 (95% CI: 0.810 - 0.836) and an AUPR of 0.913 (95% CI: 0.904 - 0.922) in the Chest X-Ray dataset (5,856 CXRs). These external validation results on real-world data demonstrate the feasibility of this approach and suggest potential for synthetic data in medical AI development. Nonetheless, several limitations remain at present, including challenges in prompt design for controlling the diversity of synthetic CXR data and the…
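The abstract reports AUROC and AUPR with 95% confidence intervals. As a rough illustration of how such figures can be computed from a classifier's scores, here is a minimal pure-Python sketch. It is not the authors' evaluation code: the percentile bootstrap is an assumed CI method (the excerpt does not say how the intervals were obtained), AUPR is approximated by average precision, and the function names and toy data are hypothetical.

```python
import random

def auroc(labels, scores):
    """Rank-based AUROC (Mann-Whitney U), using average ranks for tied scores."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def average_precision(labels, scores):
    """Average precision: mean of precision at each positive, by descending score.
    Simplification: tied scores are ordered arbitrarily by the sort."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    hits, total = 0, 0.0
    for rank, idx in enumerate(order, start=1):
        if labels[idx] == 1:
            hits += 1
            total += hits / rank
    return total / hits

def bootstrap_ci(metric, labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI (assumed method, not confirmed by the paper)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            stats.append(metric(ys, [scores[i] for i in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# Toy data (hypothetical): 1 = pneumonia, 0 = no pneumonia.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auroc = auroc(toy_labels, toy_scores)
toy_ap = average_precision(toy_labels, toy_scores)
toy_ci = bootstrap_ci(auroc, toy_labels, toy_scores)
```

On datasets of the sizes reported above, the same functions would take one label and one predicted score per CXR; a library implementation would normally be preferable to this hand-rolled version.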
Taxonomy
Topics: COVID-19 diagnosis using AI · AI in cancer detection · Cell Image Analysis Techniques
