Comparing Real and ChatGPT-Generated Radiographs for Training Deep Learning Models to Diagnose Knee Osteoarthritis
Rohan R Datir, Akshay Reddy, Yash Bhatia, Niall Baig, Roman Barrozo, Vinayak Sathe

TL;DR
This study compares AI models trained on real and ChatGPT-generated knee X-rays for diagnosing osteoarthritis, finding that synthetic images can help but are not enough on their own.
Contribution
The novel contribution is evaluating ChatGPT-generated radiographs as a supplement to real data for training AI in knee osteoarthritis diagnosis.
Findings
Models trained on real radiographs (Model B) and a mix of real and synthetic images (Model C) outperformed those trained solely on synthetic data (Model A).
Model C showed slightly better discrimination (AUROC 0.782) than Model B (0.758), though confidence intervals overlapped.
Adding synthetic images improved grade-specific sensitivity, but the improvement was not statistically significant after adjustment.
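The summary does not say which statistical adjustment was applied to the grade-specific comparisons. As a minimal sketch of how such an adjustment can turn nominally significant results into non-significant ones, the following assumes a Holm-Bonferroni correction; the function name `holm_adjust` and the p-values are hypothetical and are not taken from the paper.

```python
def holm_adjust(p_values):
    """Return Holm-Bonferroni adjusted p-values, in the original order.

    Assumption: the paper's adjustment method is not stated in this summary;
    Holm-Bonferroni is used here only as one common choice.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is multiplied by (m - k + 1).
        candidate = min(1.0, (m - rank) * p_values[i])
        # Enforce monotonicity: an adjusted p-value never drops below an earlier one.
        running_max = max(running_max, candidate)
        adjusted[i] = running_max
    return adjusted


# Hypothetical raw p-values, one per severity grade (illustrative only).
raw = [0.01, 0.04, 0.03, 0.20]
for p_raw, p_adj in zip(raw, holm_adjust(raw)):
    print(f"raw={p_raw:.2f}  adjusted={p_adj:.2f}  significant={p_adj < 0.05}")
```

In this illustrative input, three of the four raw p-values fall below 0.05, but only the smallest remains below 0.05 after adjustment, which is the kind of pattern the finding above describes.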
Abstract
Introduction: Osteoarthritis (OA) is a degenerative joint disease characterized by progressive cartilage loss, bone remodeling, and chronic pain. The growing global burden of OA motivates the evaluation of artificial intelligence (AI) approaches for automating radiographic diagnosis.

Purpose: This study aimed to compare AI models trained on real radiographs, ChatGPT-generated radiographs, and a combined dataset to assess whether synthetic imaging can improve OA detection.

Methods: Three binary classifiers were trained using knee radiographs: Model A (ChatGPT-generated images only), Model B (real images only), and Model C (real + synthetic). All models were developed using PyTorch in Google Colab and evaluated on 1,656 held-out real radiographs. Performance metrics (accuracy, sensitivity, specificity, precision, F1 score, and AUROC (area under the receiver operating characteristic))…
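The metrics named in the Methods can be computed from a binary classifier's predicted probabilities on a held-out set. The sketch below is illustrative only and is not the authors' evaluation code: the function names, the 0.5 decision threshold, and the toy labels and scores are assumptions. AUROC is computed here as the fraction of (positive, negative) pairs in which the positive case receives the higher score, with ties counted as one half.

```python
def binary_metrics(y_true, y_score, threshold=0.5):
    """Confusion-matrix metrics for labels in {0, 1} and scores in [0, 1].

    Assumption: a fixed 0.5 threshold; the paper's threshold is not given here.
    """
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def safe_div(num, den):
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    sensitivity = safe_div(tp, tp + fn)   # recall for the OA-positive class
    specificity = safe_div(tn, tn + fp)
    precision = safe_div(tp, tp + fp)
    return {
        "accuracy": safe_div(tp + tn, len(pairs)),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": safe_div(2 * precision * sensitivity, precision + sensitivity),
    }


def auroc(y_true, y_score):
    """AUROC as the probability that a positive case outscores a negative one."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    # Pairwise comparison is quadratic but simple; fine for a few thousand images.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical labels and scores, not study data).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
print(binary_metrics(labels, scores))
print("AUROC:", round(auroc(labels, scores), 3))
```

Unlike the threshold-based metrics, AUROC depends only on how the scores rank the cases, which is why it is commonly reported as a measure of discrimination alongside sensitivity and specificity.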
Taxonomy
Topics: Artificial Intelligence in Healthcare and Education · COVID-19 diagnosis using AI · Radiomics and Machine Learning in Medical Imaging
