Comparing Artificial Intelligence and Obstetrics Residents in Answering Standardized Patient Questions Regarding Gestational Diabetes

Azam Faraji; Hossein Faramarzi; Mahsa Razeghi; Nasrin Asadi; Homeira Vafaei; Maryam Kasraeian

PMC · DOI: 10.7759/cureus.94662 · October 15, 2025

Open Access

TL;DR

This study compared three AI chatbots with eight residents in answering standardized patient questions about gestational diabetes. The chatbots were rated as more accurate than the residents, and GPT-4o and DeepSeek V3 0324 were also rated as more complete.

Contribution

Demonstrates that AI models outperform residents in answering gestational diabetes questions, suggesting potential for medical education and clinical support.

Findings

01. AI models had significantly higher accuracy than residents in answering GDM-related questions.

02. GPT-4o and DeepSeek V3 0324 showed significantly higher completeness scores than residents.

03. DeepSeek V3 0324 achieved the highest scores for both accuracy and completeness.

Abstract

Introduction: This study evaluated three artificial intelligence (AI) chatbots, GPT-3.5 (OpenAI, San Francisco, USA), GPT-4o (OpenAI, San Francisco, USA), and DeepSeek V3 0324 (DeepSeek AI, Beijing, China), against eight gynecology residents in answering standardized patient questions on gestational diabetes mellitus (GDM). The aim was to assess and compare the accuracy and completeness of the responses.

Methods: Twenty-four questions were answered by the three chatbots and the eight residents. Two faculty members independently rated the responses for accuracy and completeness using a 5-point scale. Independent-samples t-tests were used for statistical analysis.

Results: The mean accuracy scores were 3.64 for residents, 4.67 for GPT-3.5, 4.69 for GPT-4o, and 4.81 for DeepSeek V3…
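For readers unfamiliar with the independent-samples t-test named in the Methods, the following is a minimal sketch of how such a comparison between two groups of ratings could be computed. The abstract does not say whether a pooled-variance (Student) or unequal-variance (Welch) test was used; this sketch assumes the pooled form. The function name `independent_t` and both score lists are hypothetical illustrations and are not data from the study.

```python
from math import sqrt
from statistics import mean, variance


def independent_t(a, b):
    """Student's independent-samples t statistic (pooled variance).

    Returns the t statistic and its degrees of freedom.
    Assumption: equal variances in both groups; the paper's abstract
    does not state which variant of the test was applied.
    """
    na, nb = len(a), len(b)
    # Pooled sample variance, weighting each group by its degrees of freedom.
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    # Standard error of the difference between the two group means.
    se = sqrt(pooled * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / se, na + nb - 2


# Hypothetical 5-point ratings, NOT values from the study.
chatbot_scores = [5, 5, 4, 5, 4, 5]
resident_scores = [4, 3, 4, 3, 4, 3]

t_stat, dof = independent_t(chatbot_scores, resident_scores)
print(f"t = {t_stat:.2f}, degrees of freedom = {dof}")
```

A positive t statistic here means the first group has the higher mean rating. The p-value would then be read from a t distribution with the returned degrees of freedom, for example with a statistics package.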

Linked entities

Species (1)

Homo sapiens (human · species)

Diseases (2)

gestational diabetes mellitus · GDM

Taxonomy

Topics: Topic Modeling · Machine Learning in Healthcare · Artificial Intelligence in Healthcare and Education