The OB-GYN Take on GPT: Objective Assessment of Artificial Intelligence Models in Patient Education
Lindsey Burleson, Ella Boardley, Alexandra LaShell, Anthony Shanks

TL;DR
This study compares how well three LLMs and ACOG FAQs answer common OB-GYN patient questions, and finds that LLMs can support patient education.
Contribution
The study is the first to assess LLMs specifically for OB-GYN patient education and compares them to ACOG resources.
Findings
CoPilot scored highest in expert evaluations for OB-GYN patient questions.
Evaluators often recommended ACOG FAQs or ChatGPT for clinical counseling.
LLMs show potential as tools for patient education in OB-GYN.
Abstract
Large language models (LLMs), a subset of generative artificial intelligence, have garnered significant attention in healthcare. While existing literature has examined the role of LLMs in medical education, primarily focusing on their ability to answer examination questions, few studies have evaluated their role in patient education, and none have specifically evaluated their use in OB-GYN. Using a standardized rubric, we assessed the responses of three LLMs and the American College of Obstetricians and Gynecologists (ACOG) to 10 clinical questions that broadly represent common patient queries in OB-GYN. CoPilot by Microsoft received higher scores from expert reviewers; however, evaluators were more likely to recommend ACOG…
Taxonomy
Topics: Artificial Intelligence in Healthcare and Education · Machine Learning in Healthcare · Clinical Reasoning and Diagnostic Skills
