ChatGPT versus UpToDate in Preclinical Medical Education: Cross-Sectional Analysis Using Term Frequency–Inverse Document Frequency Cosine Similarity
Shankar S Thiru, Nicholas E Aksu, Matthew Chiang, Daniel O Gallagher, Mary Furlong, Elizabeth R Prevou, Akhil Jay Khanna

TL;DR
This study compares ChatGPT and UpToDate responses to preclinical medical questions and finds that ChatGPT's answers align moderately with UpToDate's, most strongly in pharmacology.
Contribution
The study introduces a TF-IDF cosine similarity method to evaluate ChatGPT's alignment with evidence-based medical resources like UpToDate.
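As a rough illustration of the general technique, the sketch below computes TF-IDF cosine similarity between short texts using only the Python standard library. The tokenizer, the smoothed IDF formula, and the function names (`tokenize`, `tfidf_vectors`, `cosine_similarity`) are assumptions made for this example; the text above does not specify the study's preprocessing, weighting variant, or software.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens (assumed tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict of term -> weight) per document.

    Uses raw term counts and a smoothed IDF, log((1 + N) / (1 + df)) + 1.
    This weighting is an assumption, not necessarily the study's choice.
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))  # count each term once per document
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    return [
        {t: count * idf[t] for t, count in Counter(tokens).items()}
        for tokens in tokenized
    ]


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Hypothetical texts, not taken from the study.
    chatbot_answer = "Metformin lowers hepatic glucose production."
    reference_text = "Metformin reduces hepatic glucose production and improves insulin sensitivity."
    vec_answer, vec_reference = tfidf_vectors([chatbot_answer, reference_text])
    print(round(cosine_similarity(vec_answer, vec_reference), 3))
```

A score of 1 means the two texts have identical weighted term profiles, and 0 means they share no terms. Because TF-IDF treats each text as a bag of words, the score reflects vocabulary overlap rather than factual agreement.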
Findings
ChatGPT showed significant similarity to UpToDate in 59.3% of preclinical questions.
Pharmacology had the highest mean cosine similarity (0.338) among subjects.
Similarity scores exceeded those obtained with randomized text, indicating nonrandom alignment.
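One way to picture a randomized-text comparison is sketched below. It is an assumption-laden illustration, not the study's procedure: the randomized text is built by replacing every token of a response with a word drawn uniformly from the pooled vocabulary, and the helper names (`tfidf_cosine`, `randomized_text`, `baseline_gap`) are hypothetical. Shuffling word order alone would not work as a baseline here, because TF-IDF ignores word order.

```python
import math
import random
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens (assumed tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_cosine(text_a, text_b, corpus):
    """Cosine similarity of two texts under smoothed TF-IDF weights fit on corpus."""
    doc_sets = [set(tokenize(d)) for d in corpus]
    n_docs = len(doc_sets)

    def idf(term):
        df = sum(term in s for s in doc_sets)
        return math.log((1 + n_docs) / (1 + df)) + 1.0

    vec_a = {t: c * idf(t) for t, c in Counter(tokenize(text_a)).items()}
    vec_b = {t: c * idf(t) for t, c in Counter(tokenize(text_b)).items()}
    dot = sum(w * vec_b.get(t, 0.0) for t, w in vec_a.items())
    norm = math.sqrt(sum(w * w for w in vec_a.values())) * math.sqrt(
        sum(w * w for w in vec_b.values())
    )
    return dot / norm if norm else 0.0


def randomized_text(text, vocabulary, rng):
    """Replace each token with a uniformly drawn vocabulary word (assumed scheme)."""
    return " ".join(rng.choice(vocabulary) for _ in tokenize(text))


def baseline_gap(response, reference, corpus, n_draws=200, seed=0):
    """Return (observed similarity, mean similarity of randomized responses)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    vocabulary = sorted({t for d in corpus for t in tokenize(d)})
    observed = tfidf_cosine(response, reference, corpus)
    draws = [
        tfidf_cosine(randomized_text(response, vocabulary, rng), reference, corpus)
        for _ in range(n_draws)
    ]
    return observed, sum(draws) / len(draws)
```

If the observed similarity sits well above the randomized mean, the overlap is unlikely to be an artifact of shared common vocabulary. A formal comparison would also need a statistical test, which this sketch omits.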
Abstract
Generative artificial intelligence tools such as ChatGPT are increasingly used by medical students for self-directed learning. Although these models demonstrate linguistic fluency, their reliability as supplementary resources for preclinical education remains uncertain. In particular, comparisons with evidence-based references such as UpToDate are lacking. This study evaluated the similarity between responses generated by ChatGPT (with GPT-4o mini) and those from UpToDate to preclinical medical education questions to assess ChatGPT’s potential as an adjunctive learning tool. We conducted a cross-sectional comparison study using 150 first-order questions derived from a preclinical question bank at a single allopathic institution under the oversight of a medical educator with more than 25 years of teaching experience. Each question was entered into ChatGPT 10 times in separate chat…
Taxonomy
Topics: Artificial Intelligence in Healthcare and Education · Machine Learning in Healthcare · Radiomics and Machine Learning in Medical Imaging
