Accuracy and reliability of Manus, ChatGPT, and Claude in case-based dental diagnosis
Ahmed A. Madfa, Abdullah F. Alshammari, Bassam A. Anazi, Yousef E. Alenezi, Khlood A. Alkurdi

TL;DR
This study compares the diagnostic accuracy and consistency of three AI models—ChatGPT, Claude, and Manus—in dental case scenarios.
Contribution
The study evaluates the performance of emerging AI platforms like Manus in dental diagnosis, an area previously underexplored.
Findings
Claude and Manus showed higher diagnostic accuracy (92.3%) than ChatGPT (76.9%) in dental scenarios.
Claude and Manus also demonstrated greater intra-model consistency compared to ChatGPT.
Despite numerical advantages, differences between models were not statistically significant.
Abstract
Artificial intelligence (AI), particularly large language models (LLMs), is transforming healthcare education and clinical decision-making. While models like ChatGPT and Claude have demonstrated utility in medical contexts, their performance in dental diagnostics remains underexplored; additionally, the potential of emerging platforms, like Manus, is yet to be evaluated. This study aimed to compare the diagnostic accuracy and consistency of ChatGPT, Claude, and Manus using authentic, case-based dental scenarios. A set of 117 multiple-choice questions based on validated clinical dental vignettes spanning various specialities was administered to each model under standardised conditions at two separate time points. Responses were scored against expert-validated answer keys. Inter-rater reliability was assessed using Cohen's kappa, and statistical comparisons were made using the chi-square, McNemar,…
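As a rough illustration of two of the statistics named in the abstract, the sketch below computes Cohen's kappa and an exact McNemar test in plain Python. It is not the authors' analysis code. It assumes that kappa is computed between one model's answers at the two time points and that McNemar's test is applied to paired correct/incorrect flags for two models on the same questions. The function names `cohen_kappa` and `mcnemar_exact` and all data in the usage lines are hypothetical.

```python
from collections import Counter
from math import comb


def cohen_kappa(first, second):
    """Cohen's kappa for two equally long sequences of categorical labels."""
    if len(first) != len(second) or not first:
        raise ValueError("sequences must be non-empty and equally long")
    n = len(first)
    # Observed agreement: share of items with identical labels.
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Chance agreement: product of marginal label frequencies, summed over labels.
    counts_1, counts_2 = Counter(first), Counter(second)
    expected = sum(counts_1[k] * counts_2[k] for k in counts_1) / (n * n)
    if expected == 1.0:
        return 1.0  # both sequences consist of one identical label
    return (observed - expected) / (1 - expected)


def mcnemar_exact(correct_a, correct_b):
    """Two-sided exact McNemar p-value from paired correct/incorrect flags."""
    # Discordant pairs: only A correct (b) and only B correct (c).
    b = sum(x and not y for x, y in zip(correct_a, correct_b))
    c = sum(y and not x for x, y in zip(correct_a, correct_b))
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    k = min(b, c)
    # Binomial(n, 0.5) tail probability, doubled for a two-sided test.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical toy data, not taken from the study.
answers_t1 = ["A", "B", "C", "A", "D", "B"]
answers_t2 = ["A", "B", "C", "B", "D", "B"]
print(round(cohen_kappa(answers_t1, answers_t2), 3))

model_a_correct = [True, True, True, True, False, True]
model_b_correct = [True, False, True, False, False, False]
print(mcnemar_exact(model_a_correct, model_b_correct))
```

With few discordant pairs the exact test has little power, which is one way a sizeable gap in raw accuracy can still fail to reach statistical significance.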
Taxonomy
Topics: Artificial Intelligence in Healthcare and Education · Clinical Reasoning and Diagnostic Skills · Dental Radiography and Imaging
