# Assessing the Efficacy of Artificial Intelligence Platforms in Answering Dental Caries Multiple-Choice Questions: A Comparative Study of ChatGPT and Google Gemini Language Models

**Authors:** Amr Ahmed Azhari, Walaa Magdy Ahmed, Abdulaziz Alhamadani, Amal Alfaraj, Min Zhang, Chang-Tien Lu

PMC · DOI: 10.3390/dj14020072 · Dentistry Journal · 2026-01-27

## TL;DR

This study compared how accurately ChatGPT (version 3.5) and Google Gemini answer dental caries multiple-choice questions, finding that Gemini scored significantly higher at all seven simulated exam lengths (25 to 85 questions; p < 0.001).

## Contribution

The study contributes a head-to-head evaluation of two LLMs under a simulated student examination framework: 1400 randomized exams, drawn from 125 validated dental caries MCQs, at seven examination lengths.

## Key findings

- Gemini significantly outperformed ChatGPT in all seven examination formats (p < 0.001).
- Gemini had higher passing rates and mean scores across all question counts.
- Both models showed significant score variation with increasing exam length (p < 0.05).

## Abstract

**Objective:** This study aimed to compare the accuracy of two large language models (LLMs), ChatGPT (version 3.5) and Google Gemini (formerly Bard), in answering dental caries-related multiple-choice questions (MCQs) using a simulated student examination framework across seven examination lengths.

**Materials and Methods:** A total of 125 validated dental caries MCQs were extracted from Dental Decks and Oxford University Press question banks. Seven examination groups were constructed with varying question counts (25, 35, 45, 55, 65, 75, and 85 questions). For each group, 100 simulations were generated per LLM (ChatGPT and Gemini), resulting in 1400 simulated examinations. Each simulated student received a unique randomized subset of questions. MCQs were answered by each LLM using a standardized prompt to minimize ambiguity. Outcomes included mean score, passing rate (≥60%), and performance differences between LLMs. Statistical analyses included independent t-tests, one-way ANOVA within each LLM, and two-way ANOVA examining interactions between LLM type and question count.

**Results:** Across all seven examination formats, Gemini significantly outperformed ChatGPT (p < 0.001). Gemini achieved higher passing rates and higher mean scores in every examination length. One-way ANOVA revealed significant score variation with increasing exam length for both LLMs (p < 0.05). Two-way ANOVA demonstrated significant main effects of LLM type and question count, with no significant interaction. Randomization had no measurable effect on Gemini performance but influenced ChatGPT scores.

**Conclusions:** Gemini demonstrated superior accuracy and higher passing rates compared to ChatGPT in all simulated examination formats. While both LLMs struggled with complex caries-related content, Gemini provided more reliable performance across question quantities. Educators should exercise caution in relying on LLMs for automated assessment or self-study, and future research should evaluate human–AI hybrid models and LLM performance across broader dental domains.
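
The simulation design in the abstract (7 examination lengths × 100 simulated students × 2 LLMs = 1400 exams, each a random subset of the 125-question bank, with a pass mark of 60%) can be sketched roughly as below. This is a minimal illustration and not the authors' code. It assumes each model's answer to each bank question is recorded once as correct or incorrect and that simulated exams resample those records; the abstract does not say whether the models were queried again for every exam. The function names, the per-question accuracies of 0.70 and 0.55, and the pooled-variance form of the t statistic are all illustrative assumptions, not values or choices taken from the paper.

```python
import random
import statistics

BANK_SIZE = 125                               # validated MCQs in the pooled bank
EXAM_LENGTHS = (25, 35, 45, 55, 65, 75, 85)   # seven examination groups
SIMS_PER_GROUP = 100                          # simulated students per group per LLM
PASS_MARK = 0.60                              # passing threshold (>= 60%)


def simulate_group(correct, n_questions, rng, n_sims=SIMS_PER_GROUP):
    """Score n_sims simulated exams, each a random subset of the bank.

    correct[i] is True when the model answered bank question i correctly.
    Returns a list of fractional scores in [0, 1].
    """
    scores = []
    for _ in range(n_sims):
        # Draw question indices without replacement for one simulated student.
        subset = rng.sample(range(len(correct)), n_questions)
        scores.append(sum(correct[i] for i in subset) / n_questions)
    return scores


def summarize(scores):
    """Return (mean score, passing rate) for one group of simulated exams."""
    mean = statistics.fmean(scores)
    pass_rate = sum(s >= PASS_MARK for s in scores) / len(scores)
    return mean, pass_rate


def student_t(a, b):
    """Pooled-variance independent two-sample t statistic (assumed variant)."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * statistics.variance(a)
              + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    return (statistics.fmean(a) - statistics.fmean(b)) / (pooled * (1 / na + 1 / nb)) ** 0.5


if __name__ == "__main__":
    rng = random.Random(0)
    # HYPOTHETICAL answer records: the accuracies below are made up for the
    # demo and are not results reported in the paper.
    model_a = [rng.random() < 0.70 for _ in range(BANK_SIZE)]
    model_b = [rng.random() < 0.55 for _ in range(BANK_SIZE)]

    for n in EXAM_LENGTHS:
        scores_a = simulate_group(model_a, n, rng)
        scores_b = simulate_group(model_b, n, rng)
        mean_a, pass_a = summarize(scores_a)
        mean_b, pass_b = summarize(scores_b)
        t = student_t(scores_a, scores_b)
        print(f"{n:>2} questions: A mean={mean_a:.3f} pass={pass_a:.2f} | "
              f"B mean={mean_b:.3f} pass={pass_b:.2f} | t={t:.2f}")
```

Sampling without replacement keeps every simulated exam free of repeated questions, which matches the "unique randomized subset" wording. Under this sketch, longer exams cover more of the fixed bank, so scores cluster more tightly around the model's overall accuracy as question count rises. The ANOVA steps are omitted; they would normally be run with a statistics package.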

## Linked entities

- **Diseases:** dental caries (MONDO:0005276)

## Full-text entities

- **Diseases:** bacterial infection (MESH:D001424), Dental Caries (MESH:D003731), hallucination (MESH:D006212), LLMs (MESH:D007806), injury to (MESH:D014947), tooth loss (MESH:D016388)
- **Chemicals:** ChatGPT (-)
- **Species:** Homo sapiens (human, species) [taxon 9606], Bacteria Latreille et al. 1825 (Bacteria stick insect, genus) [taxon 629395], Lactobacillus acidophilus (species) [taxon 1579], Veillonella parvula (species) [taxon 29466], Streptococcus mutans (species) [taxon 1309], Actinomyces naeslundii (species) [taxon 1655]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12939131/full.md

## Figures

9 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12939131/full.md

## References

55 references — full list in the complete paper: https://tomesphere.com/paper/PMC12939131/full.md

---
Source: https://tomesphere.com/paper/PMC12939131