# Comparing large language models and search engine responses to common orthodontic questions

**Authors:** Yuanyuan Ren, Jing Sun

PMC · DOI: 10.1371/journal.pone.0339908 · PLOS One · 2026-01-02

## TL;DR

This study compares how well large language models and search engines answer common orthodontic questions, finding that language models score higher on quality, empathy, readability, and satisfaction.

## Contribution

The study is the first to evaluate LLMs and search engines for orthodontic questions using a multidimensional evaluation framework.

## Key findings

- LLMs scored higher than search engines in quality, empathy, readability, and satisfaction.
- LLM responses were rated significantly higher than search engine responses in the therapeutic outcomes, appliance selection, and cost categories.
- GPT-4o outperformed other models and search engines in the evaluation.

## Abstract

Large language models (LLMs) show potential for supporting patient education and self-management. Their performance in answering orthodontic questions has yet to be explored.

This study aims to compare the quality, empathy, readability, and satisfaction of responses from LLMs and search engines on common orthodontic questions.

Forty-five common orthodontic questions (in six categories) and a prompt were developed, and a self-designed multidimensional evaluation questionnaire was constructed. The questions were presented to 5 LLMs and 3 search engines on December 22, 2024. The primary outcomes were the median expert-rated scores of LLM versus search engine responses on quality, empathy, readability, and satisfaction, using 5- or 10-point Likert scales.
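As a rough illustration of how median Likert ratings for two groups might be compared, the sketch below computes each group's median and a Mann-Whitney U statistic. The abstract does not name the statistical test the authors used, so the rank-based test here is an assumption (it is a common choice for ordinal data). The scores and the `mann_whitney_u` helper are hypothetical placeholders, not data or code from the study.

```python
from statistics import median


def mann_whitney_u(a, b):
    """Return the U statistic for sample a versus sample b.

    Counts the pairs (x, y) with x > y, scoring ties as one half.
    """
    u = 0.0
    for x in a:
        for y in b:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u


# Hypothetical placeholder ratings on a 5-point Likert scale.
# These are NOT the study's data.
llm_scores = [4, 5, 3, 4, 5, 4]
search_scores = [3, 2, 4, 3, 3, 2]

llm_median = median(llm_scores)
search_median = median(search_scores)
u_stat = mann_whitney_u(llm_scores, search_scores)

print(f"LLM median: {llm_median}, search median: {search_median}")
print(f"U statistic: {u_stat} of {len(llm_scores) * len(search_scores)} pairs")
```

A U value close to the total number of pairs means that ratings in the first group are almost always higher than those in the second. Converting U to a p-value would require a reference distribution or a statistics library, which this sketch leaves out.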

LLMs scored significantly higher than search engines in quality (4.00 vs. 3.50, p < 0.001), empathy (3.75 vs. 3.50, p < 0.001), readability (4.00 vs. 3.75, p < 0.001), and satisfaction (8.00 vs. 7.25, p < 0.001). LLM-generated responses were also rated significantly higher than those from search engines in the therapeutic outcomes, appliance selection, and cost categories.

In this cross-sectional study, the LLMs, particularly GPT-4o, outperformed search engines. These results indicate the potential of LLMs as supplementary tools for orthodontic patient education and self-management.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12758715/full.md

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12758715/full.md

## References

45 references — full list in the complete paper: https://tomesphere.com/paper/PMC12758715/full.md

---
Source: https://tomesphere.com/paper/PMC12758715