# Evaluation of Artificial Intelligence's Ability to Explore Literature on Orthopedic Navigation and Related Surgical Anatomy

**Authors:** Dimitrios Chytas, Angelo V Vasiliadis, Ethan Choucroun, Tanisha Naresh Chindore, Taha Ouhenach, Derin Eva Sadiq, Michael-Alexander Malahias

PMC · DOI: 10.7759/cureus.94768 · Cureus · 2025-10-17

## TL;DR

This study tested how well two AI tools, ChatGPT and ScholarGPT, can find and summarize research on orthopedic navigation and related surgical anatomy.

## Contribution

The study evaluates the performance of ChatGPT and ScholarGPT in identifying and summarizing orthopedic literature, revealing their limitations and biases.

## Key findings

- ChatGPT identified studies well across all three queries, but only 40%, 60%, and 60% of its summaries were accurate.
- ScholarGPT correctly identified 60%, 40%, and 0% of studies across the three queries and accurately summarized 0%, 20%, and 0%.
- Both AI tools showed a bias toward augmented reality-based navigation.
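
Every percentage reported in the abstract is a multiple of 20%, which is consistent with each tool being asked for five studies per query. As a rough illustration, the sketch below converts the reported percentages into study counts. The five-per-query denominator is an assumption read from the Methods wording, and the names `STUDIES_PER_QUERY`, `reported`, and `percent_to_count` are hypothetical, introduced only for this example. ChatGPT's identification score is left out because the abstract describes it only as excellent, without a figure.

```python
# Assumption: each query asked each tool for five studies, so every
# reported percentage maps to a whole number of studies out of five.
STUDIES_PER_QUERY = 5

# Percentages as reported in the abstract, in query order (1, 2, 3).
reported = {
    ("ChatGPT", "summarization"): [40, 60, 60],
    ("ScholarGPT", "identification"): [60, 40, 0],
    ("ScholarGPT", "summarization"): [0, 20, 0],
}


def percent_to_count(percent, total=STUDIES_PER_QUERY):
    """Convert a reported percentage into a number of studies out of `total`."""
    count, remainder = divmod(percent * total, 100)
    if remainder:
        raise ValueError(f"{percent}% is not a whole number of {total} studies")
    return count


# Study counts per query for each (tool, task) pair.
counts = {
    key: [percent_to_count(p) for p in values]
    for key, values in reported.items()
}

for (tool, task), per_query in counts.items():
    print(f"{tool} {task}: {per_query} of {STUDIES_PER_QUERY} studies per query")
```

Under that assumption, a 20% difference between two scores corresponds to a single study, so the per-query figures are best read as small counts rather than precise rates.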

## Abstract

Introduction

Artificial intelligence has recently garnered increased research interest in orthopedics; yet, its role in the exploration of orthopedic literature remains unknown. We aimed to evaluate the ability of ChatGPT and ScholarGPT to identify and outline literature on orthopedic navigation and the visualization of related surgical anatomy.

Methods

We asked ChatGPT and ScholarGPT to list and summarize five studies for each of three queries: 1) studies about augmented reality-based navigation in orthopedic surgery, 2) studies about how well augmented reality-based navigation enabled anatomical accuracy in orthopedic surgery, and 3) studies that compared augmented reality-based navigation with another navigation technique in orthopedic surgery. For each query, we evaluated how many studies were correctly identified and accurately summarized.

Results

ChatGPT scored excellently in identifying studies across all three queries. However, its accuracy in summarizing these studies was 40%, 60%, and 60%, respectively. In contrast, ScholarGPT’s identification scores were 60%, 40%, and 0%, and its summarization scores were 0%, 20%, and 0%, respectively. Both platforms exhibited bias in favor of augmented reality-based navigation across all queries.

Conclusion

ChatGPT and ScholarGPT are not yet able to provide researchers with reliable data from the literature on orthopedic navigation and related surgical anatomy. Ongoing artificial intelligence development may strengthen these platforms’ potential to play a more significant role in orthopedic research.

## Full-text entities

- **Diseases:** fracture (MESH:D050723), trauma (MESH:D014947), HTO (MESH:D020429), anterior cruciate ligament (MESH:D000070598), blood loss (MESH:D016063), tumor (MESH:D009369), total hip arthroplasty (MESH:D025981), neurological deficits (MESH:D009461), spinal deformity (MESH:D013122), Tibial Plateau Fractures (MESH:D000092463), infection (MESH:D007239), perforations (MESH:D057112), pelvic fracture (MESH:D034161), Musculoskeletal Disorders (MESH:D009140)
- **Chemicals:** ARSN (-)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12619905/full.md

## References

30 references — full list in the complete paper: https://tomesphere.com/paper/PMC12619905/full.md

---
Source: https://tomesphere.com/paper/PMC12619905