# Identifying Volar Locking Plates on Plain Radiographs: Can Artificial Intelligence Models ‘Beat’ Clinicians?

**Authors:** Allen Albert, Alex Nicholls, Duncan Avis, Medhat Zekry, Matay Arsan, Priyanshu Saha, Adam Stoneham

PMC · DOI: 10.7759/cureus.98715 · 2025-12-08

## TL;DR

This study compares how well ChatGPT 5 and five hand consultants identify the manufacturer of volar locking plates on plain radiographs, finding that the consultants perform significantly better.

## Contribution

The study evaluates the performance of ChatGPT 5 in identifying orthopedic implants, showing that clinicians outperform the AI model.

## Key findings

- ChatGPT 5 correctly identified the manufacturer in only 3 of 52 radiographs of volar locking plates (5.8%).
- Hand consultants had a mean accuracy of 30.8%, significantly higher than the AI model.
- Human consultants had 7.26 times higher odds of correct identification compared to ChatGPT 5.

## Abstract

Background

Hand surgeons are frequently required to identify volar locking plates on plain radiographs. This is important, for example, when planning what equipment is required for revision, implant removal or periprosthetic fractures, but it can be challenging, especially if surgery took place in another hospital or even another country. Artificial intelligence (AI) clearly has potential in medical image recognition, but its role in orthopaedic implant identification remains uncertain. This study compared the performance of an openly available AI model, ChatGPT 5, with that of experienced hand consultants.

Methods

Fifty-two radiographs of distal radius plates from 10 major implant manufacturers were obtained from open-access sources. An AI programme (ChatGPT 5) and five hand consultants independently identified the manufacturer for each radiograph. Accuracy was calculated for each rater. Pairwise comparisons between AI and each consultant were assessed using McNemar’s test, and a standard logistic regression with clustered standard errors was fitted to compare AI with consultants as a group.
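
As a minimal sketch of the pairwise comparison described above, McNemar’s test on paired ratings depends only on the discordant radiographs: those one rater identified correctly and the other did not. The abstract does not state whether the exact or the chi-square variant was used, so the exact binomial form below is an assumption, and the function name `mcnemar_exact` and the counts are hypothetical illustrations rather than data from the study.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value from the two discordant counts.

    b: cases rater A got right and rater B got wrong
    c: cases rater B got right and rater A got wrong
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    k = min(b, c)
    # Under the null hypothesis each discordant pair is a fair coin flip
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical discordant counts, NOT taken from the study
consultant_only_correct = 6
ai_only_correct = 2
print(f"{mcnemar_exact(consultant_only_correct, ai_only_correct):.3f}")  # 0.289
```

Radiographs that both raters got right, or both got wrong, do not enter the calculation, which is why a small accuracy gap on 52 images can fail to reach significance.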

Results

ChatGPT 5 correctly identified just 3 of 52 radiographs (5.8% accuracy). Consultant accuracies ranged from 13.5% to 46.2% (mean 30.8% ± 11.1). McNemar’s test showed that Consultants 1, 3, 4 and 5 significantly outperformed the AI (p < 0.01), while Consultant 2 did not (p = 0.289). In a standard logistic regression with clustered standard errors, the human cohort had 7.26 times higher odds of correct identification compared with the AI (OR 7.26, 95% CI 2.27-23.18, p < 0.001).
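
The reported odds ratio can be sanity-checked from the pooled counts. With a single binary predictor (human versus AI), the logistic regression point estimate equals the crude odds ratio; clustering affects only the standard errors and hence the confidence interval, which cannot be reproduced without the per-radiograph data. The sketch below assumes 80 correct consultant ratings out of 260, a figure inferred from the reported mean accuracy of 30.8% rather than stated in the abstract; `odds_ratio` is a hypothetical helper.

```python
def odds_ratio(correct_a: int, total_a: int, correct_b: int, total_b: int) -> float:
    """Odds of a correct answer in group A divided by the odds in group B."""
    odds_a = correct_a / (total_a - correct_a)
    odds_b = correct_b / (total_b - correct_b)
    return odds_a / odds_b


ai_correct, ai_total = 3, 52  # reported in the abstract

human_total = 5 * 52  # five consultants, 52 radiographs each
# Assumption: pooled correct count inferred from the 30.8% mean accuracy
human_correct = round(0.308 * human_total)

print(human_correct)  # 80
print(round(odds_ratio(human_correct, human_total, ai_correct, ai_total), 2))  # 7.26
```

The ratio of accuracies (30.8% versus 5.8%) is closer to 5.3, so the figure of 7.26 describes odds rather than the proportion correct.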

Conclusion

Identifying volar locking distal radius plates from plain radiographs remains difficult, even for experienced surgeons: even the ‘best’ consultant identified fewer than 50% correctly. That notwithstanding, the consultants had roughly seven times the odds of a correct identification compared with ChatGPT 5, which identified just over 5% correctly. While current non-specialised AI tools are not suitable for implant identification, dedicated AI models trained on curated orthopaedic datasets may hold promise for future clinical use.

## Full-text entities

- **Diseases:** fractures (MESH:D050723)
- **Species:** Homo sapiens (human, species) [taxon 9606]

---
Source: https://tomesphere.com/paper/PMC12777850