# CBCT-Based Orthodontic Classification Using Commercial AI: Completeness and Accuracy in Independent Validation

**Authors:** Natalia Kazimierczak, Nora Sultani, Szymon Krzykowski, Zbigniew Serafin, Wojciech Kazimierczak

PMC · DOI: 10.3390/jcm15041637 · 2026-02-21

## TL;DR

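A commercial AI tool (Diagnocat) for CBCT-based orthodontic diagnosis proved unreliable in independent validation: it returned skeletal and vertical classifications for fewer than 10% of the 59 patients, and a Dental Angle class for only 57.6%.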

## Contribution

The study is the first to evaluate the Diagnocat platform's diagnostic reliability for CBCT-based orthodontic assessments.

## Key findings

- The AI platform generated skeletal and vertical classifications for only 3/59 (5%) and 1/59 (1.7%) patients, respectively.
- Agreement for overbite categorization was fair (κ = 0.324).
- The Dental Angle class was provided for 34/59 (57.6%) patients.
- Overall system usability was below 10% for skeletal parameters when 'N/A' outputs were considered failures.
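
The completeness figures above follow directly from the counts reported in the abstract (3, 1, and 34 usable outputs out of 59 patients). A minimal sketch of that arithmetic is given below; the dictionary and function names are illustrative and do not come from the paper.

```python
# Counts of patients with a usable (non-'N/A') output, as reported in the abstract.
N_PATIENTS = 59
OUTPUTS_GENERATED = {
    "sagittal skeletal class": 3,
    "vertical facial pattern": 1,
    "Dental Angle class": 34,
}


def completeness_pct(generated: int, total: int) -> float:
    """Share of patients with a usable output, in percent (one decimal)."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * generated / total, 1)


# Hypothetical helper name; 'N/A' outputs count as failures, so they are
# simply absent from the numerator.
completeness = {
    parameter: completeness_pct(count, N_PATIENTS)
    for parameter, count in OUTPUTS_GENERATED.items()
}

for parameter, pct in completeness.items():
    print(f"{parameter}: {pct}%")
```

At one decimal place, 3/59 comes out as 5.1%, which the paper reports rounded to 5%.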

## Abstract

Background/Objectives: Artificial intelligence (AI) tools for orthodontic diagnosis are increasingly used in clinical practice; however, there is limited evidence regarding their performance in CBCT-based assessments. In this study, we evaluated the diagnostic reliability of the Diagnocat platform for categorical orthodontic diagnoses obtained from CBCT examinations. Methods: Fifty-nine patients who underwent large-field CBCT (13 × 16 cm) and lateral cephalograms within 30 days were included, and CBCT scans were processed using Diagnocat (v1.0). The platform's categorical outputs (sagittal skeletal class, vertical facial pattern, overbite category, and Dental Angle class) were compared with manual cephalometric analyses performed by an experienced orthodontist (reference standard). Standard thresholds were used to convert the continuous reference measurements into categorical variables. Missing or "N/A" index test outputs were treated as diagnostic failures in accordance with STARD recommendations. Agreement was assessed using Cohen's kappa (κ), and the sensitivity, specificity, PPV, and NPV were calculated for Angle classification. Results: The AI platform generated skeletal and vertical classifications in only 3/59 (5%) and 1/59 (1.7%) patients, respectively. Agreement was fair (κ = 0.324) for overbite categorization, and the Dental Angle class was provided for 34/59 (57.6%) patients. When "N/A" results were treated as diagnostic failures, the overall system usability was <10% for skeletal parameters. Conclusions: The platform demonstrated insufficient diagnostic reliability and failed to generate outputs for most patients. While the specificities for generated diagnoses were acceptable, the low data completeness rate renders the tool currently unsuitable for independent clinical decision-making.
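
The agreement statistics named in the Methods can be sketched as follows. This is an illustration under stated assumptions, not the authors' analysis code: the labels are hypothetical toy data (not study data), "N/A" is modelled as its own category that can never match the reference, and an "N/A" for a reference-positive patient counts as a false negative. The abstract does not specify these implementation details.

```python
from collections import Counter


def cohen_kappa(reference, index_test):
    """Cohen's kappa: (observed - expected agreement) / (1 - expected agreement)."""
    if not reference or len(reference) != len(index_test):
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(reference)
    observed = sum(r == t for r, t in zip(reference, index_test)) / n
    ref_counts, test_counts = Counter(reference), Counter(index_test)
    categories = set(ref_counts) | set(test_counts)
    # Chance agreement from the marginal frequencies of both raters.
    expected = sum(ref_counts[c] * test_counts[c] for c in categories) / n**2
    if expected == 1.0:
        raise ValueError("kappa is undefined when expected agreement is 1")
    return (observed - expected) / (1 - expected)


def one_vs_rest_metrics(reference, index_test, positive):
    """Sensitivity, specificity, PPV and NPV for one class against all others."""
    tp = fp = tn = fn = 0
    for ref, test in zip(reference, index_test):
        ref_pos, test_pos = ref == positive, test == positive
        if ref_pos and test_pos:
            tp += 1
        elif ref_pos:
            fn += 1  # includes 'N/A' outputs (assumption: failure = missed case)
        elif test_pos:
            fp += 1
        else:
            tn += 1

    def ratio(num, den):
        return num / den if den else None  # None when the denominator is empty

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Hypothetical toy labels for Angle class; NOT data from the study.
toy_reference = ["I", "I", "II", "II", "III", "I"]
toy_ai_output = ["I", "N/A", "II", "I", "N/A", "I"]

toy_kappa = cohen_kappa(toy_reference, toy_ai_output)
toy_metrics = one_vs_rest_metrics(toy_reference, toy_ai_output, positive="I")
print(f"kappa = {toy_kappa:.3f}")
print(toy_metrics)
```

Keeping "N/A" as a category, rather than dropping those patients, is what lets missing outputs lower the agreement estimate instead of silently shrinking the sample.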

## Full-text entities

- **Diseases:** Angle's malocclusion (MESH:D008310), III (MESH:C537189), cleft lip/palate (MESH:D002971), II (MESH:C537730), Overbite (MESH:D057887), craniofacial pathologies (MESH:D005598), Class II and Class III malocclusions (MESH:D008313), injury to (MESH:D014947), AI (MESH:C538142), craniofacial syndromes (MESH:C565118)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12941958/full.md

---
Source: https://tomesphere.com/paper/PMC12941958