# Consistency evaluation and performance optimization of deep learning-based auto-contouring for nasopharyngeal carcinoma

**Authors:** Linghui Yan, Yuhao Lin, Zirong Li, Liuling Wang, Xiaoting Lin, Jianming Ding, Zixuan Leng, Qichao Zhou, Chuanben Chen, Zhaodong Fei

PMC · DOI: 10.1038/s41598-025-33567-6 · Scientific Reports · 2025-12-23

## TL;DR

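This paper evaluates four deep learning auto-contouring models on 30 nasopharyngeal carcinoma patients (22 anatomical structures plus the GTV), finds poor inter-model consistency and marked deviations from manual delineation for several structures, and proposes two model frameworks to make automatic contouring more robust.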

## Contribution

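The study benchmarks four DL auto-contouring models (AccuContour, RT-Viewer-contour, RT-Mind, and PVmed Contouring) for consistency and accuracy, and introduces two model frameworks intended to improve the accuracy and reliability of automatic contouring in radiotherapy.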

## Key findings

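- The four DL models showed poor consistency with one another in delineating the GTV, pituitary gland, temporal lobes, and temporomandibular joints.
- The proposed frameworks improved the contours of the GTV, brainstem, eyes, lens, and temporomandibular joints.
- DL-generated contours still differ markedly from manual delineations by oncologists for the GTV, lens, optic nerves, pituitary gland, temporomandibular joints, temporal lobes, and trachea.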
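
The consistency finding above concerns how well the four models agree with one another. As a rough illustration, the generalized conformity index is commonly defined as the sum of pairwise intersection volumes divided by the sum of pairwise union volumes across all contours of a structure. The sketch below assumes each model's contour is available as a binary mask on a common voxel grid; the function name and the toy masks are hypothetical and not taken from the paper.

```python
from itertools import combinations

import numpy as np


def generalized_conformity_index(masks):
    """Sum of pairwise intersections over sum of pairwise unions.

    `masks` is a sequence of same-shaped binary arrays, one per model.
    Returns NaN when every mask is empty.
    """
    masks = [np.asarray(m, dtype=bool) for m in masks]
    intersection_total = 0
    union_total = 0
    for a, b in combinations(masks, 2):
        intersection_total += int(np.logical_and(a, b).sum())
        union_total += int(np.logical_or(a, b).sum())
    if union_total == 0:
        return float("nan")
    return intersection_total / union_total


# Toy 1-D "contours" from three hypothetical models.
toy_masks = [
    np.array([1, 1, 0, 0]),
    np.array([1, 1, 1, 0]),
    np.array([0, 1, 1, 0]),
]
ci_gen = generalized_conformity_index(toy_masks)
```

A value of 1 means all contours coincide exactly, and values near 0 mean almost no shared volume. Small structures such as the pituitary gland tend to score low because a shift of a few voxels removes most of the overlap.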

## Abstract

Contour delineation is crucial to both the efficacy and the side effects of radiotherapy (RT), but it inevitably involves inter-observer variability (IOV). Deep learning (DL) models have been used to assist in contour delineation, but further evaluation is needed to guide healthcare professionals in the judicious application of DL models. The contours of 22 anatomical structures and the gross tumor volume (GTV) for 30 patients with nasopharyngeal carcinoma were delineated using four DL models: AccuContour, RT-Viewer-contour, RT-Mind, and PVmed Contouring. The overall kappa values and generalized conformity indices of these contours were calculated to assess consistency. The Dice similarity coefficient (DSC), Relative Volume Difference (RVD), 95th percentile Hausdorff Distance (HD95), and Average Symmetric Surface Distance (ASSD) were calculated to evaluate the accuracy of the contours. Additionally, two innovative model frameworks were introduced to improve the fidelity and reliability of patient contour delineation. The consistency of the contours generated by the four DL models was poor for the GTV, pituitary gland, temporal lobes, and temporomandibular joints. Marked differences were still observed between the contours generated by the models and the manual delineations by oncologists for the GTV, lens, optic nerves, pituitary gland, temporomandibular joints, temporal lobes, and trachea. The proposed model frameworks can effectively optimize the contours of the GTV, brainstem, eyes, lens, and temporomandibular joints. The contours generated by DL models still have deficiencies when applied to nasopharyngeal carcinoma radiotherapy. To address this, two model frameworks were proposed to increase the robustness of automatic contouring.
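
For readers unfamiliar with the overlap metrics named in the abstract, here is a minimal sketch of DSC and RVD on binary masks. It assumes the model contour and the manual reference are voxel masks on the same grid, and it uses the signed convention RVD = (V_pred - V_ref) / V_ref; the paper may use a different sign or an absolute value. The function names and toy data are hypothetical. HD95 and ASSD are surface-distance metrics that additionally require surface extraction and voxel spacing, so they are not shown.

```python
import numpy as np


def dice_similarity(pred, ref):
    """DSC = 2 * |pred AND ref| / (|pred| + |ref|); NaN if both are empty."""
    pred = np.asarray(pred, dtype=bool)
    ref = np.asarray(ref, dtype=bool)
    denom = int(pred.sum()) + int(ref.sum())
    if denom == 0:
        return float("nan")
    return 2.0 * int(np.logical_and(pred, ref).sum()) / denom


def relative_volume_difference(pred, ref):
    """Signed RVD = (V_pred - V_ref) / V_ref (assumed convention)."""
    pred = np.asarray(pred, dtype=bool)
    ref = np.asarray(ref, dtype=bool)
    v_ref = int(ref.sum())
    if v_ref == 0:
        return float("nan")
    return (int(pred.sum()) - v_ref) / v_ref


# Toy example: a model contour shifted by one voxel from the reference.
model_mask = np.array([1, 1, 1, 0])
manual_mask = np.array([0, 1, 1, 1])
dsc = dice_similarity(model_mask, manual_mask)
rvd = relative_volume_difference(model_mask, manual_mask)
```

In this toy case the two volumes are equal, so RVD is 0 even though the contours are misaligned. This is why a volume metric is usually reported alongside overlap and surface-distance metrics rather than on its own.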

The online version contains supplementary material available at 10.1038/s41598-025-33567-6.

## Linked entities

- **Diseases:** nasopharyngeal carcinoma (MONDO:0015459)

## Full-text entities

- **Diseases:** tumor (MESH:D009369), nasopharyngeal carcinoma (MESH:D000077274)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12848034/full.md

## Figures

1 figure with captions in the complete paper: https://tomesphere.com/paper/PMC12848034/full.md

## References

2 references — full list in the complete paper: https://tomesphere.com/paper/PMC12848034/full.md

---
Source: https://tomesphere.com/paper/PMC12848034