# Deep Learning-Based Semantic Segmentation and Classification of Otoscopic Images for Otitis Media Diagnosis and Health Promotion

**Authors:** Chien-Yi Yang, Che-Jui Lee, Wen-Sen Lai, Kuan-Yu Chen, Chung-Feng Kuo, Chieh Hsing Liu, Shao-Cheng Liu

PMC · DOI: 10.3390/diagnostics16030467 · Diagnostics · 2026-02-02

## TL;DR

This paper presents an AI system that classifies otitis media from otoscopic images with 96.72% overall accuracy on a 122-image independent test set, with the aim of improving diagnostic consistency and supporting health screening.

## Contribution

A semi-supervised AI framework combining semantic segmentation and classification for automated otitis media diagnosis.

## Key findings

- U-Net achieved 96.76% pixel accuracy and a mean Dice similarity coefficient of 71.68% in segmenting tympanic membrane structures.
- On the independent test set, the framework reached 100% accuracy for normal and AOM cases and 91.3% for COM (96.72% overall).
- The system offers a fully automated and clinically interpretable diagnostic solution.
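The abstract reports both an overall pixel accuracy (96.76%) and a mean Dice similarity coefficient (71.68%) for U-Net. The sketch below uses the standard definitions of the two metrics on a toy label map to show how they can diverge: a small region that is missed entirely barely moves pixel accuracy but pulls the class-averaged Dice down sharply. The label maps, class indices, and function names are hypothetical and are not taken from the paper, whose evaluation code is not shown in this summary.

```python
# Illustrative only: toy 4x4 label maps, not data from the paper.

def pixel_accuracy(pred, truth):
    """Fraction of pixels whose predicted label equals the reference label."""
    total = sum(len(row) for row in truth)
    correct = sum(p == t for pr, tr in zip(pred, truth) for p, t in zip(pr, tr))
    return correct / total


def dice_per_class(pred, truth, num_classes):
    """Dice = 2*|P ∩ T| / (|P| + |T|) for each class present in pred or truth."""
    scores = {}
    for c in range(num_classes):
        pred_c = sum(p == c for row in pred for p in row)
        truth_c = sum(t == c for row in truth for t in row)
        overlap = sum(
            p == c and t == c
            for pr, tr in zip(pred, truth)
            for p, t in zip(pr, tr)
        )
        if pred_c + truth_c > 0:  # skip classes absent from both maps
            scores[c] = 2 * overlap / (pred_c + truth_c)
    return scores


def mean_dice(pred, truth, num_classes):
    """Unweighted average of the per-class Dice scores."""
    scores = dice_per_class(pred, truth, num_classes)
    return sum(scores.values()) / len(scores)


# Hypothetical classes: 0 = background, 1 = large region, 2 = small region.
truth = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 2],
]
# The prediction misses the single class-2 pixel.
pred = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]

print(pixel_accuracy(pred, truth))           # 0.9375 (15 of 16 pixels correct)
print(round(mean_dice(pred, truth, 3), 4))   # 0.6522 (class 2 scores 0.0)
```

Whether this effect explains the gap between the two figures reported for U-Net cannot be determined from the abstract alone; per-region Dice values would be needed.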

## Abstract

Background/Objectives: Otitis media (OM), including acute otitis media (AOM) and chronic otitis media (COM), is a common middle ear disease that can lead to significant morbidity if not accurately diagnosed. Otoscopic interpretation remains subjective and operator-dependent, underscoring the need for objective and reproducible diagnostic support. Recent advances in artificial intelligence (AI) offer promising solutions for automated otoscopic image analysis.

Methods: We developed an AI-based diagnostic framework consisting of three sequential steps: (1) semi-supervised learning for automatic recognition and semantic segmentation of tympanic membrane structures, (2) region-based feature extraction, and (3) disease classification. A total of 607 clinical otoscopic images were retrospectively collected, including normal ears (n = 220), AOM (n = 157), and COM with tympanic membrane perforation (n = 230). Among these, 485 images were used for training and 122 for independent testing. Semantic segmentation of five anatomically relevant regions was performed using multiple convolutional neural network architectures, including U-Net, PSPNet, HRNet, and DeepLabV3+. Following segmentation, color and texture features were extracted from each region and used to train a neural network-based classifier to differentiate disease states.

Results: Among the evaluated segmentation models, U-Net demonstrated superior performance, achieving an overall pixel accuracy of 96.76% and a mean Dice similarity coefficient of 71.68%. The segmented regions enabled reliable extraction of discriminative chromatic and texture features. In the final classification stage, the proposed framework achieved diagnostic accuracies of 100% for normal ears, 100% for AOM, and 91.3% for COM on the independent test set, with an overall accuracy of 96.72%.

Conclusions: This study demonstrates that a semi-supervised, segmentation-driven AI pipeline integrating feature extraction and classification can achieve high diagnostic accuracy for otitis media. The proposed framework offers a clinically interpretable and fully automated approach that may enhance diagnostic consistency, support clinical decision-making, and facilitate scalable otoscopic assessment in diverse healthcare screening settings for disease prevention and health education.
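The counts and percentages in the abstract can be cross-checked with a few lines of arithmetic. The sketch below confirms that the class counts and the train/test split both sum to 607 images, and that 96.72% of 122 test images corresponds to 118 correct predictions. The per-class test counts are not given in the abstract, so the second half assumes a class-proportional (stratified) split purely for illustration. Under that assumption the reported per-class accuracies reproduce the same total, but the true split may differ.

```python
# Figures stated in the abstract.
class_counts = {"normal": 220, "AOM": 157, "COM": 230}
n_train, n_test = 485, 122
reported_overall_pct = 96.72
reported_class_pct = {"normal": 100.0, "AOM": 100.0, "COM": 91.3}

total = sum(class_counts.values())
print(total)                        # 607
print(n_train + n_test == total)    # True

# Number of correct test predictions implied by the overall accuracy.
n_correct = round(reported_overall_pct / 100 * n_test)
print(n_correct)                              # 118
print(round(100 * n_correct / n_test, 2))     # 96.72

# ASSUMPTION (not from the paper): the test split is proportional to class size.
hyp_test_counts = {
    name: round(count * n_test / total) for name, count in class_counts.items()
}
print(hyp_test_counts)              # {'normal': 44, 'AOM': 32, 'COM': 46}

# Correct predictions per class under the hypothetical split.
hyp_correct = sum(
    round(reported_class_pct[name] / 100 * n)
    for name, n in hyp_test_counts.items()
)
print(hyp_correct)                  # 118
```

Under the hypothetical split, four misclassified COM images out of 46 would account for both the 91.3% COM accuracy and the 96.72% overall figure.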

## Linked entities

- **Diseases:** otitis media (MONDO:0005441), acute otitis media (MONDO:0024330), chronic otitis media (MONDO:0021204)

## Full-text entities

- **Diseases:** AOM (MESH:D010033), tympanic membrane perforation (MESH:D018058)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12896723/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12896723/full.md

## References

24 references — full list in the complete paper: https://tomesphere.com/paper/PMC12896723/full.md

---
Source: https://tomesphere.com/paper/PMC12896723