A Dataset of Laryngeal Endoscopic Images with Comparative Study on Convolution Neural Network Based Semantic Segmentation
Max-Heinrich Laves, Jens Bicker, Lüder A. Kahrs, Tobias Ortmaier

TL;DR
This study evaluates CNN-based segmentation methods on a new dataset of laryngeal endoscopic images, demonstrating their potential for medical diagnosis and intervention with high accuracy and efficiency.
Contribution
It introduces a novel 7-class dataset of laryngeal images and compares multiple CNN architectures, including ensemble methods, for semantic segmentation in endoscopic images.
Findings
Best segmentation accuracy achieved with UNet and ErfNet ensemble (84.7% IoU)
ENet demonstrated the fastest inference time (9.22 ms per image)
Patient-specific fine-tuning effective with only 10 additional images
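The ensemble result in the findings above can be illustrated with a small sketch. It assumes the ensemble averages the per-pixel class probabilities of its member networks and then takes the most likely class; this page does not state the combination rule, so treat the averaging step as an assumption. The function name `ensemble_argmax` and the toy probability maps are hypothetical.

```python
from typing import List

# One probability map per model, indexed as [row][column][class].
ProbMap = List[List[List[float]]]


def ensemble_argmax(prob_maps: List[ProbMap]) -> List[List[int]]:
    """Average per-pixel class probabilities over models, then pick the top class.

    Assumption: averaging is used as the combination rule; the source page
    does not say how the UNet and ErfNet outputs were combined.
    """
    if not prob_maps:
        raise ValueError("at least one probability map is required")
    num_models = len(prob_maps)
    height = len(prob_maps[0])
    width = len(prob_maps[0][0])
    labels = []
    for y in range(height):
        row = []
        for x in range(width):
            num_classes = len(prob_maps[0][y][x])
            # Mean probability of each class at this pixel across all models.
            mean = [
                sum(model[y][x][c] for model in prob_maps) / num_models
                for c in range(num_classes)
            ]
            row.append(max(range(num_classes), key=mean.__getitem__))
        labels.append(row)
    return labels


# Toy 1x2 image with two classes; the two hypothetical models disagree on pixel 0.
model_a = [[[0.6, 0.4], [0.2, 0.8]]]
model_b = [[[0.3, 0.7], [0.4, 0.6]]]
print(ensemble_argmax([model_a, model_b]))  # prints [[1, 1]]
```

On its own, `model_a` would label the first pixel as class 0; the averaged probabilities (0.45 vs. 0.55) flip it to class 1, which is the kind of disagreement an ensemble is meant to resolve.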
Abstract
Purpose: Automated segmentation of anatomical structures in medical image analysis is a prerequisite for autonomous diagnosis as well as various computer and robot aided interventions. Recent methods based on deep convolutional neural networks (CNN) have outperformed former heuristic methods. However, those methods were primarily evaluated on rigid, real-world environments. In this study, existing segmentation methods were evaluated for their use on a new dataset of transoral endoscopic exploration. Methods: Four machine learning based methods, SegNet, UNet, ENet and ErfNet, were trained with supervision on a novel 7-class dataset of the human larynx. The dataset contains 536 manually segmented images from two patients during laser incisions. The Intersection-over-Union (IoU) evaluation metric was used to measure the accuracy of each method. Data augmentation and network ensembling were…
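The Intersection-over-Union metric named in the abstract can be sketched as follows. For each class, IoU is the number of pixels where prediction and ground truth both carry that class, divided by the number of pixels where either does. This is a dependency-free illustration; the function names are hypothetical, and the choice to average only over classes that actually occur is an assumption, since the page does not specify how the paper aggregates IoU across classes or images.

```python
from typing import Dict, List

Mask = List[List[int]]  # class label per pixel, indexed as [row][column]


def per_class_iou(pred: Mask, target: Mask, num_classes: int) -> Dict[int, float]:
    """Return {class_id: IoU} for every class present in pred or target."""
    intersection = [0] * num_classes
    pred_count = [0] * num_classes
    target_count = [0] * num_classes
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            pred_count[p] += 1
            target_count[t] += 1
            if p == t:
                intersection[p] += 1
    ious = {}
    for c in range(num_classes):
        # |A or B| = |A| + |B| - |A and B|
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # skip classes absent from both masks
            ious[c] = intersection[c] / union
    return ious


def mean_iou(pred: Mask, target: Mask, num_classes: int) -> float:
    """Average IoU over the classes that occur (an assumed aggregation rule)."""
    ious = per_class_iou(pred, target, num_classes)
    if not ious:
        raise ValueError("masks contain no pixels")
    return sum(ious.values()) / len(ious)


# Toy 2x3 masks with three classes; one pixel of class 1 is mislabeled as class 0.
pred = [[0, 0, 1], [1, 1, 2]]
target = [[0, 1, 1], [1, 1, 2]]
print(per_class_iou(pred, target, num_classes=3))
print(mean_iou(pred, target, num_classes=3))
```

In the toy example a single wrong pixel lowers the IoU of both the class it was assigned to and the class it belongs to, which is why IoU penalizes boundary errors more than plain pixel accuracy does.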
Taxonomy
Topics: Voice and Speech Disorders · Lung Cancer Diagnosis and Treatment · Head and Neck Cancer Studies
Methods: Dilated Convolution · 1x1 Convolution · Convolution · ENet Dilated Bottleneck · ENet Bottleneck · ENet Initial Block · Kaiming Initialization · Batch Normalization · SpatialDropout
