A Detection-Gated Pipeline for Robust Glottal Area Waveform Extraction and Clinical Pathology Assessment

Harikrishnan Unnikrishnan; Rita Patel

arXiv:2603.02087 · cs.CV · May 8, 2026

TL;DR

This paper introduces a fully automated, detection-gated pipeline that pairs a YOLOv8n glottis localizer with a U-Net segmenter to extract the glottal area from high-speed videoendoscopy (HSV). It is designed for real-time playback, supports clinical pathology assessment, and transfers across datasets without fine-tuning.

Contribution

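A detection-gated segmentation framework in which a glottis localizer defines a tight crop and gates the segmenter output, reducing spurious segmentations during glottal closure. The design targets accuracy, generalizability, and real-time playback in clinical settings.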

Findings

01

Achieved Dice Similarity Coefficients (DSC) of 0.81 (GIRAFE) and 0.856 (BAGLS) on in-distribution test sets.

02

Demonstrated cross-dataset portability: GIRAFE-trained models reached a DSC of 0.745 on the BAGLS test set without fine-tuning (87% of the in-domain ceiling).

03

In a clinical study, the coefficient of variation (CV) of the glottal area distinguished healthy from pathological subjects (p = 0.006).
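
To make the third finding concrete, here is a minimal sketch of a coefficient of variation computed over a glottal area waveform (one area value per frame). The function name `glottal_area_cv` is hypothetical, and the use of the population standard deviation is an assumption; the summary does not state the exact definition the authors used.

```python
from statistics import mean, pstdev


def glottal_area_cv(area_waveform):
    """Coefficient of variation (std / mean) of a per-frame glottal area series.

    Assumption: population standard deviation; the paper's exact
    definition is not given in this summary.
    """
    areas = [float(a) for a in area_waveform]
    if not areas:
        raise ValueError("area waveform is empty")
    mu = mean(areas)
    if mu == 0:
        # No open-glottis area in any frame: the ratio is undefined.
        return float("nan")
    return pstdev(areas) / mu


# Toy waveform in pixels per frame (illustrative values, not data from the paper).
toy_waveform = [2, 4, 4, 4, 5, 5, 7, 9]
toy_cv = glottal_area_cv(toy_waveform)
```

Because the CV is a ratio of spread to mean, it is unitless, so it does not depend on whether areas are measured in pixels or physical units.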

Abstract

We present a fully automated, two-stage modular glottal area segmentation framework for high-speed videoendoscopy (HSV) designed for accuracy, generalizability, and real-time playback. Our detection-gated pipeline combines a YOLOv8n glottis localizer with a U-Net segmenter; the localizer defines a tight crop to ensure a consistent field of view and gates the output to reduce spurious segmentations during glottal closure. The models were trained on the GIRAFE (N=600) and BAGLS (N=55,750) datasets. Cross-dataset portability was evaluated by benchmarking GIRAFE-trained models on the BAGLS test set without fine-tuning. In these evaluations, the pipeline achieved a Dice Similarity Coefficient (DSC) of 0.745 (87% of the in-domain ceiling). On in-distribution test sets, the system achieved DSCs of 0.81 (GIRAFE) and 0.856 (BAGLS), outperforming or competing with state-of-the-art methods. An…
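
The detection-gating idea and the DSC metric described in the abstract can be sketched as follows. This is an illustrative outline under stated assumptions, not the authors' implementation: `detect` and `segment` are hypothetical callables standing in for the YOLOv8n localizer and the U-Net segmenter, masks are plain nested lists of 0/1 values, and scoring two empty masks as 1.0 is a convention chosen here.

```python
from typing import Callable, List, Optional, Tuple

Mask = List[List[int]]
Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1), upper bounds exclusive


def dice(pred: Mask, target: Mask) -> float:
    """Dice Similarity Coefficient: 2 * |A and B| / (|A| + |B|) for binary masks."""
    inter = sum(p & t for pr, tr in zip(pred, target) for p, t in zip(pr, tr))
    total = sum(map(sum, pred)) + sum(map(sum, target))
    # Assumption: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * inter / total


def gated_segment(
    frame: List[List[int]],
    detect: Callable[[List[List[int]]], Optional[Box]],
    segment: Callable[[List[List[int]]], Mask],
) -> Mask:
    """Segment only inside the detected box; emit an empty mask if nothing is detected."""
    height, width = len(frame), len(frame[0])
    full_mask = [[0] * width for _ in range(height)]
    box = detect(frame)
    if box is None:
        # Gate closed (e.g. glottal closure): suppress any segmentation.
        return full_mask
    x0, y0, x1, y1 = box
    crop = [row[x0:x1] for row in frame[y0:y1]]
    crop_mask = segment(crop)
    # Paste the crop-level mask back into full-frame coordinates.
    for dy, mask_row in enumerate(crop_mask):
        full_mask[y0 + dy][x0:x1] = mask_row
    return full_mask
```

In this sketch the localizer does two jobs at once: it fixes the field of view the segmenter sees, and its absence of a detection acts as the gate. As a side note on the numbers, 0.745 / 0.856 ≈ 0.870, which is consistent with the "87% of the in-domain ceiling" figure if the ceiling refers to the BAGLS in-distribution DSC.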

Code & Models

Repositories

hari-krishnan/openglottal (GitHub)
