Automated Temporal Segmentation of Orofacial Assessment Videos
Saeid Alavi Naeini, Leif Simmatis, Deniz Jafari, Diego L. Guarin, Yana Yunusova, Babak Taati

TL;DR
This paper compares two automated methods for detecting and segmenting orofacial movements in clinical videos, demonstrating that a deep learning approach outperforms traditional feature-based methods in accuracy and clinical differentiation.
Contribution
It applies a pre-trained transformer-based deep learning model, RepNet (Dwibedi et al., 2020), to automated temporal segmentation of orofacial movements and shows improved performance over a landmark-based baseline.
Findings
RepNet achieved higher IoU scores than the landmark-based method.
RepNet better distinguished ALS from healthy controls based on repetition duration.
The deep learning approach outperformed the engineered-feature method on this task.
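For readers unfamiliar with the metric, the following is a minimal sketch of intersection over union (IoU) for temporal segments. It assumes each repetition is represented as a (start, end) interval in seconds and uses the standard interval definition of IoU; the function name `temporal_iou` and the example values are hypothetical, and the paper's exact matching and scoring procedure is not described in this summary.

```python
def temporal_iou(pred, truth):
    """IoU of two time intervals, each given as (start, end) in seconds.

    Assumption: the standard interval IoU; not necessarily the exact
    scoring procedure used in the paper.
    """
    p_start, p_end = pred
    t_start, t_end = truth
    # Overlap is zero when the intervals do not intersect.
    intersection = max(0.0, min(p_end, t_end) - max(p_start, t_start))
    union = (p_end - p_start) + (t_end - t_start) - intersection
    return intersection / union if union > 0 else 0.0


# Hypothetical predicted and annotated segments for one repetition.
predicted = (1.0, 3.0)
annotated = (2.0, 4.0)
iou = temporal_iou(predicted, annotated)
```

An IoU of 1 means the predicted segment matches the annotation exactly, and 0 means the two segments do not overlap at all.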
Abstract
Computer vision techniques can help automate or partially automate clinical examination of orofacial impairments to provide accurate and objective assessments. Towards the development of such automated systems, we evaluated two approaches to detect and temporally segment (parse) repetitions in orofacial assessment videos. Recorded videos of participants with amyotrophic lateral sclerosis (ALS) and healthy control (HC) individuals were obtained from the Toronto NeuroFace Dataset. Two approaches for repetition detection and parsing were examined: one based on engineered features from tracked facial landmarks and peak detection in the distance between the vermilion-cutaneous junction of the upper and lower lips (baseline analysis), and another using a pre-trained transformer-based deep learning model called RepNet (Dwibedi et al., 2020), which automatically detects periodicity, and parses…
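The baseline idea mentioned in the abstract, peak detection on a lip-distance signal, can be illustrated with a rough sketch. Everything specific here is an assumption made for illustration: the signal is taken to be a per-frame list of distances between the upper and lower lip landmarks, the peak detector is a simple strict-local-maximum test with a height threshold, and a repetition is taken to span two consecutive peaks. The function names, threshold, and sample values are hypothetical, and the paper's actual detector and parameters are not given in this summary.

```python
def find_peaks_simple(signal, min_height):
    """Return indices of strict local maxima that reach min_height.

    Assumption: a simple stand-in for whatever peak detector the
    baseline actually uses.
    """
    peaks = []
    for i in range(1, len(signal) - 1):
        if signal[i] >= min_height and signal[i - 1] < signal[i] > signal[i + 1]:
            peaks.append(i)
    return peaks


def segment_repetitions(signal, min_height):
    """Pair consecutive peaks into (start_frame, end_frame) segments.

    Assumption: one repetition spans the frames between two peaks.
    """
    peaks = find_peaks_simple(signal, min_height)
    return list(zip(peaks[:-1], peaks[1:]))


# Hypothetical per-frame lip distance in arbitrary units.
lip_distance = [0, 2, 5, 2, 0, 1, 6, 3, 0, 2, 5, 1, 0]
peaks = find_peaks_simple(lip_distance, min_height=4)
repetitions = segment_repetitions(lip_distance, min_height=4)
```

Dividing each segment's length in frames by the video frame rate would give a repetition duration in seconds, the quantity the findings compare between the ALS and HC groups.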
Taxonomy
Topics: Facial Nerve Paralysis Treatment and Research · Voice and Speech Disorders · Temporomandibular Joint Disorders
Methods: Adaptive Label Smoothing
