Real-Time Sign Language Detection using Human Pose Estimation
Amit Moryossef, Ioannis Tsochantaridis, Roee Aharoni, Sarah Ebling, and Srini Narayanan

TL;DR
This paper presents a lightweight, real-time sign language detection system leveraging human pose estimation and optical flow features, achieving high accuracy and low latency suitable for videoconferencing.
Contribution
It introduces a real-time sign language detection model that uses optical flow features computed from human pose estimation, and describes a browser demo showing how the model could be used in videoconferencing applications.
Findings
A linear classifier on pose-based optical flow features achieves 80% accuracy on the DGS Corpus.
A recurrent model applied directly to the input improves accuracy to as much as 91%.
The system runs in under 4 ms.
Abstract
We propose a lightweight real-time sign language detection model, as we identify the need for such a case in videoconferencing. We extract optical flow features based on human pose estimation and, using a linear classifier, show these features are meaningful with an accuracy of 80%, evaluated on the DGS Corpus. Using a recurrent model directly on the input, we see improvements of up to 91% accuracy, while still working under 4ms. We describe a demo application to sign language detection in the browser in order to demonstrate its usage possibility in videoconferencing applications.
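To make the feature idea in the abstract concrete, here is a minimal sketch of optical flow computed from pose keypoints and fed to a linear classifier. It is an illustration under stated assumptions, not the authors' implementation: the function names, the 2D keypoint layout, the scaling by frame rate, and the logistic scoring are all hypothetical choices made for this example.

```python
import math


def pose_optical_flow(poses, fps):
    """Per-keypoint movement speed between consecutive frames.

    poses: list of frames, each a list of (x, y) keypoints (assumed layout).
    fps:   frame rate; multiplying by it is an assumption made here so that
           values stay comparable across videos with different frame rates.
    Returns one list per frame transition, with one value per keypoint.
    """
    flow = []
    for prev, curr in zip(poses, poses[1:]):
        flow.append([
            math.hypot(cx - px, cy - py) * fps  # Euclidean displacement per second
            for (px, py), (cx, cy) in zip(prev, curr)
        ])
    return flow


def linear_sign_probability(flow_frame, weights, bias):
    """Hypothetical linear (logistic) classifier over one frame of flow values."""
    z = sum(w * f for w, f in zip(weights, flow_frame)) + bias
    return 1.0 / (1.0 + math.exp(-z))


# Two keypoints over three frames: the first keypoint moves once, then stops.
poses = [
    [(0, 0), (1, 1)],
    [(3, 4), (1, 1)],
    [(3, 4), (1, 1)],
]
flow = pose_optical_flow(poses, fps=10)
probabilities = [linear_sign_probability(f, [0.1, 0.1], -2.0) for f in flow]
```

In this sketch the weights and bias are placeholders; in practice they would be learned from labeled signing and non-signing frames.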
