Simultaneous Segmentation and Recognition: Towards more accurate Ego Gesture Recognition
Tejo Chalasani, Aljosa Smolic

TL;DR
This paper introduces a deep neural network that simultaneously segments and recognizes ego hand gestures from video sequences, achieving higher accuracy than previous methods by leveraging embeddings from RGB images.
Contribution
It presents a novel approach combining segmentation and recognition in a single network for ego hand gestures, improving accuracy on a large public dataset.
Findings
Achieved 96.9% recognition accuracy, surpassing the previous state of the art (92.2%).
Introduced a unified network architecture for segmentation and recognition.
Demonstrated effectiveness on the EgoGesture dataset.
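To put the accuracy gain in perspective, the two reported figures can be converted into error rates. The short calculation below is only an illustration of that arithmetic; the function name `relative_error_reduction` is hypothetical and does not come from the paper.

```python
def relative_error_reduction(old_acc: float, new_acc: float) -> float:
    """Return the fraction of the old error rate that the new result removes.

    Both arguments are accuracies in percent (0-100).
    """
    old_err = 100.0 - old_acc   # error rate of the previous state of the art
    new_err = 100.0 - new_acc   # error rate of the reported method
    return (old_err - new_err) / old_err


# Figures as reported in the summary: 92.2% (previous) vs 96.9% (this work).
reduction = relative_error_reduction(92.2, 96.9)
print(f"Error rate: {100.0 - 92.2:.1f}% -> {100.0 - 96.9:.1f}%")
print(f"Relative error reduction: {reduction:.1%}")
```

Read this way, the error rate falls from 7.8% to 3.1%, which is roughly a 60% relative reduction in misclassified gestures.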
Abstract
Ego hand gestures can be used as an interface in AR and VR environments. While the context of an image is important for tasks like scene understanding, object recognition, image caption generation and activity recognition, it plays a minimal role in ego hand gesture recognition. An ego hand gesture used for AR and VR environments conveys the same information regardless of the background. With this idea in mind, we present our work on ego hand gesture recognition that produces embeddings from RGB images with ego hands, which are simultaneously used for ego hand segmentation and ego gesture recognition. To this end, we achieved better recognition accuracy (96.9%) compared to the state of the art (92.2%) on the biggest ego hand gesture dataset available publicly. We present a gesture recognition deep neural network which recognises ego hand gestures from videos (videos containing a…
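The idea of one embedding feeding two tasks can be sketched as a data flow. The toy below is not the authors' network: it uses untrained random linear layers in plain Python, a flattened 4x4 "frame", and simple temporal averaging as a stand-in for whatever sequence model the paper uses. All sizes and names (`EMBED_DIM`, `encode`, `segment`, `recognise`) are assumptions made for illustration.

```python
import random

random.seed(0)

NUM_PIXELS = 16    # toy 4x4 RGB-less frame, flattened (hypothetical size)
EMBED_DIM = 8      # hypothetical embedding width
NUM_GESTURES = 5   # hypothetical number of gesture classes


def rand_matrix(rows, cols):
    """Random weights standing in for trained parameters."""
    return [[random.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


def linear(weights, vec):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


W_enc = rand_matrix(EMBED_DIM, NUM_PIXELS)    # shared encoder
W_seg = rand_matrix(NUM_PIXELS, EMBED_DIM)    # segmentation head
W_cls = rand_matrix(NUM_GESTURES, EMBED_DIM)  # recognition head


def encode(frame):
    """Shared step: one embedding per frame (linear projection + ReLU)."""
    return [max(0.0, v) for v in linear(W_enc, frame)]


def segment(embedding):
    """Head 1: per-pixel hand (1) vs background (0) decision."""
    return [1 if score > 0 else 0 for score in linear(W_seg, embedding)]


def recognise(embeddings):
    """Head 2: average embeddings over time, then pick the top class."""
    mean = [sum(col) / len(embeddings) for col in zip(*embeddings)]
    scores = linear(W_cls, mean)
    return max(range(NUM_GESTURES), key=scores.__getitem__)


def forward(video):
    """Both heads consume the same embeddings, computed once per frame."""
    embeddings = [encode(frame) for frame in video]
    masks = [segment(e) for e in embeddings]
    gesture = recognise(embeddings)
    return masks, gesture


# A fake 6-frame clip with random pixel intensities.
video = [[random.random() for _ in range(NUM_PIXELS)] for _ in range(6)]
masks, gesture = forward(video)
print(len(masks), "masks; predicted gesture id:", gesture)
```

The point of the sketch is the sharing: because the segmentation head must separate hand from background using the same embedding the recognition head reads, the embedding is pushed to encode the hand rather than the scene, which matches the abstract's argument that background context matters little for this task.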
