Independent Sign Language Recognition with 3D Body, Hands, and Face Reconstruction
Agelos Kratimenos, Georgios Pavlakos, Petros Maragos

TL;DR
This paper uses SMPL-X to reconstruct 3D body, face, and hands from a single image for Sign Language Recognition, and reports higher accuracy than recognition from raw RGB images with optical flow or from 2D OpenPose skeletons.
Contribution
To the authors' knowledge, this is the first work to adequately combine 3D body, face, and hand information for Sign Language Recognition; it does so with SMPL-X and reports improved accuracy.
Findings
Holistic 3D features outperform raw RGB images with optical flow (fed into an I3D-type network) and 2D OpenPose skeletons.
Neglecting body, face, or hands reduces recognition accuracy.
Joint modeling of all three features is crucial for high performance.
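The findings above compare a holistic feature vector against variants with one channel removed. A minimal sketch of that kind of feature assembly follows, assuming each frame already comes with per-channel parameter vectors from a 3D reconstruction. The channel sizes, the `frame_features` and `sequence_features` helpers, and the dummy data are hypothetical placeholders for illustration, not values or code taken from the paper.

```python
from typing import Dict, List, Sequence

# Hypothetical per-frame parameter sizes (illustrative assumptions only).
CHANNEL_DIMS: Dict[str, int] = {"body": 63, "hands": 90, "face": 13}
ALL_CHANNELS = ("body", "hands", "face")


def frame_features(
    params: Dict[str, Sequence[float]],
    channels: Sequence[str] = ALL_CHANNELS,
) -> List[float]:
    """Concatenate the selected channels of one frame into a flat vector."""
    feats: List[float] = []
    for name in channels:
        values = params[name]
        if len(values) != CHANNEL_DIMS[name]:
            raise ValueError(f"{name}: expected {CHANNEL_DIMS[name]} values")
        feats.extend(values)
    return feats


def sequence_features(
    frames: Sequence[Dict[str, Sequence[float]]],
    channels: Sequence[str] = ALL_CHANNELS,
) -> List[List[float]]:
    """Build one feature vector per frame, ready for a sequence classifier."""
    return [frame_features(frame, channels) for frame in frames]


# Dummy 4-frame clip with all-zero parameters, for shape checking only.
frames = [{name: [0.0] * dim for name, dim in CHANNEL_DIMS.items()} for _ in range(4)]

holistic = sequence_features(frames)                   # body + hands + face
no_face = sequence_features(frames, ("body", "hands"))  # ablation: face dropped
```

Dropping a channel from the `channels` argument gives the input for an ablation run, so the same downstream classifier can be trained on each variant and the accuracies compared.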
Abstract
Independent Sign Language Recognition is a complex visual recognition problem that combines several challenging tasks of Computer Vision due to the necessity to exploit and fuse information from hand gestures, body features and facial expressions. While many state-of-the-art works have managed to deeply elaborate on these features independently, to the best of our knowledge, no work has adequately combined all three information channels to efficiently recognize Sign Language. In this work, we employ SMPL-X, a contemporary parametric model that enables joint extraction of 3D body shape, face and hands information from a single image. We use this holistic 3D reconstruction for SLR, demonstrating that it leads to higher accuracy than recognition from raw RGB images and their optical flow fed into the state-of-the-art I3D-type network for 3D action recognition and from 2D OpenPose skeletons…
Taxonomy
Methods: OpenPose
