Towards Skeletal and Signer Noise Reduction in Sign Language Production via Quaternion-Based Pose Encoding and Contrastive Learning
Guilhem Fauré (MULTISPEECH), Mostafa Sadeghi (MULTISPEECH), Sam Bigeard (MULTISPEECH), Slim Ouni (LORIA, MULTISPEECH)

TL;DR
This paper improves neural sign language production by combining quaternion-based pose encoding with contrastive learning, yielding more accurate poses and better semantic consistency in the synthesized signing.
Contribution
It introduces quaternion pose encoding with geodesic loss and a contrastive loss for semantic structuring, advancing robustness in sign language production models.
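As a rough illustration of the geodesic loss mentioned above, the sketch below measures the rotation angle between predicted and target unit quaternions. It is a minimal NumPy example, not the authors' implementation: the function names `normalize` and `geodesic_loss`, the (w, x, y, z) component order, and the use of the mean angle as the reduction are all assumptions made for illustration.

```python
import numpy as np

def normalize(q, eps=1e-8):
    """Scale quaternions (last axis = w, x, y, z; order assumed) to unit length."""
    norm = np.linalg.norm(q, axis=-1, keepdims=True)
    return q / np.maximum(norm, eps)

def geodesic_loss(q_pred, q_true):
    """Mean rotation angle in radians between two sets of bone rotations.

    Hypothetical helper: the paper's exact loss formulation may differ.
    """
    q_pred = normalize(np.asarray(q_pred, dtype=float))
    q_true = normalize(np.asarray(q_true, dtype=float))
    # The absolute value handles the double cover: q and -q are the same rotation.
    dot = np.abs(np.sum(q_pred * q_true, axis=-1))
    # Clip to guard against rounding slightly above 1 before arccos.
    dot = np.clip(dot, 0.0, 1.0)
    return float(np.mean(2.0 * np.arccos(dot)))

identity = np.array([1.0, 0.0, 0.0, 0.0])
quarter_turn_z = np.array([np.cos(np.pi / 4), 0.0, 0.0, np.sin(np.pi / 4)])
loss_same = geodesic_loss(identity, -identity)           # same rotation, opposite sign
loss_quarter = geodesic_loss(identity, quarter_turn_z)   # 90 degree rotation about z
```

Unlike a squared error on raw quaternion components, this distance does not penalize a prediction of -q when the target is q. In a training setup the same formula would be written in an autodiff framework, where the gradient of arccos near 1 typically needs care.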
Findings
16% improvement in Probability of Correct Keypoint with contrastive loss
6% reduction in Mean Bone Angle Error with quaternion encoding
Enhanced robustness to signer variability in sign language synthesis
Abstract
One of the main challenges in neural sign language production (SLP) lies in the high intra-class variability of signs, arising from signer morphology and stylistic variety in the training data. To improve robustness to such variations, we propose two enhancements to the standard Progressive Transformers (PT) architecture (Saunders et al., 2020). First, we encode poses using bone rotations in quaternion space and train with a geodesic loss to improve the accuracy and clarity of angular joint movements. Second, we introduce a contrastive loss to structure decoder embeddings by semantic similarity, using either gloss overlap or SBERT-based sentence similarity, aiming to filter out anatomical and stylistic features that do not convey relevant semantic information. On the Phoenix14T dataset, the contrastive loss alone yields a 16% improvement in Probability of Correct Keypoint over the PT…
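To make the idea of structuring embeddings by gloss overlap more concrete, here is a minimal NumPy sketch of one possible contrastive objective. It is an assumption-laden illustration, not the paper's formulation: the Jaccard overlap measure, the `threshold` used to decide which pairs count as positives, the `temperature`, and the function names `gloss_jaccard` and `contrastive_loss` are all hypothetical choices.

```python
import numpy as np

def gloss_jaccard(a, b):
    """Jaccard overlap between two gloss sequences, treated as sets."""
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0

def contrastive_loss(emb, glosses, threshold=0.5, temperature=0.1):
    """InfoNCE-style loss: pairs with high gloss overlap are pulled together.

    emb: (n, d) array of sequence-level embeddings (assumed pooled beforehand).
    glosses: list of n gloss lists. threshold and temperature are assumed values.
    """
    emb = np.asarray(emb, dtype=float)
    n = len(glosses)
    norms = np.maximum(np.linalg.norm(emb, axis=1, keepdims=True), 1e-8)
    z = emb / norms
    logits = z @ z.T / temperature          # scaled cosine similarities
    off_diag = ~np.eye(n, dtype=bool)       # exclude self-pairs
    pos = np.array([[gloss_jaccard(glosses[i], glosses[j]) >= threshold
                     for j in range(n)] for i in range(n)]) & off_diag
    losses = []
    for i in range(n):
        if not pos[i].any():
            continue                        # anchor has no positive in this batch
        row = logits[i][off_diag[i]]
        m = row.max()
        log_denom = m + np.log(np.exp(row - m).sum())   # stable log-sum-exp
        losses.append(np.mean(log_denom - logits[i][pos[i]]))
    return float(np.mean(losses)) if losses else 0.0

glosses = [["A", "B"], ["A", "B"], ["C"], ["C"]]
aligned = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
mixed = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
loss_aligned = contrastive_loss(aligned, glosses)
loss_mixed = contrastive_loss(mixed, glosses)
```

In this toy batch the loss is lower when sentences sharing glosses also share an embedding direction. The SBERT-based variant described in the abstract would replace `gloss_jaccard` with a sentence-embedding similarity; that substitution is not shown here.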
Taxonomy
Topics: Hand Gesture Recognition Systems · Hearing Impairment and Communication · Robot Manipulation and Learning
