SignSparK: Efficient Multilingual Sign Language Production via Sparse Keyframe Learning
Jianhe Low, Alexandre Symeonidis-Herzig, Maksym Ivashechkin, Ozge Mercanoglu Sincan, Richard Bowden

TL;DR
SignSparK is a scalable, multilingual sign language production framework that trains on sparse keyframes and uses Conditional Flow Matching (CFM) to generate realistic, fluid signing sequences with precise editing capabilities.
Contribution
The paper presents a novel keyframe-based training paradigm and a large-scale Conditional Flow Matching (CFM) framework for efficient, high-fidelity multilingual sign language synthesis.
Findings
Achieves state-of-the-art results across multiple sign language benchmarks.
Enables high-fidelity, editable sign language generation in fewer than ten sampling steps.
Scales to four different sign languages, establishing the largest multilingual SLP framework to date.
Abstract
Generating natural and linguistically accurate sign language avatars remains a formidable challenge. Current Sign Language Production (SLP) frameworks face a stark trade-off: direct text-to-pose models suffer from regression-to-the-mean effects, while dictionary-retrieval methods produce robotic, disjointed transitions. To resolve this, we propose a novel training paradigm that leverages sparse keyframes to capture the true underlying kinematic distribution of human signing. By predicting dense motion from these discrete anchors, our approach mitigates regression-to-the-mean while ensuring fluid articulation. To realize this paradigm at scale, we first introduce FAST, an ultra-efficient sign segmentation model that automatically mines precise temporal boundaries. We then present SignSparK, a large-scale Conditional Flow Matching (CFM) framework that utilizes these extracted anchors to…
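To make the two ideas in the abstract concrete, the following is a minimal sketch of generic Conditional Flow Matching with a straight-line probability path, plus a toy sparse-keyframe condition. It is not the authors' implementation: the function names (`cfm_training_pair`, `euler_sample`, `sparse_keyframes`), the linear path, the Euler integrator, and the array shapes are all assumptions chosen for illustration, and no neural network is included.

```python
import numpy as np


def cfm_training_pair(x1, rng):
    """Build one flow-matching training example for a target pose sequence.

    x1 has shape (frames, dims). Assumes a straight-line path from noise x0
    to data x1, which is one common CFM choice, not necessarily the paper's.
    """
    x0 = rng.standard_normal(x1.shape)  # noise endpoint of the path
    t = rng.uniform()                   # time sampled in [0, 1)
    x_t = (1.0 - t) * x0 + t * x1       # point on the straight-line path
    v_target = x1 - x0                  # velocity a model would regress to
    return x_t, t, v_target


def euler_sample(velocity_fn, x0, num_steps=8):
    """Integrate dx/dt = velocity_fn(x, t) from t=0 to t=1 with Euler steps."""
    x = x0.copy()
    dt = 1.0 / num_steps
    for i in range(num_steps):
        x = x + dt * velocity_fn(x, i * dt)
    return x


def sparse_keyframes(x1, keyframe_idx):
    """Hypothetical conditioning signal: keep keyframes, zero all other frames."""
    mask = np.zeros(x1.shape[0], dtype=bool)
    mask[list(keyframe_idx)] = True
    cond = np.where(mask[:, None], x1, 0.0)
    return cond, mask
```

In a real system, `velocity_fn` would be a trained network that receives the keyframe condition and text as input; here any callable with the signature `(x, t)` can stand in for it. The small default `num_steps` only mirrors the idea of few-step sampling and says nothing about the quality a trained model would reach.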
Taxonomy
Topics: Hand Gesture Recognition Systems · Human Motion and Animation · Social Robot Interaction and HRI
