SignSparK: Efficient Multilingual Sign Language Production via Sparse Keyframe Learning

Jianhe Low; Alexandre Symeonidis-Herzig; Maksym Ivashechkin; Ozge Mercanoglu Sincan; Richard Bowden

arXiv:2603.10446 · cs.CV · March 24, 2026

TL;DR

SignSparK is a scalable, multilingual sign language production framework that trains on sparse keyframes and uses Conditional Flow Matching (CFM) to generate realistic, fluid signing sequences that can be precisely edited.

Contribution

The paper presents a novel keyframe-based training paradigm and a large-scale CFM framework for efficient, high-fidelity multilingual sign language synthesis.

Findings

1. Achieves state-of-the-art results across multiple sign language benchmarks.

2. Enables high-fidelity, editable sign language generation in fewer than ten steps.

3. Scales to four different sign languages, establishing the largest multilingual SLP framework to date.

Abstract

Generating natural and linguistically accurate sign language avatars remains a formidable challenge. Current Sign Language Production (SLP) frameworks face a stark trade-off: direct text-to-pose models suffer from regression-to-the-mean effects, while dictionary-retrieval methods produce robotic, disjointed transitions. To resolve this, we propose a novel training paradigm that leverages sparse keyframes to capture the true underlying kinematic distribution of human signing. By predicting dense motion from these discrete anchors, our approach mitigates regression-to-the-mean while ensuring fluid articulation. To realize this paradigm at scale, we first introduce FAST, an ultra-efficient sign segmentation model that automatically mines precise temporal boundaries. We then present SignSparK, a large-scale Conditional Flow Matching (CFM) framework that utilizes these extracted anchors to…
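The two ideas named in the abstract, conditioning on sparse keyframes and generating dense motion with Conditional Flow Matching, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`keyframe_condition`, `cfm_training_pair`, `euler_sample`), the straight-line probability path, the zero-filled conditioning signal, and the toy pose dimensions are all assumptions chosen for illustration, and the "model" is replaced by an oracle velocity field so the example stays self-contained.

```python
import numpy as np


def keyframe_condition(motion, keyframe_idx):
    """Keep only the poses at keyframe_idx; zero out all other frames.

    motion: array of shape (T, D), one D-dimensional pose per frame.
    Returns the sparse conditioning signal and a boolean keyframe mask.
    (Zero-filling is an illustrative assumption, not the paper's scheme.)
    """
    mask = np.zeros(motion.shape[0], dtype=bool)
    mask[keyframe_idx] = True
    cond = np.where(mask[:, None], motion, 0.0)
    return cond, mask


def cfm_training_pair(x1, rng):
    """Build one flow-matching training example on a straight-line path.

    x_t = (1 - t) * x0 + t * x1, with regression target v = x1 - x0.
    """
    x0 = rng.standard_normal(x1.shape)  # noise sample
    t = rng.uniform()                   # random time in [0, 1)
    xt = (1.0 - t) * x0 + t * x1
    target_v = x1 - x0
    return xt, t, target_v, x0


def euler_sample(velocity_fn, x0, steps=8):
    """Integrate dx/dt = velocity_fn(x, t) from t=0 to t=1 with Euler steps."""
    x = x0.copy()
    dt = 1.0 / steps
    for i in range(steps):
        x = x + dt * velocity_fn(x, i * dt)
    return x


# Toy usage: 16 frames of a 6-dimensional "pose", 4 keyframes.
rng = np.random.default_rng(0)
dense_motion = rng.standard_normal((16, 6))
cond, mask = keyframe_condition(dense_motion, [0, 5, 10, 15])
xt, t, target_v, noise = cfm_training_pair(dense_motion, rng)

# A trained network would predict the velocity from (x, t, cond); here an
# oracle that returns the true straight-line velocity stands in for it.
generated = euler_sample(lambda x, t: dense_motion - noise, noise, steps=8)
```

With a straight-line path the true velocity is constant along each trajectory, which is one intuition for why flow-matching samplers can get by with a small number of integration steps; how closely a learned model approaches that ideal is an empirical question the sketch does not address.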

Taxonomy

Topics: Hand Gesture Recognition Systems · Human Motion and Animation · Social Robot Interaction and HRI