Sign Language Translation with Sentence Embedding Supervision

Yasser Hamidullah; Josef van Genabith; Cristina España-Bonet

arXiv:2510.19367 · cs.CL · October 23, 2025

TL;DR

This paper introduces a novel sign language translation method that uses sentence embeddings as supervision, eliminating the need for manual gloss annotations and enabling effective multilingual translation.

Contribution

It presents a new gloss-free training approach using sentence embeddings, improving sign language translation performance without relying on annotated gloss data.

Findings

01 Significantly outperforms existing gloss-free methods.

02 Sets a new state of the art on datasets without gloss annotations.

03 Reduces the gap between gloss-dependent and gloss-free systems.

Abstract

State-of-the-art sign language translation (SLT) systems facilitate the learning process through gloss annotations, either in an end-to-end manner or by involving an intermediate step. Unfortunately, gloss-labelled sign language data is usually not available at scale and, when available, gloss annotations differ widely from dataset to dataset. We present a novel approach in which sentence embeddings of the target sentences take the role of glosses at training time. This new kind of supervision does not need any manual annotation; instead, it is learned on raw textual data. Because our approach easily accommodates multilinguality, we evaluate it on datasets covering German (PHOENIX-2014T) and American (How2Sign) sign languages and experiment with mono- and multilingual sentence embeddings and translation systems. Our approach significantly outperforms other gloss-free approaches, setting the new…
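
To make the idea of sentence-embedding supervision concrete, here is a minimal sketch in plain Python. The abstract does not specify the loss function, the pooling strategy, or the model architecture, so everything below is an assumption made for illustration: per-frame video features are mean-pooled into a single vector, and a cosine-distance loss pulls that vector toward the embedding of the target sentence. The function names (`mean_pool`, `cosine_similarity`, `embedding_supervision_loss`) are hypothetical and are not taken from the paper.

```python
import math
from typing import List, Sequence


def mean_pool(frames: Sequence[Sequence[float]]) -> List[float]:
    """Average per-frame feature vectors into one video-level vector.

    Assumption: mean pooling is used only for illustration; the paper's
    actual aggregation is not described in the abstract.
    """
    if not frames:
        raise ValueError("need at least one frame")
    dim = len(frames[0])
    return [sum(frame[i] for frame in frames) / len(frames) for i in range(dim)]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def embedding_supervision_loss(
    frames: Sequence[Sequence[float]],
    target_embedding: Sequence[float],
) -> float:
    """Cosine distance between the pooled video vector and the target
    sentence embedding (hypothetical stand-in for the training signal)."""
    return 1.0 - cosine_similarity(mean_pool(frames), target_embedding)


if __name__ == "__main__":
    # Toy 2-dimensional features; real systems would use learned encoders.
    video_frames = [[1.0, 0.0], [1.0, 0.0]]
    sentence_embedding = [2.0, 0.0]
    print(embedding_supervision_loss(video_frames, sentence_embedding))
```

In this sketch the target embedding would come from a sentence encoder trained on raw text, which is why no manual gloss annotation is involved; the loss is zero when the pooled video vector points in the same direction as the sentence embedding and grows as the two diverge.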
