Diverse Sign Language Translation

Xin Shen; Lei Shen; Shaozu Yuan; Heming Du; Haiyang Sun; Xin Yu

arXiv:2410.19586 · cs.MM · October 28, 2024

TL;DR

This paper introduces Diverse Sign Language Translation (DivSLT), a new task that generates multiple accurate and diverse textual translations from sign language videos. It addresses the limitations of one-to-one translation models, which matter most when training data is limited.

Contribution

The paper proposes the DivSLT task, creates a benchmark with multi-reference data, and develops models employing multi-reference training and reinforcement learning to improve diversity and accuracy.

Findings

01. DivSLT achieves more diverse translations without sacrificing accuracy.
02. The use of large language models improves reference quality and annotation efficiency.
03. Reinforcement learning enhances translation performance and diversity.
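
To make the accuracy-versus-diversity trade-off in these findings concrete, the following is a minimal sketch of how a set of candidate translations might be scored against several references. It is an illustration under stated assumptions, not the paper's evaluation protocol: the function names (`unigram_f1`, `multi_reference_score`, `distinct_n`) are hypothetical, the accuracy proxy is a simple token-overlap F1 taken against the closest reference, and diversity is approximated by the fraction of unique n-grams across candidates.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def unigram_f1(hypothesis, reference):
    """Token-level F1 overlap between one hypothesis and one reference."""
    hyp, ref = hypothesis.split(), reference.split()
    overlap = sum((Counter(hyp) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(hyp)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def multi_reference_score(hypothesis, references):
    """Accuracy proxy (assumption): score against the closest reference."""
    return max(unigram_f1(hypothesis, ref) for ref in references)


def distinct_n(hypotheses, n=2):
    """Diversity proxy (assumption): unique n-grams / total n-grams."""
    all_ngrams = [g for h in hypotheses for g in ngrams(h.split(), n)]
    if not all_ngrams:
        return 0.0
    return len(set(all_ngrams)) / len(all_ngrams)


# Made-up sentences for illustration only; not taken from any dataset.
references = ["it will rain tomorrow", "tomorrow there will be rain"]
candidates = ["it will rain tomorrow", "rain is expected tomorrow"]

accuracy = [multi_reference_score(c, references) for c in candidates]
diversity = distinct_n(candidates, n=2)
```

With several references available, a candidate that paraphrases one valid reading is not penalized for differing from another, while a measure such as `distinct_n` drops when the candidates repeat the same phrasing.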

Abstract

Like spoken languages, a single sign language expression could correspond to multiple valid textual interpretations. Hence, learning a rigid one-to-one mapping for sign language translation (SLT) models might be inadequate, particularly in the case of limited data. In this work, we introduce a Diverse Sign Language Translation (DivSLT) task, aiming to generate diverse yet accurate translations for sign language videos. First, we employ large language models (LLMs) to generate multiple references for the widely used CSL-Daily and PHOENIX14T SLT datasets. Here, native speakers are only invited to touch up inaccurate references, thus significantly improving the annotation efficiency. Second, we provide a benchmark model to spur research in this task. Specifically, we investigate multi-reference training strategies to enable our DivSLT model to achieve diverse translations. Then, to…

Code & Models

Repositories

XinS0909/Diverse_Sign_Language_Translation (Official)

Taxonomy

Topics: Hand Gesture Recognition Systems · Hearing Impairment and Communication · Swearing, Euphemism, Multilingualism