Scaling Back-Translation with Domain Text Generation for Sign Language Gloss Translation
Jinhui Ye, Wenxiang Jiao, Xing Wang, Zhaopeng Tu

TL;DR
This paper introduces PGEN, a prompt-based method that generates large-scale in-domain spoken language text to strengthen back translation for sign language gloss translation, improving performance on three benchmarks.
Contribution
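The paper proposes PGEN (Prompt based domain text Generation), which prompts a pre-trained GPT-2 with randomly concatenated in-domain sentences to generate spoken language text in a similar style, addressing the shortage of in-domain text that limits back translation for sign language gloss translation.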
Findings
PGEN-generated data improves back translation performance
Scaling PGEN data leads to further translation accuracy gains
Method outperforms existing approaches on multiple benchmarks
Abstract
Sign language gloss translation aims to translate the sign glosses into spoken language texts, which is challenging due to the scarcity of labeled gloss-text parallel data. Back translation (BT), which generates pseudo-parallel data by translating in-domain spoken language texts into sign glosses, has been applied to alleviate the data scarcity problem. However, the lack of large-scale high-quality domain spoken language text data limits the effect of BT. In this paper, to overcome the limitation, we propose a Prompt based domain text Generation (PGEN) approach to produce the large-scale in-domain spoken language text data. Specifically, PGEN randomly concatenates sentences from the original in-domain spoken language text data as prompts to induce a pre-trained language model (i.e., GPT-2) to generate spoken language texts in a similar style. Experimental results on three benchmarks of…
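The two steps described in the abstract, prompt construction and back translation, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the number of sentences per prompt, the space separator, and the `generate` and `text_to_gloss` callables are all hypothetical. In practice `generate` would wrap a pre-trained language model such as GPT-2, and `text_to_gloss` would wrap a trained text-to-gloss translation model.

```python
import random
from typing import Callable, List, Tuple


def build_prompt(corpus: List[str], num_sentences: int,
                 rng: random.Random, sep: str = " ") -> str:
    """Randomly pick in-domain sentences and concatenate them into one prompt.

    The sentence count and separator are assumptions made for illustration.
    """
    if not 1 <= num_sentences <= len(corpus):
        raise ValueError("num_sentences must be between 1 and len(corpus)")
    picked = rng.sample(corpus, num_sentences)  # sampling without replacement
    return sep.join(picked)


def generate_domain_texts(corpus: List[str],
                          generate: Callable[[str], str],
                          num_prompts: int,
                          num_sentences: int = 3,
                          seed: int = 0) -> List[str]:
    """Build `num_prompts` random prompts and collect one generation per prompt.

    `generate` is a placeholder for a language-model call (prompt -> new text).
    """
    rng = random.Random(seed)
    return [generate(build_prompt(corpus, num_sentences, rng))
            for _ in range(num_prompts)]


def back_translate(texts: List[str],
                   text_to_gloss: Callable[[str], str]) -> List[Tuple[str, str]]:
    """Pair each spoken-language text with a synthetic gloss sequence.

    Returns (gloss, text) pairs usable as pseudo-parallel training data.
    `text_to_gloss` is a placeholder for a trained text-to-gloss model.
    """
    return [(text_to_gloss(t), t) for t in texts]


if __name__ == "__main__":
    # Toy stand-ins so the sketch runs without any model.
    in_domain = [
        "tomorrow it will rain in the north.",
        "the wind stays weak in the south.",
        "temperatures drop at night.",
        "clouds move in from the west.",
    ]
    toy_generate = lambda prompt: "expect showers near the coast."
    toy_text_to_gloss = lambda text: text.upper().rstrip(".")

    texts = generate_domain_texts(in_domain, toy_generate, num_prompts=2)
    for gloss, text in back_translate(texts, toy_text_to_gloss):
        print(gloss, "<-", text)
```

Passing the model calls in as plain callables keeps the sketch independent of any particular library; only the random prompt construction and the pairing of generated text with synthetic glosses are taken from the description above.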
Taxonomy
Topics: Hand Gesture Recognition Systems · Natural Language Processing Techniques · Hearing Impairment and Communication
