Master Thesis: Neural Sign Language Translation by Learning Tokenization
Alptekin Orbay

TL;DR
This thesis introduces a multitask learning approach for neural sign language translation that develops a generic sign-level tokenization layer, improving translation quality and efficiency without relying on gloss annotations.
Contribution
It proposes a novel sign-level tokenization method leveraging transfer learning and multitask learning, outperforming traditional gloss-level approaches in sign language translation.
Findings
Achieved a 5-point BLEU-4 improvement in translation quality
Enhanced efficiency using 3D-CNNs for faster processing
Demonstrated advantages of sign-level over gloss-level tokenization
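For readers unfamiliar with the metric behind the first finding, the following is a minimal sketch of how a BLEU-4 score can be computed. It is not the evaluation code used in the thesis: it assumes a single sentence, a single reference, whitespace tokenization, and no smoothing, whereas published scores are normally computed over a whole test corpus with a standard tool. The function names `ngrams` and `bleu4` are illustrative only.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu4(hypothesis, reference):
    """Sentence-level BLEU-4 on a 0-100 scale (single reference, no smoothing).

    Simplifying assumption: real evaluations aggregate n-gram counts over a
    corpus; this sketch scores one sentence pair to show the arithmetic.
    """
    hyp, ref = hypothesis.split(), reference.split()
    if not hyp:
        return 0.0

    log_precision_sum = 0.0
    for n in range(1, 5):
        hyp_counts, ref_counts = ngrams(hyp, n), ngrams(ref, n)
        # Clip each hypothesis n-gram count by its count in the reference.
        clipped = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        total = sum(hyp_counts.values())
        if clipped == 0 or total == 0:
            return 0.0  # without smoothing, any zero precision gives a zero score
        log_precision_sum += math.log(clipped / total)

    # The brevity penalty discourages hypotheses shorter than the reference.
    if len(hyp) >= len(ref):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref) / len(hyp))

    # Geometric mean of the four n-gram precisions, scaled to 0-100.
    return 100.0 * brevity_penalty * math.exp(log_precision_sum / 4)


if __name__ == "__main__":
    print(bleu4("the cat sat on the mat", "the cat sat on a mat"))
```

On this 0-100 scale, a 5-point improvement means the reported score rose by five units, for example from 10 to 15; those two numbers are purely illustrative and are not taken from the thesis.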
Abstract
In this thesis, we propose a multitask learning based method to improve Neural Sign Language Translation (NSLT), which consists of two parts: a tokenization layer and Neural Machine Translation (NMT). The tokenization part focuses on how Sign Language (SL) videos should be represented before they are fed into the NMT part. Tokenization has received little study, whereas NMT research has attracted many researchers and seen enormous advances. To date, there are two main input tokenization levels: frame-level and gloss-level tokenization. Glosses are word-like intermediate representations that are unique to SLs. Therefore, we aim to develop a generic sign-level tokenization layer so that it is applicable to other domains without further effort. We begin by investigating current tokenization approaches and explain their weaknesses through several experiments. To provide a solution, we adapt…
Taxonomy
Topics: Hand Gesture Recognition Systems · Human Pose and Action Recognition · Multimodal Machine Learning Applications
