Stochastic Transformer Networks with Linear Competing Units: Application to end-to-end SL Translation

Andreas Voskou; Konstantinos P. Panousis; Dimitrios Kosmopoulos; Dimitris N. Metaxas; Sotirios Chatzis

arXiv:2109.13318·cs.CL·October 4, 2021

TL;DR

This paper introduces an end-to-end sign language translation (SLT) model that needs no gloss annotations. It uses stochastic Transformer layers with local winner-takes-all (LWTA) units and reports a state-of-the-art BLEU-4 score on PHOENIX 2014T while reducing the memory footprint by over 70%.

Contribution

The paper presents a Transformer-based SLT model that does not require gloss ground truth. It employs stochastic winner-takes-all layers, treats the weights with variational inference, and compresses the model efficiently at inference time.

Findings

01 Achieved top BLEU-4 score on PHOENIX 2014T without gloss supervision.

02 Reduced memory footprint by over 70%.

03 Demonstrated effective stochastic layer integration in Transformer networks.

Abstract

Automating sign language translation (SLT) is a challenging real-world application. Despite its societal importance, though, research progress in the field remains rather poor. Crucially, existing methods that yield viable performance necessitate the availability of laborious-to-obtain gloss sequence groundtruth. In this paper, we attenuate this need, by introducing an end-to-end SLT model that does not entail explicit use of glosses; the model only needs text groundtruth. This is in stark contrast to existing end-to-end models that use gloss sequence groundtruth, either in the form of a modality that is recognized at an intermediate model stage, or in the form of a parallel output process, jointly trained with the SLT model. Our approach constitutes a Transformer network with a novel type of layers that combines: (i) local winner-takes-all (LWTA) layers with stochastic winner sampling,…
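
As a rough illustration of what "local winner-takes-all (LWTA) layers with stochastic winner sampling" can mean, the sketch below groups the linear activations of a layer into small blocks of competing units, samples one winner per block from a softmax over the block, and zeroes out the losers. This is a minimal NumPy sketch under stated assumptions, not the authors' implementation: the function name `stochastic_lwta`, the `block_size` and `temperature` parameters, and the use of hard categorical sampling are all illustrative choices. The official repository listed under Code & Models is the place to look for the actual PyTorch code.

```python
import numpy as np


def stochastic_lwta(h, block_size=2, temperature=1.0, rng=None):
    """Illustrative stochastic LWTA activation (hypothetical helper).

    h: array of shape (batch, features) holding linear pre-activations.
    Features are split into consecutive blocks of `block_size` competitors.
    In every block one winner is sampled and keeps its value; the other
    units in the block output zero.
    """
    rng = np.random.default_rng() if rng is None else rng
    batch, features = h.shape
    if features % block_size != 0:
        raise ValueError("features must be divisible by block_size")

    # Shape (batch, n_blocks, block_size): one row of competitors per block.
    blocks = h.reshape(batch, features // block_size, block_size)

    # Winner probabilities: numerically stable softmax within each block.
    logits = blocks / temperature
    logits = logits - logits.max(axis=-1, keepdims=True)
    probs = np.exp(logits)
    probs /= probs.sum(axis=-1, keepdims=True)

    # Sample one winner index per block by inverting the cumulative sum.
    u = rng.random(size=probs.shape[:-1] + (1,))
    winners = (probs.cumsum(axis=-1) < u).sum(axis=-1)
    winners = np.minimum(winners, block_size - 1)  # guard against rounding

    # One-hot mask that keeps the winner and silences the losers.
    mask = np.eye(block_size)[winners]
    return (blocks * mask).reshape(batch, features)


if __name__ == "__main__":
    h = np.array([[1.0, -2.0, 0.5, 3.0]])
    out = stochastic_lwta(h, block_size=2, rng=np.random.default_rng(0))
    print(out)  # one surviving value per block of two units
```

Because the winner is sampled rather than chosen by argmax, repeated calls on the same input can activate different units. Training through such a layer would need a differentiable relaxation of the sampling step, which this sketch deliberately leaves out.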

Code & Models

Repositories

avoskou/Stochastic-Transformer-Networks-with-Linear-Competing-Units-Application-to-end-to-end-SL-Translatio
PyTorch · Official

Taxonomy

Topics: Hand Gesture Recognition Systems · Human Pose and Action Recognition · Robot Manipulation and Learning

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Dropout · Layer Normalization · Position-Wise Feed-Forward Layer · Adam · Dense Connections · Byte Pair Encoding · Label Smoothing