Variational Bayesian Sequence-to-Sequence Networks for Memory-Efficient Sign Language Translation
Harris Partaourides, Andreas Voskou, Dimitrios Kosmopoulos, Sotirios Chatzis, and Dimitris N. Metaxas

TL;DR
This paper introduces a variational Bayesian sequence-to-sequence network architecture for memory-efficient sign language translation, combining Gaussian posteriors and Indian Buffet Process priors to achieve significant weight compression without performance loss.
Contribution
It proposes a novel Stick-Breaking Recurrent network that enhances memory efficiency using Bayesian methods and nonparametric priors for sign language translation.
Findings
Achieves substantial weight compression
Maintains modeling performance despite compression
Introduces a new Bayesian recurrent architecture
Abstract
Memory-efficient continuous Sign Language Translation is a significant challenge for the development of assisted technologies with real-time applicability for the deaf. In this work, we introduce a paradigm of designing recurrent deep networks whereby the output of the recurrent layer is derived from appropriate arguments from nonparametric statistics. A novel variational Bayesian sequence-to-sequence network architecture is proposed that consists of a) a full Gaussian posterior distribution for data-driven memory compression and b) a nonparametric Indian Buffet Process prior for regularization applied on the Gated Recurrent Unit non-gate weights. We dub our approach Stick-Breaking Recurrent network and show that it can achieve a substantial weight compression without diminishing modeling performance.
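The two ingredients named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it only shows a stick-breaking construction of Indian Buffet Process feature probabilities and a reparameterized draw from a Gaussian weight posterior. The function names, the per-unit (column-wise) masking, the value of `alpha`, and the matrix shapes are all assumptions made for illustration. The matrix `mu` stands in for one GRU non-gate (candidate-state) weight matrix.

```python
import numpy as np


def stick_breaking_probs(alpha, num_units, rng):
    """Stick-breaking IBP probabilities: pi_k = prod_{j<=k} v_j, v_j ~ Beta(alpha, 1)."""
    v = rng.beta(alpha, 1.0, size=num_units)
    return np.cumprod(v)  # non-increasing sequence in [0, 1]


def sample_masked_weights(mu, log_sigma, pi, rng):
    """Draw w ~ N(mu, sigma^2) by reparameterization, then gate each column
    with a binary indicator z_k ~ Bernoulli(pi_k) (hypothetical granularity)."""
    eps = rng.standard_normal(mu.shape)
    w = mu + np.exp(log_sigma) * eps
    z = (rng.random(pi.shape) < pi).astype(w.dtype)
    return w * z, z  # z broadcasts across the rows of w


rng = np.random.default_rng(0)
input_dim, num_units = 4, 6  # illustrative sizes only

pi = stick_breaking_probs(alpha=2.0, num_units=num_units, rng=rng)
mu = rng.standard_normal((input_dim, num_units))    # posterior means
log_sigma = np.full((input_dim, num_units), -3.0)   # posterior log std devs
w_masked, z = sample_masked_weights(mu, log_sigma, pi, rng)
```

Columns whose indicator is zero contribute nothing to the layer output, which is one intuition for why a prior of this kind can support compression. How the paper actually infers the indicators and compresses the weights is not specified in this summary.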
Taxonomy
Topics: Hand Gesture Recognition Systems · Human Pose and Action Recognition · Multimodal Machine Learning Applications
