Transfer Learning from Visual Speech Recognition to Mouthing Recognition in German Sign Language

Dinh Nam Pham; Eleftherios Avramidis

arXiv:2505.13784·cs.CV·August 12, 2025

TL;DR

This paper explores transfer learning from Visual Speech Recognition to mouthing recognition in German Sign Language, demonstrating improved accuracy and robustness through multi-task learning with limited mouthing annotations.

Contribution

The paper introduces a transfer learning approach from Visual Speech Recognition (VSR) to mouthing recognition in German Sign Language and highlights the benefits of multi-task learning for Sign Language Recognition (SLR).

Findings

01. Multi-task learning enhances mouthing recognition accuracy.

02. Transfer learning from VSR improves model robustness.

03. Using related datasets boosts performance with limited annotations.

Abstract

Sign Language Recognition (SLR) systems primarily focus on manual gestures, but non-manual features such as mouth movements, specifically mouthing, provide valuable linguistic information. This work directly classifies mouthing instances to their corresponding words in the spoken language while exploring the potential of transfer learning from Visual Speech Recognition (VSR) to mouthing recognition in German Sign Language. We leverage three VSR datasets: one in English, one in German with unrelated words and one in German containing the same target words as the mouthing dataset, to investigate the impact of task similarity in this setting. Our results demonstrate that multi-task learning improves both mouthing recognition and VSR accuracy as well as model robustness, suggesting that mouthing recognition should be treated as a distinct but related task to VSR. This research contributes…
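The multi-task setup described in the abstract can be illustrated with a minimal sketch of a joint objective: one cross-entropy term for mouthing recognition and one for VSR, combined as a weighted sum. This is an illustration under stated assumptions, not the authors' implementation. The function names, the `vsr_weight` parameter, and the choice of a simple weighted sum are all hypothetical; in a real model the two sets of logits would typically come from task-specific heads on top of a shared visual encoder.

```python
import math

def softmax_cross_entropy(logits, target):
    """Cross-entropy for one example, given raw class scores and the true class index."""
    m = max(logits)  # subtract the max before exponentiating, for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]

def mean_loss(batch):
    """Mean cross-entropy over a batch of (logits, target) pairs."""
    return sum(softmax_cross_entropy(l, t) for l, t in batch) / len(batch)

def multi_task_loss(mouthing_batch, vsr_batch, vsr_weight=1.0):
    """Hypothetical joint objective: mouthing loss plus a weighted VSR loss.

    The weighting scheme is an assumption made for this sketch; the paper
    may balance the two tasks differently.
    """
    return mean_loss(mouthing_batch) + vsr_weight * mean_loss(vsr_batch)

# Toy usage: uniform logits over 2 mouthing classes and 4 VSR classes.
mouthing_batch = [([0.0, 0.0], 0)]
vsr_batch = [([0.0, 0.0, 0.0, 0.0], 1)]
loss = multi_task_loss(mouthing_batch, vsr_batch, vsr_weight=0.5)
print(round(loss, 3))  # ln(2) + 0.5 * ln(4) = 2 * ln(2), about 1.386
```

Because both terms are minimized together, gradients from the VSR examples would also update any shared parameters, which is one plausible way that limited mouthing annotations could benefit from larger VSR datasets.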

Code & Models

Repositories

nphamdinh/transfer-learning-vsr-mouthing-sign-language
PyTorch · Official

Taxonomy

Methods Focus · Sign Language Recognition