Visual gesture variability between talkers in continuous visual speech

Helen L Bear

arXiv:1710.01297 · cs.CV · April 26, 2018 · 5 cites

Open Access

TL;DR

This paper investigates how visual gesture variability between talkers affects continuous speech lipreading, revealing that viseme trajectories significantly influence speaker differentiation and system performance.

Contribution

It extends prior work from isolated words to continuous speech, analyzing the impact of viseme trajectories on speaker-dependent lipreading systems.

Findings

01. Viseme trajectory variability impacts speaker differentiation in continuous speech.

02. Continuous speech poses greater challenges than isolated words for lipreading systems.

03. Speaker-dependent viseme mappings are influenced by gesture variability.

Abstract

The recent adoption of deep learning methods in machine lipreading research gives us two options for improving system performance: either we develop end-to-end systems holistically, or we experiment to further our understanding of the visual speech signal. The latter option is more difficult, but this knowledge would enable researchers both to improve systems and to apply the new knowledge to other domains such as speech therapy. One challenge in lipreading systems is the correct labeling of the classifiers. These labels map an estimated function between visemes on the lips and the phonemes uttered. Here we ask whether such maps are speaker-dependent. Prior work investigated isolated word recognition from speaker-dependent (SD) visemes; we extend this to continuous speech. Benchmarked against SD results, and the isolated words performance, we test with RMAV dataset speakers and…
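To make the idea of a speaker-dependent phoneme-to-viseme map concrete, the following is a minimal Python sketch. It is not the paper's method or data: the two talkers, the phoneme groupings, the viseme labels (`V1`, `V2`, ...), and the helper names `phonemes_to_visemes` and `map_disagreement` are all hypothetical. It assumes a map is a many-to-one dictionary from phoneme to viseme label, and it compares two talkers' maps by checking which phoneme pairs are grouped together in one map but not in the other.

```python
from itertools import combinations
from typing import Dict, List

# Hypothetical many-to-one phoneme-to-viseme maps for two talkers.
# The groupings are illustrative only, not taken from the paper.
SPEAKER_MAPS: Dict[str, Dict[str, str]] = {
    "talker_a": {"p": "V1", "b": "V1", "m": "V1", "f": "V2", "v": "V2"},
    "talker_b": {"p": "V1", "b": "V1", "m": "V2", "f": "V3", "v": "V3"},
}


def phonemes_to_visemes(
    phonemes: List[str], p2v: Dict[str, str], unknown: str = "V?"
) -> List[str]:
    """Relabel a phoneme sequence with one talker's viseme classes."""
    return [p2v.get(p, unknown) for p in phonemes]


def map_disagreement(map_a: Dict[str, str], map_b: Dict[str, str]) -> float:
    """Fraction of shared phoneme pairs grouped together in exactly one map.

    Viseme label names are arbitrary per talker, so the comparison uses
    co-membership of phoneme pairs rather than the label strings.
    """
    shared = sorted(set(map_a) & set(map_b))
    pairs = list(combinations(shared, 2))
    if not pairs:
        return 0.0
    differing = sum(
        (map_a[x] == map_a[y]) != (map_b[x] == map_b[y]) for x, y in pairs
    )
    return differing / len(pairs)


if __name__ == "__main__":
    utterance = ["b", "m", "f"]
    for talker, p2v in SPEAKER_MAPS.items():
        print(talker, phonemes_to_visemes(utterance, p2v))
    print(map_disagreement(SPEAKER_MAPS["talker_a"], SPEAKER_MAPS["talker_b"]))
```

Under these assumed maps, the same phoneme sequence receives different viseme labels for each talker, which is the kind of labeling difference the speaker-dependence question is about.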

Taxonomy

Topics: Speech and Audio Processing · Face recognition and analysis · Music and Audio Processing