Decoding visemes: improving machine lipreading

Helen L. Bear; Richard Harvey

arXiv:1710.01169·cs.CV·April 26, 2018

TL;DR

This paper introduces a two-pass method for training phoneme classifiers for machine lip-reading. The first pass uses previously trained viseme classifiers, and the resulting classification performance significantly improves on previous lip-reading results.

Contribution

It presents a new training algorithm that uses viseme classifiers to improve phoneme classification in lip-reading systems.

Findings

01. Classification performance improves significantly on previous lip-reading results.

02. Phoneme classification can outperform viseme classification in the right circumstances.

03. The two-pass training method uses previously trained visemes in its first pass to improve phoneme recognition.

Abstract

To undertake machine lip-reading, we try to recognise speech from a visual signal. Current work often uses viseme classification supported by language models with varying degrees of success. A few recent works suggest phoneme classification, in the right circumstances, can outperform viseme classification. In this work we present a novel two-pass method of training phoneme classifiers which uses previously trained visemes in the first pass. With our new training algorithm, we show classification performance which significantly improves on previous lip-reading results.
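The abstract does not spell out the mechanics of the two passes, so the following is only a toy sketch of one plausible reading: train one model per viseme on pooled data, then start each phoneme model from its parent viseme model and refine it with phoneme-specific data. Everything here is an assumption made for illustration and is not the authors' implementation: the `PHONEME_TO_VISEME` map, the mean-vector "models", the nearest-mean classifier, and the `prior_weight` blending parameter are all hypothetical.

```python
from collections import defaultdict
from statistics import fmean

# Hypothetical phoneme-to-viseme map (illustrative only, not from the paper).
PHONEME_TO_VISEME = {
    "p": "V_bilabial",
    "b": "V_bilabial",
    "f": "V_labiodental",
    "v": "V_labiodental",
}


def mean_vector(rows):
    """Column-wise mean of a non-empty list of equal-length feature vectors."""
    return [fmean(col) for col in zip(*rows)]


def train_visemes(samples):
    """Pass 1 (assumed): one mean-vector model per viseme, pooling its phonemes."""
    pooled = defaultdict(list)
    for phoneme, features in samples:
        pooled[PHONEME_TO_VISEME[phoneme]].append(features)
    return {viseme: mean_vector(rows) for viseme, rows in pooled.items()}


def train_phonemes(samples, viseme_models, prior_weight=1.0):
    """Pass 2 (assumed): start each phoneme at its viseme model, then refine.

    The refined model is a weighted blend of the viseme model (weight
    prior_weight) and the mean of the phoneme's own samples (weight n).
    """
    by_phoneme = defaultdict(list)
    for phoneme, features in samples:
        by_phoneme[phoneme].append(features)

    models = {}
    for phoneme, viseme in PHONEME_TO_VISEME.items():
        if viseme not in viseme_models:
            continue  # no first-pass model to start from
        init = viseme_models[viseme]
        rows = by_phoneme.get(phoneme, [])
        if not rows:
            models[phoneme] = list(init)  # fall back to the viseme model
            continue
        n = len(rows)
        data_mean = mean_vector(rows)
        models[phoneme] = [
            (prior_weight * i + n * d) / (prior_weight + n)
            for i, d in zip(init, data_mean)
        ]
    return models


def classify(features, models):
    """Return the label whose model is nearest in squared Euclidean distance."""
    def sq_dist(model):
        return sum((a - b) ** 2 for a, b in zip(features, model))
    return min(models, key=lambda label: sq_dist(models[label]))


if __name__ == "__main__":
    # Made-up 2-D "visual features", purely for demonstration.
    samples = [
        ("p", [0.0, 1.0]), ("p", [0.2, 1.0]), ("b", [1.0, 1.0]),
        ("f", [5.0, 0.0]), ("v", [6.0, 0.0]),
    ]
    viseme_models = train_visemes(samples)
    phoneme_models = train_phonemes(samples, viseme_models)
    print(classify([0.1, 1.0], phoneme_models))
```

The point of the sketch is the data flow: phonemes that share a viseme inherit a common starting point from the first pass, and the second pass only has to learn what distinguishes them. A real system would use sequence models over video features rather than mean vectors.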

Taxonomy

Topics: Speech and Audio Processing · Phonetics and Phonology Research · Hearing Impairment and Communication