Visual Speech Recognition

Ahmad B. A. Hassanat

arXiv:1409.1411·cs.CV·September 5, 2014

TL;DR

Visual speech recognition automates lip reading using computer vision and AI, enabling applications like HCI, speaker recognition, and sign language interpretation, especially aiding those with hearing impairments.

Contribution

This paper reviews recent advances in automating lip reading through visual speech recognition, highlighting its techniques and potential applications.

Findings

01 Significant progress in AI-based lip reading techniques

02 Enhanced accuracy in visual speech recognition systems

03 Potential for diverse applications like HCI and surveillance

Abstract

Lip reading is used to understand or interpret speech without hearing it, a technique especially mastered by people with hearing difficulties. The ability to lip read enables a person with a hearing impairment to communicate with others and to engage in social activities, which otherwise would be difficult. Recent advances in the fields of computer vision, pattern recognition, and signal processing have led to a growing interest in automating this challenging task of lip reading. Indeed, automating the human ability to lip read, a process referred to as visual speech recognition (VSR), or sometimes speech reading, could open the door for other novel related applications. VSR has received a great deal of attention in the last decade for its potential use in applications such as human-computer interaction (HCI), audio-visual speech recognition (AVSR), speaker recognition, talking heads,…
