Lip Localization and Viseme Classification for Visual Speech Recognition
Salah Werda, Walid Mahdi, Abdelmajid Ben Hamadou

TL;DR
This paper discusses the development of a visual speech recognition system focusing on lip localization and viseme classification to improve lip-reading technology for multimedia applications and assistive communication.
Contribution
It introduces methods for accurate lip localization and viseme classification to enhance automatic lip-reading systems.
Findings
Improved lip localization accuracy
Enhanced viseme classification performance
Potential applications in assistive communication
Abstract
The need for an automatic lip-reading system is ever increasing. In fact, the extraction and reliable analysis of facial movements are now an important part of many multimedia systems, such as videoconferencing, low communication systems, and lip-reading systems. In addition, visual information is imperative for people with special needs. We can imagine, for example, a dependent person giving commands to a machine with a simple lip movement or by pronouncing a single syllable. Moreover, people with hearing problems compensate for their special needs by lip-reading as well as listening to the person with whom they are talking.
Taxonomy
Topics: Speech and Audio Processing · Face recognition and analysis · Indoor and Outdoor Localization Technologies
