Towards MOOCs for Lipreading: Using Synthetic Talking Heads to Train Humans in Lipreading at Scale
Aditya Agarwal, Bipasha Sen, Rudrabha Mukhopadhyay, Vinay Namboodiri, C.V Jawahar

TL;DR
This paper explores using synthetic talking head videos generated by AI to create scalable, diverse, and cost-effective online lipreading training resources, addressing current limitations of manual video creation.
Contribution
The authors develop an end-to-end automated pipeline combining AI-generated talking heads, text-to-speech, and computer vision to produce lipreading training content at scale.
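The pipeline described above can be pictured as a chain of stages: text goes through text-to-speech, and the resulting audio drives a talking-head generator. The following is a minimal sketch under stated assumptions, not the authors' implementation. Every name in it (LipreadingClip, build_clips, fake_tts, fake_head) is hypothetical, and the two stub stages only return placeholder bytes where real speech and video models would be plugged in.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical stage signatures (assumptions, not taken from the paper).
SpeechSynthesizer = Callable[[str, str], bytes]       # (text, accent) -> audio
TalkingHeadGenerator = Callable[[bytes, str], bytes]  # (audio, speaker_id) -> video


@dataclass
class LipreadingClip:
    text: str        # ground-truth word or sentence the learner must read
    accent: str      # accent requested from the speech stage
    speaker_id: str  # identity rendered by the talking-head stage
    video: bytes     # generated video payload (placeholder in this sketch)


def build_clips(
    vocabulary: List[str],
    accents: List[str],
    speakers: List[str],
    synthesize: SpeechSynthesizer,
    animate: TalkingHeadGenerator,
) -> List[LipreadingClip]:
    """Generate one clip per (text, accent, speaker) combination."""
    clips = []
    for text in vocabulary:
        for accent in accents:
            audio = synthesize(text, accent)  # speech is reused across speakers
            for speaker in speakers:
                video = animate(audio, speaker)
                clips.append(LipreadingClip(text, accent, speaker, video))
    return clips


# Stub stages standing in for real models, so the sketch runs end to end.
def fake_tts(text: str, accent: str) -> bytes:
    return f"audio:{accent}:{text}".encode()


def fake_head(audio: bytes, speaker_id: str) -> bytes:
    return f"video:{speaker_id}:".encode() + audio


clips = build_clips(
    vocabulary=["hello", "thank you"],
    accents=["en-US", "en-IN"],
    speakers=["speaker_a"],
    synthesize=fake_tts,
    animate=fake_head,
)
```

Because the stages are passed in as plain callables, adding a word, accent, or speaker only lengthens an input list. That is the scaling property the summary attributes to replacing recorded actors with synthetic video.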
Findings
Synthetic videos are effective for lipreading training.
The platform can incorporate diverse vocabularies and accents.
Human evaluation shows comparable quality to real videos.
Abstract
Many people with some form of hearing loss consider lipreading their primary mode of day-to-day communication. However, finding resources to learn or improve one's lipreading skills can be challenging. This has been further exacerbated during the COVID-19 pandemic due to restrictions on direct interactions with peers and speech therapists. Today, online MOOC platforms like Coursera and Udemy have become the most effective form of training for many types of skill development. However, online lipreading resources are scarce, as creating such resources is an extensive process needing months of manual effort to record hired actors. Because of the manual pipeline, such platforms are also limited in vocabulary, supported languages, accents, and speakers, and have a high usage cost. In this work, we investigate the possibility of replacing real human talking videos with synthetically generated videos.…
Videos
Towards MOOCs for Lipreading: Using Synthetic Talking Heads to Train Humans in Lipreading at Scale · youtube
Taxonomy
Topics: Hearing Impairment and Communication · Speech and Audio Processing · Subtitles and Audiovisual Media
