RAL: Redundancy-Aware Lipreading Model Based on Differential Learning with Symmetric Views
Zejun Gu, Junxia Jiang

TL;DR
This paper introduces a lipreading model that learns from the differences between the left and right halves of the lips, reduces redundant information in the input, and strengthens interaction between the two views, improving recognition performance.
Contribution
It proposes a differential learning strategy, a redundancy-aware operation, and an adaptive cross-view interaction module for more effective lipreading.
Findings
Effective on LRW and LRW-1000 datasets
Improves recognition accuracy
Utilizes asymmetric lip features
Abstract
Lip reading involves interpreting a speaker's speech by analyzing sequences of lip movements. Currently, most models regard the left and right halves of the lips as a symmetrical whole, lacking a thorough investigation of their differences. However, the left and right halves of the lips are not always symmetrical, and the subtle differences between them contain rich semantic information. In this paper, we propose a differential learning strategy with symmetric views (DLSV) to address this issue. Additionally, input images often contain a lot of redundant information unrelated to recognition results, which can degrade the model's performance. We present a redundancy-aware operation (RAO) to reduce it. Finally, to leverage the relational information between symmetric views and within each view, we further design an adaptive cross-view interaction module (ACVI). Experiments on LRW and…
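The idea of treating the left and right lip halves as two symmetric views can be illustrated with a minimal sketch. This is not the authors' DLSV implementation, which the abstract does not specify. It assumes a grayscale lip crop stored as a list of pixel rows, and the function names `split_symmetric_views` and `view_difference` are hypothetical. The sketch splits a frame down the middle, mirrors the right half so it aligns with the left, and takes a pixel-wise difference, so any nonzero entry marks a left/right asymmetry.

```python
def split_symmetric_views(frame):
    """Split a 2D frame (list of rows) into a left view and a mirrored right view.

    Assumption: the lip crop is roughly centred, so the vertical midline
    separates the two halves. For odd widths the centre column is dropped.
    """
    width = len(frame[0])
    half = width // 2
    left = [row[:half] for row in frame]
    # Take the rightmost `half` columns and reverse them so that
    # column i of the right view faces column i of the left view.
    right_mirrored = [row[width - half:][::-1] for row in frame]
    return left, right_mirrored


def view_difference(left, right_mirrored):
    """Pixel-wise difference between the two aligned views (hypothetical helper)."""
    return [
        [l - r for l, r in zip(left_row, right_row)]
        for left_row, right_row in zip(left, right_mirrored)
    ]


# Toy 2x5 "frame": the first row is perfectly symmetric, the second is not.
frame = [
    [10, 20, 30, 20, 10],
    [5, 7, 9, 8, 5],
]
left, right = split_symmetric_views(frame)
diff = view_difference(left, right)
print(diff)  # [[0, 0], [0, -1]]
```

In a real model the difference would more plausibly be taken between learned feature maps than raw pixels; the raw-pixel version is used here only to keep the example self-contained.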
Taxonomy
Topics: Text and Document Classification Technologies · Web Data Mining and Analysis · Web Applications and Data Management
