WhisperNetV2: SlowFast Siamese Network For Lip-Based Biometrics
Abdollah Zakeri, Hamid Hassanpour, Mohammad Hossein Khosravi, Amir Masoud Nourollah

TL;DR
WhisperNetV2 introduces a SlowFast Siamese network for lip-based biometrics, effectively capturing physiological and behavioral lip features, and achieves state-of-the-art accuracy by considering emotional variations during video acquisition.
Contribution
The paper proposes WhisperNetV2, a novel deep Siamese network with SlowFast architecture that improves lip-based biometric authentication by incorporating emotional variability and triplet loss.
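As a rough illustration of the dual-rate idea behind a SlowFast network, the sketch below picks frame indices for a slow pathway (few frames, large temporal stride) and a fast pathway (many frames, small stride). The function name `sample_pathways` and the values `slow_stride=16` and `alpha=8` are illustrative assumptions, not settings reported for WhisperNetV2.

```python
def sample_pathways(num_frames, slow_stride=16, alpha=8):
    """Return (slow_indices, fast_indices) for a clip of num_frames frames.

    slow_stride and alpha are hypothetical values chosen for illustration;
    the fast pathway samples alpha times more densely than the slow one.
    """
    fast_stride = max(slow_stride // alpha, 1)
    slow = list(range(0, num_frames, slow_stride))  # sparse frames: appearance cues
    fast = list(range(0, num_frames, fast_stride))  # dense frames: motion cues
    return slow, fast


slow_idx, fast_idx = sample_pathways(64)
print(len(slow_idx), len(fast_idx))
```

With a 64-frame clip, the fast pathway sees eight times as many frames as the slow pathway, which is why it is better placed to pick up lip motion while the slow pathway focuses on lip appearance.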
Findings
Achieves an equal error rate (EER) of 0.005 on the CREMA-D dataset.
Outperforms most existing LBBA methods in accuracy.
Utilizes a dual-path SlowFast network to capture different lip features.
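For readers unfamiliar with the EER metric cited above, here is a minimal sketch of how an equal error rate can be approximated from verification scores. It assumes distance scores (lower means more similar) and a simple threshold sweep; the function name `equal_error_rate` and the toy scores are hypothetical, and the paper's own evaluation protocol may differ.

```python
def equal_error_rate(genuine, impostor):
    """Approximate the EER from distance scores (lower = more similar).

    genuine:  distances for same-identity pairs
    impostor: distances for different-identity pairs
    """
    thresholds = sorted(set(genuine) | set(impostor))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # False rejection: a genuine pair whose distance exceeds the threshold.
        frr = sum(d > t for d in genuine) / len(genuine)
        # False acceptance: an impostor pair at or below the threshold.
        far = sum(d <= t for d in impostor) / len(impostor)
        gap = abs(far - frr)
        if gap < best_gap:
            # Keep the operating point where the two error rates are closest.
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Toy scores, made up for illustration only.
print(equal_error_rate([0.1, 0.2, 0.3, 0.9], [0.8, 1.0, 1.1, 1.2]))
```

An EER of 0.005 means that, at the threshold where false acceptances and false rejections are equally frequent, each occurs in about 0.5% of trials.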
Abstract
Lip-based biometric authentication (LBBA) has attracted many researchers during the last decade. The lip is of particular interest to biometric researchers because it is a twin biometric, with the potential to function as both a physiological and a behavioral trait. Although much valuable research has been conducted on LBBA, none of it has considered the different emotions of the client during the video acquisition step, which can affect the client's facial expressions and speech tempo. We propose a novel network structure called WhisperNetV2, which extends our previously proposed network, WhisperNet. The proposed network uses a deep Siamese structure with triplet loss, with three identical SlowFast networks as embedding networks. The SlowFast network is an excellent candidate for our task since the fast pathway extracts motion-related features (behavioral…
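The triplet loss mentioned in the abstract can be sketched as follows. This is a generic formulation, assuming Euclidean distance between embeddings and a hypothetical margin of 0.2; the distance function, margin, and triplet mining strategy actually used for WhisperNetV2 are not stated in this summary.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss on three embeddings.

    anchor and positive come from the same identity, negative from a
    different one. margin=0.2 is an illustrative value, not the paper's.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    # Zero loss once the negative is at least `margin` farther than the positive.
    return max(d_pos - d_neg + margin, 0.0)


# Toy 2-D embeddings standing in for the outputs of the three embedding networks.
print(triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0]))  # well separated: 0.0
```

In a Siamese setup, the same embedding network (shared weights) produces all three vectors, so minimizing this loss pulls clips of the same speaker together and pushes clips of different speakers apart.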
Taxonomy
Topics: Biometric Identification and Security · Face recognition and analysis · Speech and Audio Processing
Methods: Triplet Loss
