WhisperNetV2: SlowFast Siamese Network For Lip-Based Biometrics

Abdollah Zakeri; Hamid Hassanpour; Mohammad Hossein Khosravi; Amir Masoud Nourollah

arXiv:2407.08717·cs.CV·July 12, 2024

TL;DR

WhisperNetV2 is a SlowFast-based Siamese network for lip-based biometrics that captures both physiological and behavioral lip features. It accounts for emotional variation during video acquisition and reports state-of-the-art accuracy.

Contribution

The paper proposes WhisperNetV2, a novel deep Siamese network with SlowFast architecture that improves lip-based biometric authentication by incorporating emotional variability and triplet loss.
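The triplet loss mentioned above can be illustrated with a minimal sketch. It assumes each video clip has already been mapped to an embedding vector by the shared network, and it uses plain Euclidean distance with a margin of 1.0. The distance variant, the margin value, and the function names are illustrative assumptions, not details taken from the paper.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss on precomputed embeddings.

    anchor and positive come from the same identity, negative from a
    different one. The margin of 1.0 is an assumed value.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    # The loss is zero once the negative is at least `margin` farther
    # from the anchor than the positive is.
    return max(0.0, d_pos - d_neg + margin)


# Easy triplet: the negative is already far away, so nothing is penalized.
easy = triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0])

# Hard triplet: the negative sits almost as close as the positive.
hard = triplet_loss([0.0, 0.0], [0.0, 1.0], [0.0, 1.5])
```

Because all three branches share the same weights, minimizing this quantity pulls clips of the same client together in embedding space and pushes clips of different clients apart.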

Findings

01 Achieves an equal error rate (EER) of 0.005 on the CREMA-D dataset.

02 Outperforms most existing LBBA methods in accuracy.

03 Uses a dual-path SlowFast network to capture different lip features.
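For readers unfamiliar with the EER metric in the first finding, the sketch below shows one simple way to estimate it from verification distances. It assumes lower distance means "same client", sweeps every observed distance as a candidate threshold, and reports the point where the false acceptance and false rejection rates are closest. The function name and the toy scores are hypothetical, and this is not the paper's evaluation code.

```python
def equal_error_rate(genuine, impostor):
    """Estimate the EER from genuine-pair and impostor-pair distances.

    A pair is accepted when its distance is <= threshold.
    FAR = fraction of impostor pairs accepted,
    FRR = fraction of genuine pairs rejected.
    """
    thresholds = sorted(set(genuine) | set(impostor))
    best_gap, best_eer = float("inf"), None
    for t in thresholds:
        far = sum(d <= t for d in impostor) / len(impostor)
        frr = sum(d > t for d in genuine) / len(genuine)
        gap = abs(far - frr)
        if gap < best_gap:
            # Keep the operating point where FAR and FRR are closest.
            best_gap, best_eer = gap, (far + frr) / 2
    return best_eer


# Toy distances (made up for illustration only).
genuine_distances = [0.1, 0.2, 0.3, 0.9]
impostor_distances = [0.25, 0.8, 1.0, 1.2]
eer = equal_error_rate(genuine_distances, impostor_distances)
```

An EER of 0.005 would mean that, at the threshold where both error types are equal, about 0.5% of impostor attempts are accepted and about 0.5% of genuine attempts are rejected.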

Abstract

Lip-based biometric authentication (LBBA) has attracted many researchers during the last decade. The lip is of particular interest to biometric researchers because it is a twin biometric with the potential to function as both a physiological and a behavioral trait. Although much valuable research has been conducted on LBBA, none of it considered the different emotions of the client during the video acquisition step of LBBA, which can potentially affect the client's facial expressions and speech tempo. We propose a novel network structure called WhisperNetV2, which extends our previously proposed network called WhisperNet. Our proposed network leverages a deep Siamese structure with triplet loss, with three identical SlowFast networks as embedding networks. The SlowFast network is an excellent candidate for our task since the fast pathway extracts motion-related features (behavioral…
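The two-pathway idea can be sketched in terms of frame sampling alone. In the general SlowFast design, the slow pathway reads a clip at a large temporal stride and the fast pathway reads it at a stride several times smaller, so the fast pathway sees more of the motion. The stride tau=16 and ratio alpha=8 below are typical values from the SlowFast literature and are assumptions here; the settings used by WhisperNetV2 are not stated in this summary.

```python
def slowfast_indices(num_frames, tau=16, alpha=8):
    """Return frame indices sampled by the slow and fast pathways.

    tau   : temporal stride of the slow pathway (assumed value).
    alpha : how many times denser the fast pathway samples (assumed value).
    """
    fast_stride = max(1, tau // alpha)
    slow = list(range(0, num_frames, tau))
    fast = list(range(0, num_frames, fast_stride))
    return slow, fast


# For a 64-frame lip clip, the slow pathway sees 4 frames
# and the fast pathway sees 32.
slow_idx, fast_idx = slowfast_indices(64)
```

The sparse slow frames are suited to appearance cues that change little over time, while the dense fast frames follow lip movement during speech.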

Taxonomy

Topics: Biometric Identification and Security · Face recognition and analysis · Speech and Audio Processing

Methods: Triplet Loss