Visual Prompt Flexible-Modal Face Anti-Spoofing
Zitong Yu, Rizhao Cai, Yawen Cui, Ajian Liu, Changsheng Chen

TL;DR
This paper introduces VP-FAS, a flexible-modal face anti-spoofing framework that uses visual prompts and regularization to handle missing modalities during training and testing, improving robustness with minimal additional parameters.
Contribution
It proposes a novel prompt learning approach for flexible-modal FAS that effectively manages missing modalities with fewer learnable parameters and enhances feature consistency.
Findings
Improves FAS performance under various missing-modality scenarios.
Requires less than 4% additional learnable parameters.
Demonstrates effectiveness on benchmark datasets.
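To give a feel for why a prompt-based approach can stay within a small parameter budget, the following is a rough back-of-the-envelope sketch, not the paper's own accounting. It counts the weights of a frozen transformer encoder and compares them with a set of learnable prompt tokens. The ViT-Base-like sizes, the number of modalities, and the number of prompts per layer are all hypothetical values chosen for illustration.

```python
def encoder_block_params(dim: int, mlp_dim: int) -> int:
    """Weights and biases in one standard transformer encoder block."""
    attention = 4 * (dim * dim + dim)  # query, key, value and output projections
    mlp = (dim * mlp_dim + mlp_dim) + (mlp_dim * dim + dim)  # two linear layers
    layer_norms = 2 * (2 * dim)  # two LayerNorms, each with scale and shift
    return attention + mlp + layer_norms


def prompt_params(num_modalities: int, prompts_per_layer: int,
                  num_layers: int, dim: int) -> int:
    """Learnable prompt tokens, one set per modality and per layer."""
    return num_modalities * prompts_per_layer * num_layers * dim


# Hypothetical settings (assumptions, not taken from the paper).
NUM_LAYERS, DIM, MLP_DIM = 12, 768, 3072
NUM_MODALITIES, PROMPTS_PER_LAYER = 3, 10

frozen = NUM_LAYERS * encoder_block_params(DIM, MLP_DIM)
learnable = prompt_params(NUM_MODALITIES, PROMPTS_PER_LAYER, NUM_LAYERS, DIM)
ratio = learnable / frozen

print(f"frozen encoder parameters: {frozen:,}")
print(f"learnable prompt parameters: {learnable:,}")
print(f"learnable / frozen: {ratio:.2%}")
```

Under these assumed settings the prompts are a small fraction of the frozen encoder. Any extra learnable modules a real method adds would raise the ratio, so it should be recomputed for the actual architecture.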
Abstract
Vision-transformer-based multimodal learning methods have recently been proposed to improve the robustness of face anti-spoofing (FAS) systems. However, multimodal face data collected in the real world is often imperfect because modalities from various imaging sensors may be missing. Flexible-modal FAS \cite{yu2023flexible}, which aims to develop a unified multimodal FAS model that is trained on complete multimodal face data yet is insensitive to modalities missing at test time, has attracted growing attention. In this paper, we tackle one main challenge in flexible-modal FAS: modalities may be missing during either training or testing in real-world situations. Inspired by the recent success of prompt learning in language models, we propose Visual Prompt flexible-modal FAS (VP-FAS), which learns the modal-relevant prompts to adapt the frozen pre-trained…
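The missing-modality setting described above can be made concrete by listing which modalities are available in each scenario. The sketch below enumerates every non-empty subset of a modality set and reports what is missing in each case. The modality names (`rgb`, `depth`, `ir`) and both helper functions are assumptions for illustration; the abstract as shown here does not name the modalities or the evaluation protocol.

```python
from itertools import combinations

# Assumed modality names, for illustration only.
MODALITIES = ("rgb", "depth", "ir")


def availability_scenarios(modalities):
    """Return every non-empty subset of modalities, largest subsets first."""
    scenarios = []
    for size in range(len(modalities), 0, -1):
        scenarios.extend(combinations(modalities, size))
    return scenarios


def missing(modalities, available):
    """Return the modalities that are absent in a given scenario."""
    return tuple(m for m in modalities if m not in available)


for available in availability_scenarios(MODALITIES):
    absent = missing(MODALITIES, available) or ("none",)
    print(f"available: {', '.join(available):<16} missing: {', '.join(absent)}")
```

A flexible-modal model would be evaluated, and possibly trained, on each of these scenarios; with three modalities there are seven of them, including the complete one.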
Taxonomy
Topics: Biometric Identification and Security · Reconstructive Facial Surgery Techniques · Face recognition and analysis
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Softmax · Residual Connection · Layer Normalization · Dense Connections · Vision Transformer
