Robust face anti-spoofing framework with Convolutional Vision Transformer
Yunseung Lee, Youngjun Kwak, Jinho Shin

TL;DR
This paper introduces a convolutional vision transformer framework that enhances face anti-spoofing robustness against domain shifts by integrating global and local cues, outperforming existing models in cross-dataset tests.
Contribution
To the best of the authors' knowledge, this is the first study to investigate whether combining global information (captured by self-attention) with local cues (captured by convolutional layers) makes face anti-spoofing more robust to unseen domains.
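The idea of pairing local and global cues can be illustrated with a toy example. The sketch below is not the authors' architecture, which this summary does not describe. It assumes scalar tokens in a 1D sequence, identity query/key/value projections, and a convolution-then-attention ordering chosen only for illustration. All function names are hypothetical.

```python
import math


def conv1d_same(x, kernel):
    """Local cue: each output mixes only a token's nearest neighbours (zero padding)."""
    half = len(kernel) // 2
    out = []
    for i in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(x):  # positions outside the sequence count as zero
                acc += w * x[j]
        out.append(acc)
    return out


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Global cue: every token attends to every token in the sequence.

    Assumption: scalar tokens with identity projections, so the usual
    1/sqrt(d) scaling is 1 and is omitted.
    """
    out = []
    for q in x:
        weights = softmax([q * k for k in x])
        out.append(sum(w * v for w, v in zip(weights, x)))
    return out


def conv_then_attention(x, kernel):
    """Hypothetical block: local mixing first, then global mixing."""
    return self_attention(conv1d_same(x, kernel))


tokens = [1.0, 2.0, 3.0, 4.0]
print(conv_then_attention(tokens, [0.25, 0.5, 0.25]))
```

The convolution only sees a fixed-size neighbourhood, while each attention output is a weighted average over the whole sequence, which is the local/global distinction the contribution statement refers to.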
Findings
FAS performance improved by 7.3% and 12.9% over models using only a CNN and only a vision transformer, respectively.
The model achieved the highest average rank in cross-dataset domain generalization benchmarks.
The model performed robustly on data from various unseen domains.
Abstract
Owing to the advances in image processing technology and large-scale datasets, companies have implemented facial authentication processes, thereby stimulating increased focus on face anti-spoofing (FAS) against realistic presentation attacks. Recently, various attempts have been made to improve face recognition performance using both global and local learning on face images; however, to the best of our knowledge, this is the first study to investigate whether the robustness of FAS against domain shifts is improved by considering global information and local cues in face images captured using self-attention and convolutional layers. This study proposes a convolutional vision transformer-based framework that achieves robust performance for various unseen domain data. Our model resulted in 7.3% and 12.9% increases in FAS performance compared to models using only a convolutional…
Taxonomy
Topics: Biometric Identification and Security · Face recognition and analysis
