Generalized Single-Image-Based Morphing Attack Detection Using Deep Representations from Vision Transformer
Haoyu Zhang, Raghavendra Ramachandra, Kiran Raja, Christoph Busch

TL;DR
This paper introduces a single-image morphing attack detection method built on Vision Transformer representations, reporting improved inter-dataset generalization and comparable intra-dataset performance relative to existing CNN-based approaches for securing face recognition systems.
Contribution
The paper presents a generalized single-image MAD algorithm based on ViT, which effectively captures local and global face features for robust morphing attack detection.
Findings
Improved detection performance on inter-dataset tests.
Comparable performance on intra-dataset tests.
Outperforms state-of-the-art CNN-based MAD methods.
Abstract
Face morphing attacks pose severe threats to Face Recognition Systems (FRS), which are operated in border control and passport issuance use cases. Correspondingly, morphing attack detection (MAD) algorithms are needed to defend against such attacks. MAD approaches must be robust enough to handle unknown attacks in an open-set scenario where attacks can originate from various morphing generation algorithms, post-processing steps, and a diversity of printers/scanners. The problem of generalization is further pronounced when the detection has to be made on a single suspected image. In this paper, we propose a generalized single-image-based MAD (S-MAD) algorithm by learning the encoding from the Vision Transformer (ViT) architecture. Compared to CNN-based architectures, the ViT model has the advantage of integrating local and global information and hence can be suitable to detect the morphing…
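To make the "local and global information" point concrete, the following is a minimal NumPy sketch of the two generic ViT building blocks involved: splitting an image into patches (local) and letting every patch attend to every other patch through self-attention (global). It is not the authors' model. The 224x224 input, 16x16 patch size, 64-dimensional embedding, and random weights are illustrative assumptions, and the class token, positional encodings, multi-head split, and MAD classifier head are omitted.

```python
import numpy as np

def image_to_patches(image, patch_size):
    """Split an (H, W, C) image into flattened, non-overlapping patches."""
    h, w, c = image.shape
    assert h % patch_size == 0 and w % patch_size == 0
    rows, cols = h // patch_size, w // patch_size
    patches = image.reshape(rows, patch_size, cols, patch_size, c)
    patches = patches.transpose(0, 2, 1, 3, 4)  # (rows, cols, p, p, c)
    return patches.reshape(rows * cols, patch_size * patch_size * c)

def self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over patch tokens."""
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ v, weights

rng = np.random.default_rng(0)

# Assumed input: a random stand-in for a 224x224 RGB face crop.
image = rng.random((224, 224, 3))
patches = image_to_patches(image, patch_size=16)  # 14 * 14 = 196 patches

# Assumed (hypothetical) embedding size; weights are random, not trained.
embed_dim = 64
w_embed = rng.normal(scale=0.02, size=(patches.shape[1], embed_dim))
tokens = patches @ w_embed

w_q, w_k, w_v = (
    rng.normal(scale=0.02, size=(embed_dim, embed_dim)) for _ in range(3)
)
out, attn = self_attention(tokens, w_q, w_k, w_v)

print(patches.shape, tokens.shape, attn.shape, out.shape)
```

Each row of `attn` holds one patch's weights over all 196 patches, so every output token mixes information from the whole image in a single layer, whereas a convolution only sees a fixed local neighborhood per layer.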
Taxonomy
Topics: Face recognition and analysis
Methods: Attention Is All You Need · Adam · Residual Connection · Dropout · Softmax · Byte Pair Encoding · Linear Layer · Absolute Position Encodings · Vision Transformer · Multi-Head Attention
