Face Presentation Attack Detection via Content-Adaptive Spatial Operators
Shujaat Khan

TL;DR
This paper introduces CASO-PAD, a lightweight, content-adaptive spatial operator-based model for face presentation attack detection that achieves high accuracy and robustness on multiple benchmarks using only single RGB images.
Contribution
It proposes a novel, efficient spatial operator that enhances MobileNetV3 for improved spoof cue localization without extra sensors or temporal data.
Findings
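Achieves 100/100/98.9/99.7% test accuracy and AUC of 1.00/1.00/0.9995/0.9999 on Replay-Attack, Replay-Mobile, ROSE-Youtu, and OULU-NPU.
Demonstrates robustness on the large-scale SiW-Mv2 benchmark.
Maintains a low computational cost (3.6M parameters; 0.64 GFLOPs) suitable for mobile devices.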
Abstract
Face presentation attack detection (FacePAD) is critical for securing facial authentication against print, replay, and mask-based spoofing. This paper proposes CASO-PAD, an RGB-only, single-frame model that enhances MobileNetV3 with content-adaptive spatial operators (involution) to better capture localized spoof cues. Unlike spatially shared convolution kernels, the proposed operator generates location-specific, channel-shared kernels conditioned on the input, improving spatial selectivity with minimal overhead. CASO-PAD remains lightweight (3.6M parameters; 0.64 GFLOPs at ) and is trained end-to-end using a standard binary cross-entropy objective. Extensive experiments on Replay-Attack, Replay-Mobile, ROSE-Youtu, and OULU-NPU demonstrate strong performance, achieving 100/100/98.9/99.7% test accuracy, AUC of 1.00/1.00/0.9995/0.9999, and HTER of 0.00/0.00/0.82/0.44%,…
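The contrast the abstract draws between a spatially shared convolution kernel and a location-specific, channel-shared kernel can be illustrated with a small NumPy sketch of a generic involution layer. This is not the CASO-PAD implementation: the function name `involution2d`, the two-layer bottleneck kernel generator, the single kernel group, the 3×3 kernel size, and the random weights are all assumptions made for illustration, and normalization layers are omitted.

```python
import numpy as np


def involution2d(x, w_reduce, w_span, kernel_size=3):
    """Illustrative involution over one feature map (hypothetical helper).

    x:        (C, H, W) input features
    w_reduce: (C_r, C) weights of a 1x1 bottleneck projection (assumed)
    w_span:   (K*K, C_r) weights mapping the bottleneck to one KxK kernel
    Returns an array of shape (C, H, W).
    """
    c, h, w = x.shape
    k = kernel_size
    pad = k // 2

    # Kernel generation: each pixel's own feature vector produces its own
    # KxK kernel, so the kernel is conditioned on the input content.
    hidden = np.maximum(np.einsum("rc,chw->rhw", w_reduce, x), 0.0)  # ReLU
    kernels = np.einsum("kr,rhw->khw", w_span, hidden)  # (K*K, H, W)

    # Apply the per-pixel kernel to the zero-padded neighbourhood.
    # The same kernel is shared by all C channels (a single group).
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    out = np.zeros((c, h, w), dtype=float)
    for u in range(k):
        for v in range(k):
            out += kernels[u * k + v] * xp[:, u:u + h, v:v + w]
    return out


# Usage with random placeholder weights (not trained values).
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 16, 16))
w_reduce = rng.standard_normal((2, 8)) * 0.1
w_span = rng.standard_normal((9, 2)) * 0.1
y = involution2d(x, w_reduce, w_span)
print(y.shape)  # (8, 16, 16)
```

In this sketch the generator has 2·8 + 9·2 weights regardless of how many output channels a convolution would need, which hints at why such an operator can add spatial selectivity at little parameter cost. How the operator is placed inside MobileNetV3, and which group count or reduction ratio is used, is not stated in this summary.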
Taxonomy
Topics: Biometric Identification and Security · Face recognition and analysis · User Authentication and Security Systems
