Multi-Scale Wavelet Transformer for Face Forgery Detection
Jie Liu, Jingjing Wang, Peng Zhang, Chunmao Wang, Di Xie, Shiliang Pu

TL;DR
This paper introduces a multi-scale wavelet transformer framework that enhances face forgery detection by integrating multi-frequency and multi-scale features with novel attention mechanisms, improving cross-dataset performance.
Contribution
It proposes a multi-scale wavelet transformer with frequency-based spatial and cross-modality attention modules for improved face forgery detection.
Findings
Effective in cross-dataset scenarios
Outperforms existing methods in accuracy
Efficient fusion of spatial and frequency features
Abstract
Many current face forgery detection methods aggregate spatial and frequency features to improve generalization and achieve promising performance in the cross-dataset scenario. However, these methods leverage only one level of frequency information, which limits their expressive ability. To overcome this limitation, we propose a multi-scale wavelet transformer framework for face forgery detection. Specifically, to take full advantage of the multi-scale and multi-frequency wavelet representation, we gradually aggregate the multi-scale wavelet representation at different stages of the backbone network. To better fuse the frequency features with the spatial features, frequency-based spatial attention is designed to guide the spatial feature extractor to concentrate more on forgery traces. Meanwhile, cross-modality attention is proposed to fuse the frequency features with the…
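To make the idea of a multi-scale, multi-frequency wavelet representation concrete, here is a minimal NumPy sketch. It is not the authors' implementation. The abstract does not name a wavelet family, so the Haar wavelet is an assumption, and the function names (`haar_dwt2`, `multiscale_haar`, `frequency_guided_attention`) are hypothetical. The attention gate is a hand-written stand-in for the learned frequency-based spatial attention the abstract mentions: it only shows how high-frequency energy could re-weight a spatial feature map of the same resolution.

```python
import numpy as np

def haar_dwt2(x):
    """One level of an orthonormal 2D Haar transform on an (H, W) array.

    H and W must be even. Returns the low-frequency band and a tuple of
    three high-frequency bands (sub-band naming conventions vary).
    """
    a = x[0::2, 0::2]  # top-left sample of each 2x2 block
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0  # local average (low frequency)
    lh = (a + b - c - d) / 2.0  # difference between rows
    hl = (a - b + c - d) / 2.0  # difference between columns
    hh = (a - b - c + d) / 2.0  # diagonal difference
    return ll, (lh, hl, hh)

def multiscale_haar(x, levels=3):
    """Apply haar_dwt2 repeatedly to the low-frequency band.

    Returns a list with one (ll, (lh, hl, hh)) entry per scale; each
    scale halves the spatial resolution of the previous one.
    """
    ll = np.asarray(x, dtype=np.float64)
    pyramid = []
    for _ in range(levels):
        if ll.shape[0] % 2 or ll.shape[1] % 2:
            raise ValueError("height and width must be even at every level")
        ll, highs = haar_dwt2(ll)
        pyramid.append((ll, highs))
    return pyramid

def frequency_guided_attention(spatial_feat, high_bands):
    """Illustrative (non-learned) gate: an assumption, not the paper's module.

    Sums the magnitudes of the high-frequency bands, centers them, and
    squashes them with a sigmoid to get a per-pixel weight in (0, 1),
    which then scales the spatial feature map.
    """
    energy = sum(np.abs(band) for band in high_bands)
    gate = 1.0 / (1.0 + np.exp(-(energy - energy.mean())))
    return spatial_feat * gate
```

In a network following the abstract's description, the sub-bands from scale 1, 2, 3, ... would be fed into successively deeper backbone stages whose feature maps have matching resolution; in this sketch `multiscale_haar(image, levels=3)` simply returns those per-scale sub-bands for a single-channel image.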
Taxonomy
Topics: Face recognition and analysis · Biometric Identification and Security · Digital Media Forensic Detection
