Exploring Active Data Selection Strategies for Continuous Training in Deepfake Detection
Yoshihiko Furuhashi, Junichi Yamagishi, Xin Wang, Huy H. Nguyen, Isao Echizen

TL;DR
This paper introduces an active data selection method for continuous deepfake detection training, which efficiently improves model performance by selecting minimal new data from a large pool based on confidence scores.
Contribution
It proposes an automatic data selection strategy that enhances deepfake detection models during continuous training with minimal additional data.
Findings
Achieved an equal error rate (EER) of 2.5% using only 15% of the pool data.
Significantly improved detection performance with small, automatically selected data.
Efficiently updates models using confidence-based data selection.
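For readers unfamiliar with the metric, the sketch below shows one common way to approximate an equal error rate from detector scores: sweep a threshold and report the point where the false-acceptance and false-rejection rates are closest. The function name `equal_error_rate`, the label convention (1 = fake, 0 = real), and the toy scores are assumptions made for illustration; this is not the paper's evaluation code.

```python
import numpy as np

def equal_error_rate(scores, labels):
    """Approximate the EER by sweeping thresholds over the observed scores.

    scores: detector outputs, higher = more likely fake (assumed convention).
    labels: 1 = fake, 0 = real (assumed convention).
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    best_gap, eer = np.inf, 1.0
    for t in np.unique(scores):
        pred_fake = scores >= t
        far = np.mean(pred_fake[labels == 0])    # real images flagged as fake
        frr = np.mean(~pred_fake[labels == 1])   # fake images that were missed
        gap = abs(far - frr)
        if gap < best_gap:
            # Keep the threshold where the two error rates are closest.
            best_gap, eer = gap, (far + frr) / 2
    return eer

# Toy data, invented for this example only.
toy_scores = [0.1, 0.2, 0.3, 0.6, 0.4, 0.7, 0.8, 0.9]
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]
toy_eer = equal_error_rate(toy_scores, toy_labels)
print(f"EER = {toy_eer:.2f}")
```

On real data the two error curves rarely cross exactly at an observed score, so evaluation toolkits typically interpolate between thresholds; the sweep above is only a coarse approximation.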
Abstract
In deepfake detection, it is essential to maintain high performance by adjusting the parameters of the detector as new deepfake methods emerge. In this paper, we propose a method to automatically and actively select the small amount of additional data required for the continuous training of deepfake detection models in situations where deepfake detection models are regularly updated. The proposed method automatically selects new training data from a redundant pool set containing a large number of images generated by new deepfake methods and real images, using the confidence score of the deepfake detection model as a metric. Experimental results show that the deepfake detection model, continuously trained with a small amount of additional data automatically selected and added to the original training set, significantly and efficiently improved the detection performance,…
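The selection step described in the abstract can be sketched roughly as follows. The abstract says only that the detector's confidence score is used as the metric; the specific rule here (keep the pool samples whose fake-probability lies closest to the 0.5 decision boundary), the function name `select_low_confidence`, and the toy scores are assumptions for illustration and may differ from the strategies the paper actually evaluates.

```python
import numpy as np

def select_low_confidence(scores, fraction=0.15):
    """Return indices of the pool samples the detector is least sure about.

    scores: fake-probabilities in [0, 1], one per pool sample (hypothetical input).
    fraction: share of the pool to add to the training set.
    """
    scores = np.asarray(scores, dtype=float)
    # Assumed definition: confidence is the distance from the 0.5 boundary.
    confidence = np.abs(scores - 0.5)
    k = max(1, int(round(fraction * len(scores))))
    # Stable sort so ties keep their original pool order.
    return np.argsort(confidence, kind="stable")[:k]

# Toy pool scores, invented for this example only.
pool_scores = np.array([0.98, 0.51, 0.03, 0.45, 0.70, 0.99, 0.10, 0.55, 0.90, 0.02])
selected = select_low_confidence(pool_scores, fraction=0.3)
print(selected)
```

In a continuous-training loop, the selected samples would be labeled, appended to the original training set, and the detector retrained; the rest of the pool is left untouched.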
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Generative Adversarial Networks and Image Synthesis · Digital Media Forensic Detection
Methods: Sparse Evolutionary Training
