Tutor-Student Reinforcement Learning: A Dynamic Curriculum for Robust Deepfake Detection

Zhanhe Lei; Zhongyuan Wang; Jikang Cheng; Baojin Huang; Yuhong Yang; Zhen Han; Chao Liang; Dengpan Ye

arXiv:2603.24139 · cs.CV · May 21, 2026

TL;DR

This paper introduces a Tutor-Student Reinforcement Learning (TSRL) framework that dynamically optimizes the training curriculum for deepfake detection, improving generalization and robustness.

Contribution

It proposes an RL-based curriculum learning method in which a Tutor agent guides a deepfake detector (the Student) during training, an approach the authors present as new to this domain.

Findings

01 The adaptive curriculum improves detection accuracy on unseen deepfake techniques.

02 The Tutor effectively prioritizes hard-but-learnable samples during training.

03 The code is publicly available on GitHub (wannac1/TSRL).

Abstract

Standard supervised training for deepfake detection treats all samples with uniform importance, which can be suboptimal for learning robust and generalizable features. In this work, we propose a novel Tutor-Student Reinforcement Learning (TSRL) framework to dynamically optimize the training curriculum. Our method models the training process as a Markov Decision Process where a "Tutor" agent learns to guide a "Student" (the deepfake detector). The Tutor, implemented as a Proximal Policy Optimization (PPO) agent, observes a rich state representation for each training sample, encapsulating not only its visual features but also its historical learning dynamics, such as EMA loss and forgetting counts. Based on this state, the Tutor takes an action by assigning a continuous weight (0-1) to the sample's loss, thereby dynamically re-weighting the training batch. The Tutor is rewarded based…
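
To make the re-weighting idea concrete, the following is a minimal sketch, not the authors' implementation. It tracks the two learning-dynamics signals the abstract names (EMA loss and forgetting counts) and applies per-sample weights in [0, 1] to a batch of losses. Everything else is an assumption made for illustration: the names `SampleHistory`, `DynamicsTracker`, and `weighted_batch_loss`, the EMA decay of 0.9, counting a forgetting event as a correct-to-incorrect transition, and averaging the weighted losses over the batch size. In the paper the weights come from the PPO Tutor; here they are simply passed in.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SampleHistory:
    """Per-sample learning dynamics (hypothetical structure)."""
    ema_loss: float = 0.0
    seen: bool = False
    was_correct: bool = False
    forgetting_count: int = 0


@dataclass
class DynamicsTracker:
    ema_decay: float = 0.9  # assumed value, not taken from the paper
    history: Dict[int, SampleHistory] = field(default_factory=dict)

    def update(self, sample_id: int, loss: float, correct: bool) -> SampleHistory:
        h = self.history.setdefault(sample_id, SampleHistory())
        if not h.seen:
            # First observation initializes the moving average.
            h.ema_loss = loss
            h.seen = True
        else:
            h.ema_loss = self.ema_decay * h.ema_loss + (1.0 - self.ema_decay) * loss
        # Assumed definition: a forgetting event is a sample that was
        # classified correctly at its previous visit and is now wrong.
        if h.was_correct and not correct:
            h.forgetting_count += 1
        h.was_correct = correct
        return h


def weighted_batch_loss(losses: List[float], weights: List[float]) -> float:
    """Re-weight per-sample losses with weights clipped to [0, 1].

    Averaging over the batch size is an assumption; other
    normalizations (e.g. by the sum of weights) are possible.
    """
    if len(losses) != len(weights):
        raise ValueError("losses and weights must have the same length")
    if not losses:
        raise ValueError("batch must not be empty")
    clipped = [min(1.0, max(0.0, w)) for w in weights]
    return sum(w * l for w, l in zip(clipped, losses)) / len(losses)
```

In a training loop, `update` would be called once per sample per step, and the resulting `ema_loss` and `forgetting_count` would form part of the state the Tutor observes before choosing that sample's weight.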

Code & Models

Repositories

wannac1/TSRL (GitHub)

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Advanced Neural Network Applications · Adversarial Robustness in Machine Learning