FMVP: Masked Flow Matching for Adversarial Video Purification

Duoxun Tang; Xueyi Zhang; Chak Hin Wang; Xi Xiao; Dasen Dai; Xinhang Jiang; Wentao Shi; Rui Li; Qing Li

arXiv:2601.02228·cs.CV·January 13, 2026

TL;DR

FMVP introduces a novel video purification method that physically shatters adversarial structures and reconstructs clean videos using flow matching and frequency gating, significantly improving robustness against various attacks.

Contribution

The paper proposes FMVP, a new adversarial video purification technique combining physical shattering, conditional flow matching, and frequency gating, with training paradigms for known and unknown threats.

Findings

01. Outperforms state-of-the-art methods in robustness against PGD and CW attacks.

02. Achieves over 87% robust accuracy against PGD and 89% against CW.

03. Effective as a zero-shot adversarial detector with high AUC-ROC scores.

Abstract

Video recognition models remain vulnerable to adversarial attacks, while existing diffusion-based purification methods suffer from inefficient sampling and curved trajectories. Directly regressing clean videos from adversarial inputs often fails to recover faithful content because the perturbations are subtle; this necessitates physically shattering the adversarial structure. We therefore propose Flow Matching for Adversarial Video Purification (FMVP). FMVP physically shatters global adversarial structures via a masking strategy and reconstructs clean video dynamics using Conditional Flow Matching (CFM) with an inpainting objective. To further decouple semantic content from adversarial noise, we design a Frequency-Gated Loss (FGL) that explicitly suppresses high-frequency adversarial residuals while preserving low-frequency fidelity. We design Attack-Aware and Generalist training…
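The abstract names three ingredients: masking, CFM with an inpainting objective, and a frequency-gated loss. The NumPy sketch below illustrates one plausible reading of each, not the authors' implementation. Everything concrete in it is an assumption: the patch-wise random mask, the linear interpolation path between Gaussian noise and the clean video (a common CFM choice), the radial FFT cutoff, the loss weights, and all function and parameter names (`random_patch_mask`, `cfm_training_pair`, `split_frequencies`, `frequency_gated_loss`, `cutoff`, `w_low`, `w_high`).

```python
import numpy as np

def random_patch_mask(shape, patch=4, keep_ratio=0.5, rng=None):
    """Binary mask over a (T, H, W) clip: 1 = keep, 0 = masked out.
    Hypothetical masking scheme; H and W must be divisible by `patch`."""
    rng = np.random.default_rng() if rng is None else rng
    t, h, w = shape
    grid = (rng.random((t, h // patch, w // patch)) < keep_ratio).astype(np.float32)
    # Upsample the coarse grid so each cell covers a patch x patch block.
    return grid.repeat(patch, axis=1).repeat(patch, axis=2)

def cfm_training_pair(x_clean, x_adv, mask, rng):
    """Build one CFM training example with a linear path (assumed).
    A velocity network would take (x_t, t, cond) and regress target_v."""
    x0 = rng.standard_normal(x_clean.shape).astype(np.float32)  # noise endpoint
    t = rng.random()                                            # time in [0, 1)
    x_t = (1.0 - t) * x0 + t * x_clean                          # point on the path
    target_v = x_clean - x0                                     # constant velocity of that path
    cond = mask * x_adv                                         # masked adversarial clip as condition
    return x_t, t, cond, target_v

def split_frequencies(x, cutoff=0.25):
    """Split each frame into low- and high-frequency parts with a radial
    low-pass filter in the 2D FFT domain (cutoff in cycles per pixel)."""
    h, w = x.shape[-2:]
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    low_pass = np.sqrt(fy ** 2 + fx ** 2) <= cutoff
    spec = np.fft.fft2(x, axes=(-2, -1))
    low = np.fft.ifft2(spec * low_pass, axes=(-2, -1)).real
    return low, x - low

def frequency_gated_loss(pred, target, cutoff=0.25, w_low=1.0, w_high=0.1):
    """Weighted squared error on the low- and high-frequency residual bands.
    The weights are placeholders, not values from the paper."""
    r_low, r_high = split_frequencies(pred - target, cutoff)
    return w_low * np.mean(r_low ** 2) + w_high * np.mean(r_high ** 2)
```

In a training loop under these assumptions, the regression target would be `target_v`, and `frequency_gated_loss` would be applied to the reconstructed clip against the clean clip. The actual masking pattern, probability path, and gating function used by FMVP are not specified in the text above.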

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Generative Adversarial Networks and Image Synthesis · Human Pose and Action Recognition