ReToMe-VA: Recursive Token Merging for Video Diffusion-based Unrestricted Adversarial Attack

Ziyi Gao; Kai Chen; Zhipeng Wei; Tingshu Mou; Jingjing Chen; Zhiyu Tan; Hao Li; Yu-Gang Jiang

arXiv:2408.05479 · cs.CV · August 13, 2024

TL;DR

ReToMe-VA is a framework for generating imperceptible, highly transferable adversarial video clips. It combines recursive token merging with latent-space optimization inside a diffusion model.

Contribution

The paper introduces the first recursive token merging mechanism for temporally consistent adversarial videos, together with a Timestep-wise Adversarial Latent Optimization (TALO) strategy that improves imperceptibility and transferability.

Findings

01. Outperforms state-of-the-art attacks in adversarial transferability by over 14%.

02. Achieves high imperceptibility in both the spatial and temporal domains.

03. Demonstrates effectiveness across various video datasets.

Abstract

Recent diffusion-based unrestricted attacks generate imperceptible adversarial examples with higher transferability than previous unrestricted and restricted attacks. However, existing work on diffusion-based unrestricted attacks focuses mostly on images and has seldom been explored for videos. In this paper, we propose Recursive Token Merging for Video Diffusion-based Unrestricted Adversarial Attack (ReToMe-VA), the first framework to generate imperceptible adversarial video clips with higher transferability. Specifically, to achieve spatial imperceptibility, ReToMe-VA adopts a Timestep-wise Adversarial Latent Optimization (TALO) strategy that optimizes perturbations in the diffusion model's latent space at each denoising step. TALO offers iterative and accurate updates to generate more powerful adversarial frames. TALO can further reduce memory consumption in…
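
The abstract's description of TALO, perturbing the latent at each denoising step rather than once at the end, can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the paper's implementation: `toy_denoise_step` is a hypothetical stand-in for a real diffusion denoiser, the linear classifier `W` stands in for a surrogate video model, and the signed-gradient update with a fixed `step_size` is a generic choice.

```python
import numpy as np

def toy_denoise_step(z, t, num_steps):
    # Hypothetical stand-in for one diffusion denoising step:
    # it simply shrinks the latent a little, more at early (large t) steps.
    return z * (1.0 - 0.05 * t / num_steps)

def ce_loss_and_grad(z, W, label):
    # Cross-entropy of a toy linear classifier (logits = W @ z)
    # and its analytic gradient with respect to the latent z.
    logits = W @ z
    m = logits.max()
    lse = m + np.log(np.exp(logits - m).sum())  # stable log-sum-exp
    loss = lse - logits[label]
    p = np.exp(logits - lse)                    # softmax probabilities
    onehot = np.zeros_like(p)
    onehot[label] = 1.0
    grad = W.T @ (p - onehot)
    return loss, grad

def timestep_wise_attack(z_T, W, label, num_steps=10, step_size=0.01):
    # Interleave denoising with a small adversarial update of the latent,
    # so the perturbation is refined at every timestep (assumed scheme).
    z = z_T.copy()
    history = []
    for t in range(num_steps, 0, -1):
        z = toy_denoise_step(z, t, num_steps)
        loss_before, grad = ce_loss_and_grad(z, W, label)
        z = z + step_size * np.sign(grad)       # ascend the classifier loss
        loss_after, _ = ce_loss_and_grad(z, W, label)
        history.append((loss_before, loss_after))
    return z, history

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 8))      # toy surrogate classifier: 3 classes, 8-dim latent
z_T = rng.normal(size=8)         # toy initial noisy latent
z_adv, history = timestep_wise_attack(z_T, W, label=0, num_steps=5)
```

Because each update only needs the gradient at the current step, nothing has to be backpropagated through the whole denoising chain in this sketch. That is one plausible reading of why a per-timestep scheme could lower memory use, though the abstract is truncated before it explains this.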

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Digital Media Forensic Detection · Generative Adversarial Networks and Image Synthesis

Methods: Diffusion