Towards Understanding Unsafe Video Generation
Yan Pang, Aiping Xiong, Yang Zhang, Tianhao Wang

TL;DR
This paper investigates whether video generation models (VGMs) can produce unsafe content, builds a new dataset of unsafe generated videos, and proposes a defense mechanism, Latent Variable Defense (LVD), to mitigate such risks.
Contribution
It provides the first dataset of unsafe videos generated by VGMs and introduces Latent Variable Defense, a new method to prevent unsafe video generation within the model's sampling process.
Findings
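Generated 5607 videos with three open-source VGMs using prompts collected from 4chan and Lexica; 2112 unsafe videos remained after filtering out duplicates and poorly generated content.
Identified five unsafe video categories through clustering and thematic coding analysis: Distorted/Weird, Terrifying, Pornographic, Violent/Bloody, and Political.
The proposed LVD achieves 0.90 defense accuracy with a 10x gain in efficiency.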
Abstract
Video generation models (VGMs) have demonstrated the capability to synthesize high-quality output. It is important to understand their potential to produce unsafe content, such as violent or terrifying videos. In this work, we provide a comprehensive understanding of unsafe video generation. First, to confirm the possibility that these models could indeed generate unsafe videos, we choose unsafe content generation prompts collected from 4chan and Lexica, and three open-source SOTA VGMs to generate unsafe videos. After filtering out duplicates and poorly generated content, we created an initial set of 2112 unsafe videos from an original pool of 5607 videos. Through clustering and thematic coding analysis of these generated videos, we identify 5 unsafe video categories: Distorted/Weird, Terrifying, Pornographic, Violent/Bloody, and Political. With IRB approval, we then recruit online…
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Adversarial Robustness in Machine Learning · Video Analysis and Summarization
Methods: Sparse Evolutionary Training · Focus
