Latent Space Probing for Adult Content Detection in Video Generative Models
Alizishaan Khatri, Chiquita Prabhu

TL;DR
This paper introduces a latent space probing framework for real-time adult content detection in video generative models, leveraging internal representations to improve accuracy and efficiency.
Contribution
It proposes a novel latent space probing method with lightweight classifiers, achieving high detection accuracy and low computational overhead.
Findings
The method achieves a 97.29% F1 score on the authors' dataset.
Latent signals encode strong discriminative features.
Probing the latent space improves detection performance and reduces cost.
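For readers less familiar with the metric reported above, F1 is the harmonic mean of precision and recall. The snippet below is a minimal sketch of that formula; the function name `f1_score` and the confusion counts in the usage line are illustrative assumptions, not numbers from the paper.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall from raw confusion counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# Hypothetical counts, chosen only to illustrate the arithmetic.
example = f1_score(tp=90, fp=10, fn=10)
```

Because F1 ignores true negatives, it is a common choice when the violating class is the one that matters most.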
Abstract
The rapid proliferation of AI-powered video generation systems has introduced significant challenges in content moderation, particularly with respect to adult and sexually explicit material. Existing detection methods operate on either prompts or decoded pixel-space outputs. As a result, both approaches are blind to the rich internal representations formed during generation. In this paper, we propose a novel latent space probing framework that intercepts the denoised latent representations produced by the CogVideoX video diffusion model during inference and attaches lightweight classifiers to perform real-time adult content detection. To support this work, we construct a large-scale binary dataset of 11,039 ten-second video clips (5,086 violating, 5,953 non-violating) sourced from adult websites and YouTube, respectively. We introduce two lightweight probing classifier architectures. We train…
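The general idea of attaching a lightweight classifier to latent representations can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's method: the abstract does not describe the two probe architectures in the visible text, so a plain logistic-regression probe on channel-averaged latents stands in for them, and the latents here are synthetic random tensors rather than real CogVideoX outputs. All function names (`pool_latent`, `train_probe`, `predict`, `fake_latent`) are hypothetical.

```python
import math
import random


def pool_latent(latent):
    """Average a latent shaped [frames][channels][height][width] to one value per channel."""
    n_channels = len(latent[0])
    sums = [0.0] * n_channels
    count = 0  # number of values averaged per channel (frames * height * width)
    for frame in latent:
        for c, plane in enumerate(frame):
            for row in plane:
                sums[c] += sum(row)
        count += len(frame[0]) * len(frame[0][0])
    return [s / count for s in sums]


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(w, b, x):
    """Probability that the pooled latent x belongs to the positive class."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def train_probe(features, labels, lr=0.5, epochs=200):
    """Fit a logistic-regression probe with plain batch gradient descent."""
    dim = len(features[0])
    w, b = [0.0] * dim, 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(features, labels):
            err = predict(w, b, x) - y
            for i in range(dim):
                grad_w[i] += err * x[i]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def fake_latent(rng, shift, frames=2, channels=4, height=3, width=3):
    """Synthetic stand-in for a denoised latent; NOT real CogVideoX output."""
    return [[[[rng.gauss(shift if c == 0 else 0.0, 1.0) for _ in range(width)]
              for _ in range(height)]
             for c in range(channels)]
            for _ in range(frames)]


# Toy data: the class signal is planted in channel 0 purely for illustration.
rng = random.Random(0)
labels = [i % 2 for i in range(40)]
latents = [fake_latent(rng, shift=2.0 if y else -2.0) for y in labels]
features = [pool_latent(z) for z in latents]
w, b = train_probe(features, labels)
```

In a real pipeline the latents would be captured from the diffusion model during inference instead of generated randomly; the point of the sketch is only that the probe itself can be very small relative to the generator.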
