Safe-Sora: Safe Text-to-Video Generation via Graphical Watermarking

Zihan Su; Xuerui Qiu; Hongbin Xu; Tangyu Jiang; Junhao Zhuang; Chun Yuan; Ming Li; Shengfeng He; Fei Richard Yu

arXiv:2505.12667 · cs.CV · September 23, 2025

TL;DR

Safe-Sora introduces a novel graphical watermarking framework for text-to-video generation, ensuring copyright protection while maintaining high video quality and robustness through hierarchical matching and spatiotemporal modeling.

Contribution

Safe-Sora is the first framework to embed graphical watermarks directly into the video generation process, using hierarchical matching and state space models to improve robustness.

Findings

01 Achieves state-of-the-art watermark robustness and fidelity.

02 Maintains high video quality with embedded watermarks.

03 Demonstrates effective long-range dependency modeling in watermarking.

Abstract

The explosive growth of generative video models has amplified the demand for reliable copyright preservation of AI-generated content. Despite its popularity in image synthesis, invisible generative watermarking remains largely underexplored in video generation. To address this gap, we propose Safe-Sora, the first framework to embed graphical watermarks directly into the video generation process. Motivated by the observation that watermarking performance is closely tied to the visual similarity between the watermark and cover content, we introduce a hierarchical coarse-to-fine adaptive matching mechanism. Specifically, the watermark image is divided into patches, each assigned to the most visually similar video frame, and further localized to the optimal spatial region for seamless embedding. To enable spatiotemporal fusion of watermark patches across video frames, we develop a 3D…
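
The coarse-to-fine matching described in the abstract can be illustrated with a minimal sketch. It assumes grayscale images stored as nested lists and uses two stand-in similarity measures that are not taken from the paper: mean-intensity distance for the coarse frame assignment, and sum of absolute differences (SAD) for the fine spatial localization. The names `split_into_patches` and `match_patch` are hypothetical, and the paper's actual similarity measure and embedding network are not reproduced here.

```python
from typing import List, Tuple

Image = List[List[float]]  # grayscale image as rows of pixel intensities


def split_into_patches(img: Image, size: int) -> List[Image]:
    """Cut an image into non-overlapping size x size patches, row-major."""
    h, w = len(img), len(img[0])
    return [
        [row[x:x + size] for row in img[y:y + size]]
        for y in range(0, h - size + 1, size)
        for x in range(0, w - size + 1, size)
    ]


def mean(img: Image) -> float:
    """Average pixel intensity of an image."""
    return sum(map(sum, img)) / (len(img) * len(img[0]))


def sad(patch: Image, frame: Image, top: int, left: int) -> float:
    """Sum of absolute differences between a patch and a frame window."""
    return sum(
        abs(p - frame[top + i][left + j])
        for i, row in enumerate(patch)
        for j, p in enumerate(row)
    )


def match_patch(patch: Image, frames: List[Image]) -> Tuple[int, int, int]:
    """Return (frame index, top, left) for one square watermark patch."""
    # Coarse step (assumed measure): frame with the closest mean intensity.
    patch_mean = mean(patch)
    t = min(range(len(frames)), key=lambda k: abs(mean(frames[k]) - patch_mean))

    # Fine step (assumed measure): window in that frame with the lowest SAD.
    frame, size = frames[t], len(patch)
    positions = [
        (y, x)
        for y in range(len(frame) - size + 1)
        for x in range(len(frame[0]) - size + 1)
    ]
    top, left = min(positions, key=lambda p: sad(patch, frame, p[0], p[1]))
    return t, top, left


if __name__ == "__main__":
    dark = [[0.0] * 4 for _ in range(4)]
    mixed = [[0.5] * 4 for _ in range(4)]
    for i in (2, 3):
        for j in (1, 2):
            mixed[i][j] = 1.0  # bright 2x2 region at row 2, column 1
    bright_patch = [[1.0, 1.0], [1.0, 1.0]]
    print(match_patch(bright_patch, [dark, mixed]))
```

In this toy setup the bright patch is assigned to the second frame and placed on its bright region. In practice each patch returned by `split_into_patches` would be matched independently, and a learned similarity would likely replace both hand-written measures.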

Taxonomy

Topics: Advanced Steganography and Watermarking Techniques · Generative Adversarial Networks and Image Synthesis · Image Enhancement Techniques

Methods: Mamba: Linear-Time Sequence Modeling with Selective State Spaces