Does Semantic Noise Initialization Transfer from Images to Videos? A Paired Diagnostic Study
Yixiao Jing, Chaoyu Zhang, Zixuan Zhong, Peizhou Huang

TL;DR
This study investigates whether semantic noise initialization improves text-to-video diffusion models, finding limited transferability and emphasizing the need for careful evaluation and diagnostics in this domain.
Contribution
The paper provides the first benchmark comparing semantic noise initialization with standard Gaussian noise in T2V models, showing limited benefit and proposing evaluation standards.
Findings
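Small positive trend on temporal-related dimensions, but not statistically significant (95 percent confidence interval includes zero, p ~ 0.17)
Overall score remains on par with the Gaussian-noise baseline
Analysis of the induced perturbations in noise space shows patterns consistent with a weak or unstable signal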
Abstract
Semantic noise initialization has been reported to improve robustness and controllability in image diffusion models. Whether these gains transfer to text-to-video (T2V) generation remains unclear, since temporal coupling can introduce extra degrees of freedom and instability. We benchmark semantic noise initialization against standard Gaussian noise using a frozen VideoCrafter-style T2V diffusion backbone and VBench on 100 prompts. Using prompt-level paired tests with bootstrap confidence intervals and a sign-flip permutation test, we observe a small positive trend on temporal-related dimensions; however, the 95 percent confidence interval includes zero (p ~ 0.17) and the overall score remains on par with the baseline. To understand this outcome, we analyze the induced perturbations in noise space and find patterns consistent with a weak or unstable signal. We recommend prompt-level…
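The prompt-level paired analysis named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names `paired_bootstrap_ci` and `sign_flip_p_value`, the resample counts, and the synthetic per-prompt differences are all assumptions made for illustration. Each entry of `diffs` stands for one prompt's score under semantic noise minus its score under Gaussian noise.

```python
import random
from statistics import fmean


def paired_bootstrap_ci(diffs, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean per-prompt difference (hypothetical helper)."""
    rng = random.Random(seed)
    n = len(diffs)
    # Resample prompts with replacement and record the mean difference each time.
    means = sorted(fmean(rng.choices(diffs, k=n)) for _ in range(n_boot))
    lo = means[int((alpha / 2) * n_boot)]
    hi = means[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi


def sign_flip_p_value(diffs, n_perm=2000, seed=0):
    """Two-sided sign-flip permutation p-value for a mean difference of zero (hypothetical helper)."""
    rng = random.Random(seed)
    observed = abs(fmean(diffs))
    hits = 0
    for _ in range(n_perm):
        # Under the null, each paired difference is equally likely to have either sign.
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        if abs(fmean(flipped)) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Synthetic stand-in for 100 per-prompt score differences (NOT the paper's data).
    data_rng = random.Random(42)
    diffs = [data_rng.gauss(0.005, 0.05) for _ in range(100)]
    lo, hi = paired_bootstrap_ci(diffs)
    p = sign_flip_p_value(diffs)
    print(f"mean diff = {fmean(diffs):.4f}, 95% CI = [{lo:.4f}, {hi:.4f}], p = {p:.3f}")
```

A confidence interval that straddles zero together with a large p-value corresponds to the outcome reported above: a positive mean difference that cannot be distinguished from no effect at this sample size.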
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Neural dynamics and brain function · Advanced Neuroimaging Techniques and Applications
