Zero-Shot Video Deraining with Video Diffusion Models
Tuomas Varanka, Juan Luis Gonzalez, Hyeongwoo Kim, Pablo Garrido, Xu Yao

TL;DR
This paper presents a novel zero-shot video deraining method that leverages pretrained text-to-video diffusion models, avoiding the need for synthetic data or fine-tuning, and effectively handles complex dynamic scenes.
Contribution
The paper introduces a zero-shot deraining approach that applies negative prompting and attention switching to a pretrained diffusion model, eliminating the need for training on paired datasets.
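As a rough illustration of negative prompting, the sketch below shows the standard classifier-free guidance update, in which the noise prediction for a negative prompt (for example "rain") takes the place of the unconditional prediction. This is the general formulation used by many diffusion pipelines and is only an assumption about how the idea applies here; it is not the authors' implementation, and attention switching is not shown. The function name `guided_noise` and the toy arrays are hypothetical.

```python
import numpy as np


def guided_noise(eps_positive, eps_negative, guidance_scale):
    """Classifier-free guidance with a negative prompt (illustrative only).

    eps_positive: noise predicted under the desired prompt.
    eps_negative: noise predicted under the negative prompt (e.g. "rain").
    guidance_scale: how strongly to move away from the negative prediction.
    """
    eps_positive = np.asarray(eps_positive, dtype=float)
    eps_negative = np.asarray(eps_negative, dtype=float)
    # Start at the negative-prompt prediction and extrapolate toward the
    # positive one; scales above 1 push past it, away from the negative concept.
    return eps_negative + guidance_scale * (eps_positive - eps_negative)


# Toy stand-ins for the two noise predictions of a denoiser (hypothetical values).
eps_pos = np.array([1.0, 0.0])
eps_neg = np.array([0.0, 1.0])
guided = guided_noise(eps_pos, eps_neg, guidance_scale=3.0)
```

With a scale of 1 the update simply returns the positive-prompt prediction; larger scales move the result further from whatever the negative prompt describes.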
Findings
Outperforms prior methods on real-world rain datasets
Maintains dynamic backgrounds and structural consistency
Demonstrates strong generalization without supervised training
Abstract
Existing video deraining methods are often trained on paired datasets that are either synthetic, which limits their ability to generalize to real-world rain, or captured by static cameras, which restricts their effectiveness in dynamic scenes with background and camera motion. Furthermore, recent work on fine-tuning diffusion models has shown promising results, but fine-tuning tends to weaken the generative prior, limiting generalization to unseen cases. In this paper, we introduce the first zero-shot video deraining method for complex dynamic scenes that requires neither synthetic data nor model fine-tuning, by leveraging a pretrained text-to-video diffusion model that demonstrates strong generalization capabilities. By inverting an input video into the latent space of the diffusion model, we can intervene in its reconstruction process and push it away from the model's concept of rain using…
Taxonomy
Topics: Image Enhancement Techniques · Generative Adversarial Networks and Image Synthesis · Visual Attention and Saliency Detection
