Sequence-Adaptive Video Prediction in Continuous Streams using Diffusion Noise Optimization
Sina Mokhtarzadeh Azar, Emad Bahrami, Enrico Pallotta, Gianpiero Francesca, Radu Timofte, Juergen Gall

TL;DR
This paper introduces SAVi-DNO, a method that adaptively refines diffusion noise during inference to improve continuous video prediction without costly model fine-tuning. The method is validated on multiple datasets.
Contribution
Proposes a novel diffusion noise optimization technique for adaptive video prediction that keeps the model parameters frozen while improving prediction quality during inference.
Findings
Improved FVD, SSIM, and PSNR metrics on Ego4D and OpenDV-YouTube datasets.
Effective adaptation in long continuous videos without parameter fine-tuning.
Validated on diverse datasets including UCF-101 and SkyTimelapse.
Abstract
In this work, we investigate diffusion-based video prediction models, which forecast future video frames, for continuous video streams. In this context, the models continuously observe new training samples, and we aim to leverage this to improve their predictions. We thus propose an approach that continuously adapts a pre-trained diffusion model to a video stream. Since fine-tuning the parameters of a large diffusion model is too expensive, we refine the diffusion noise during inference while keeping the model parameters frozen, allowing the model to adaptively determine suitable sampling noise. We term the approach Sequence Adaptive Video Prediction with Diffusion Noise Optimization (SAVi-DNO). To validate our approach, we introduce a new evaluation setting on the Ego4D dataset, focusing on simultaneous adaptation and evaluation on long continuous videos. Empirical results demonstrate…
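The core idea in the abstract, adapting the sampling noise on a stream while the model stays frozen, can be illustrated with a deliberately tiny sketch. This is not the authors' implementation. The "model" here is a hypothetical linear predictor with a fixed weight, the stream is a synthetic sequence with a constant offset the model does not know, and the names `frozen_predictor` and `run_stream` are assumptions made for illustration. The sketch only shows the pattern: predict, observe the true next frame, then take a gradient step on the noise rather than on the parameters.

```python
import random

def frozen_predictor(context, noise, weight=0.5):
    """Toy stand-in for a frozen sampler: `weight` is never updated."""
    return [weight * c + z for c, z in zip(context, noise)]

def squared_error(pred, target):
    return sum((p - t) ** 2 for p, t in zip(pred, target))

def run_stream(num_steps=40, dim=4, lr=0.25, adapt=True, seed=0):
    """Predict each next frame of a synthetic stream; return per-step errors."""
    rng = random.Random(seed)
    offset = [3.0] * dim  # sequence-specific shift unknown to the frozen model
    frame = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    noise = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    errors = []
    for _ in range(num_steps):
        # Ground-truth next frame of the toy stream (assumed dynamics).
        next_frame = [0.5 * f + o for f, o in zip(frame, offset)]
        if not adapt:
            # Baseline: draw fresh Gaussian noise for every prediction.
            noise = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        pred = frozen_predictor(frame, noise)
        errors.append(squared_error(pred, next_frame))
        if adapt:
            # Gradient of the squared error w.r.t. the noise is 2 * (pred - target);
            # only the noise is updated and carried to the next step.
            noise = [z - lr * 2.0 * (p - t)
                     for z, p, t in zip(noise, pred, next_frame)]
        frame = next_frame
    return errors

adapted = run_stream(adapt=True)
baseline = run_stream(adapt=False)
print(f"last-step error, adapted noise: {adapted[-1]:.3e}")
print(f"last-step error, fresh noise:   {baseline[-1]:.3e}")
```

In this toy setting the optimized noise absorbs the sequence-specific offset, so the adapted error shrinks over the stream while the fresh-noise baseline does not improve. A real diffusion model would require backpropagating through the sampling process, which this sketch does not attempt.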
Taxonomy
Topics: Image and Video Quality Assessment · Domain Adaptation and Few-Shot Learning · Generative Adversarial Networks and Image Synthesis
