Improving Temporal Consistency and Fidelity at Inference-time in Perceptual Video Restoration by Zero-shot Image-based Diffusion Models

Nasrin Rahimi; A. Murat Tekalp

arXiv:2510.25420·eess.IV·October 30, 2025

TL;DR

This paper introduces inference-time strategies that improve temporal consistency and fidelity in zero-shot perceptual video restoration with image-based diffusion models. They require no retraining and work by guiding the denoising process and by ensembling sampling trajectories.

Contribution

It proposes two inference-time techniques, Perceptual Straightening Guidance (PSG) and MPES, that improve temporal coherence and fidelity in video restoration with pretrained diffusion models.

Findings

01

PSG improves temporal perceptual scores such as Fréchet Video Distance (FVD) and perceptual straightness.

02

MPES enhances fidelity metrics such as PSNR and SSIM.

03

Combined, these methods enable stable, high-quality video restoration without retraining.
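
As a rough illustration of why ensembling can raise fidelity metrics such as PSNR, the sketch below averages several independently perturbed estimates of the same frame. It is not the paper's MPES procedure: the synthetic frame, the Gaussian perturbation standing in for sampling stochasticity, and the names `psnr`, `clean`, and `samples` are all assumptions made for this example.

```python
import numpy as np

def psnr(reference, estimate, peak=1.0):
    """Peak signal-to-noise ratio in dB for images scaled to [0, peak]."""
    mse = np.mean((reference - estimate) ** 2)
    return float(10.0 * np.log10(peak ** 2 / mse))

# Hypothetical setup: one clean frame and 8 stochastic "restorations",
# each modeled as the clean frame plus independent Gaussian noise.
rng = np.random.default_rng(0)
clean = rng.random((64, 64))
samples = [clean + rng.normal(0.0, 0.1, clean.shape) for _ in range(8)]

single_psnr = psnr(clean, samples[0])
# Averaging independent errors shrinks their variance, so MSE drops.
ensemble_psnr = psnr(clean, np.mean(samples, axis=0))

print(f"single sample: {single_psnr:.2f} dB, ensemble of 8: {ensemble_psnr:.2f} dB")
```

With independent zero-mean errors, averaging K samples reduces the mean squared error by roughly a factor of K, which is the generic effect this toy setup shows. Real diffusion samples are not independent Gaussian perturbations, so the gain in practice will differ.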

Abstract

Diffusion models have emerged as powerful priors for single-image restoration, but their application to zero-shot video restoration suffers from temporal inconsistencies due to the stochastic nature of sampling and the complexity of incorporating explicit temporal modeling. In this work, we address the challenge of improving temporal coherence in video restoration using zero-shot image-based diffusion models without retraining or modifying their architecture. We propose two complementary inference-time strategies: (1) Perceptual Straightening Guidance (PSG) based on the neuroscience-inspired perceptual straightening hypothesis, which steers the diffusion denoising process towards smoother temporal evolution by incorporating a curvature penalty in a perceptual space to improve temporal perceptual scores, such as Fréchet Video Distance (FVD) and perceptual straightness; and (2) Multi-Path…
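
The curvature penalty mentioned for PSG can be pictured with a small sketch. It measures how sharply a sequence of per-frame feature vectors turns from one step to the next, using the angle between successive displacement vectors. This is an illustration under assumptions, not the paper's implementation: the function names are hypothetical, and the toy 2-D vectors stand in for whatever perceptual features the authors use.

```python
import numpy as np

def curvature_angles(features):
    """Angles (radians) between successive displacement vectors.

    features: array of shape (T, D), one feature vector per frame, T >= 3.
    Returns an array of T - 2 angles; 0 means the trajectory is locally straight.
    """
    x = np.asarray(features, dtype=np.float64)
    diffs = x[1:] - x[:-1]                       # frame-to-frame displacements
    norms = np.linalg.norm(diffs, axis=1, keepdims=True)
    units = diffs / np.maximum(norms, 1e-12)     # guard against zero-length steps
    cosines = np.sum(units[:-1] * units[1:], axis=1)
    return np.arccos(np.clip(cosines, -1.0, 1.0))

def curvature_penalty(features):
    """Mean turning angle; lower values indicate a straighter trajectory."""
    return float(np.mean(curvature_angles(features)))

# Hypothetical 2-D "feature" trajectories for four frames.
straight = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0]])
bent = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [2.0, 1.0]])

print(curvature_penalty(straight), curvature_penalty(bent))
```

A guidance scheme in this spirit would presumably use the gradient of such a penalty to nudge each denoising step toward lower curvature; how the paper weights and applies that gradient is not stated in this excerpt.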
