One-Step Diffusion for Detail-Rich and Temporally Consistent Video Super-Resolution

Yujing Sun; Lingchen Sun; Shuaizheng Liu; Rongyuan Wu; Zhengqiang Zhang; Lei Zhang

arXiv:2506.15591 · cs.CV · October 23, 2025

TL;DR

This paper introduces Dual LoRA Learning (DLoRAL), a paradigm for training a one-step diffusion model based on Stable Diffusion that produces detail-rich, temporally consistent results in real-world video super-resolution, improving both visual quality and efficiency.

Contribution

It proposes, for the first time in real-world video super-resolution, a Dual LoRA Learning paradigm that pairs cross-frame retrieval with a separate detail enhancement module.

Findings

01. Achieves high-quality, temporally consistent video super-resolution in a single diffusion step.

02. Outperforms existing methods in accuracy and speed.

03. Demonstrates effective extraction of temporal priors from degraded videos.

Abstract

It is a challenging problem to reproduce rich spatial details while maintaining temporal consistency in real-world video super-resolution (Real-VSR), especially when we leverage pre-trained generative models such as Stable Diffusion (SD) for realistic detail synthesis. Existing SD-based Real-VSR methods often compromise spatial details for temporal coherence, resulting in suboptimal visual quality. We argue that the key lies in how to effectively extract degradation-robust temporal consistency priors from the low-quality (LQ) input video and enhance the video details while maintaining the extracted consistency priors. To achieve this, we propose a Dual LoRA Learning (DLoRAL) paradigm to train an effective SD-based one-step diffusion model, achieving realistic frame details and temporal consistency simultaneously. Specifically, we introduce a Cross-Frame Retrieval (CFR) module to…
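For readers unfamiliar with LoRA, the sketch below illustrates the generic low-rank update that the name refers to: a frozen weight matrix W is adjusted by a scaled product B·A of two small matrices, and several such adapters can be merged into the same base weight. It is a minimal pure-Python illustration, not the authors' implementation. The class `LoRAAdapter`, the helpers `merge` and `forward`, and the adapter names `consistency` and `detail` are hypothetical labels chosen here. How the paper wires and trains its two LoRA branches is not stated in the truncated abstract above.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

class LoRAAdapter:
    """Generic LoRA update: delta_W = (alpha / rank) * B @ A.

    Hypothetical helper for illustration only, not taken from the paper.
    """
    def __init__(self, out_dim, in_dim, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        self.scale = alpha / rank
        # A gets small random values; B starts at zero so the adapter
        # leaves the base weight unchanged before any training.
        self.A = [[rng.gauss(0.0, 0.02) for _ in range(in_dim)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(out_dim)]

    def delta(self):
        """Return the scaled low-rank update as a full matrix."""
        return [[self.scale * v for v in row] for row in matmul(self.B, self.A)]

def merge(base_weight, adapters):
    """Add every adapter's update to a copy of the frozen base weight."""
    merged = [row[:] for row in base_weight]
    for adapter in adapters:
        update = adapter.delta()
        merged = [[w + u for w, u in zip(rw, ru)] for rw, ru in zip(merged, update)]
    return merged

def forward(weight, x):
    """Apply a linear layer (no bias) to a single input vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weight]

# Assumed toy setup: one 2x3 base weight and two rank-1 adapters.
base = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
consistency = LoRAAdapter(2, 3, rank=1, seed=1)  # hypothetical name
detail = LoRAAdapter(2, 3, rank=1, seed=2)       # hypothetical name

# Stand-in for training: give one adapter a non-zero B matrix.
detail.B = [[0.5], [0.0]]
merged = merge(base, [consistency, detail])
output = forward(merged, [1.0, 2.0, 3.0])
```

Because both adapters are folded into a single weight matrix before inference, the merged layer costs the same as the original one at run time, which is one reason LoRA is a common choice when inference speed matters.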

Taxonomy

Topics: Advanced Image Processing Techniques · Generative Adversarial Networks and Image Synthesis · Image and Video Quality Assessment