TL;DR
DVFace is a novel one-step diffusion framework that leverages spatio-temporal priors for high-quality, temporally consistent video face restoration, outperforming existing methods.
Contribution
The paper introduces a spatio-temporal dual-codebook design and an asymmetric fusion module that enable effective one-step diffusion for video face restoration.
Findings
DVFace achieves superior restoration quality on benchmarks.
It maintains better temporal consistency than prior methods.
It preserves facial identity effectively during restoration.
Abstract
Video face restoration aims to enhance degraded face videos into high-quality results with realistic facial details, stable identity, and temporal coherence. Recent diffusion-based methods have brought strong generative priors to restoration and enabled more realistic detail synthesis. However, existing approaches for face videos still rely heavily on generic diffusion priors and multi-step sampling, which limit both facial adaptation and inference efficiency. These limitations motivate the use of one-step diffusion for video face restoration, yet achieving faithful facial recovery alongside temporally stable outputs remains challenging. In this paper, we propose DVFace, a one-step diffusion framework for real-world video face restoration. Specifically, we introduce a spatio-temporal dual-codebook design to extract complementary spatial and temporal facial priors from degraded videos.…
