DicFace: Dirichlet-Constrained Variational Codebook Learning for Temporally Coherent Video Face Restoration

Yan Chen; Hanlin Shang; Ce Liu; Yuxuan Chen; Hui Li; Weihao Yuan; Hao Zhu; Zilong Dong; Siyu Zhu

arXiv:2506.13355·cs.CV·June 17, 2025

TL;DR

DicFace introduces a Dirichlet-constrained variational codebook learning framework that leverages pretrained VQ-VAEs and spatio-temporal Transformers to achieve temporally coherent and high-quality video face restoration, effectively reducing flicker artifacts.

Contribution

The paper proposes a novel Dirichlet-distributed latent space and a spatio-temporal Transformer architecture for improved video face restoration, extending static image priors to video tasks.

Findings

01. Achieves state-of-the-art results on face restoration benchmarks.

02. Effectively reduces flicker artifacts in video sequences.

03. Demonstrates versatility across multiple video restoration tasks.

Abstract

Video face restoration faces a critical challenge in maintaining temporal consistency while recovering fine facial details from degraded inputs. This paper presents a novel approach that extends Vector-Quantized Variational Autoencoders (VQ-VAEs), pretrained on static high-quality portraits, into a video restoration framework through variational latent space modeling. Our key innovation lies in reformulating discrete codebook representations as Dirichlet-distributed continuous variables, enabling probabilistic transitions between facial features across frames. A spatio-temporal Transformer architecture jointly models inter-frame dependencies and predicts latent distributions, while a Laplacian-constrained reconstruction loss combined with perceptual (LPIPS) regularization enhances both pixel accuracy and visual quality. Comprehensive evaluations on blind face restoration, video…
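
The core idea in the abstract, replacing a hard codebook index with Dirichlet-distributed weights over codebook entries, can be illustrated with a small NumPy sketch. This is not the authors' implementation: the sizes `K`, `D`, `T`, the helper names, the use of `exp(logits)` as concentration parameters, and the reading of the "Laplacian-constrained" loss as a Laplace negative log-likelihood are all assumptions made for illustration only.

```python
import numpy as np

def hard_lookup(logits, codebook):
    """Standard VQ-style lookup: one codebook entry per token (argmax)."""
    idx = np.argmax(logits, axis=-1)
    return codebook[idx]

def dirichlet_lookup(concentration, codebook, rng):
    """Sample convex weights over codebook entries and blend the entries.

    concentration: (T, K) positive Dirichlet parameters, one row per token.
    Returns the blended latents (T, D) and the sampled weights (T, K).
    """
    weights = np.stack([rng.dirichlet(alpha) for alpha in concentration])
    return weights @ codebook, weights

def laplace_nll(target, mean, scale):
    """Mean negative log-likelihood under a Laplace distribution.

    For a fixed scale this is an L1 reconstruction term plus a constant.
    """
    return np.mean(np.abs(target - mean) / scale + np.log(2.0 * scale))

rng = np.random.default_rng(0)
K, D, T = 8, 4, 3                      # hypothetical codebook size, code dim, tokens
codebook = rng.normal(size=(K, D))     # stand-in for a pretrained VQ-VAE codebook
logits = rng.normal(size=(T, K))       # stand-in for Transformer outputs
concentration = np.exp(logits)         # assumed mapping to positive parameters

z_hard = hard_lookup(logits, codebook)
z_soft, w = dirichlet_lookup(concentration, codebook, rng)
loss = laplace_nll(z_hard, z_soft, scale=1.0)
```

Because each row of `w` lies on the probability simplex, `z_soft` is a convex combination of codebook vectors. Small changes in the concentration parameters between frames then move the latent smoothly rather than jumping from one code to another, which is one plausible intuition for the reduced flicker described above.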

Code & Models

Models

🤗 fudan-generative-ai/DicFace_model (model, ♡ 2)

Taxonomy

Topics: Face recognition and analysis · Advanced Image Processing Techniques · Generative Adversarial Networks and Image Synthesis