A Lightweight Dual-Mode Optimization for Generative Face Video Coding

Zihan Zhang; Shanzhi Yin; Bolin Chen; Ru-Ling Liao; Shiqi Wang; Yan Ye

arXiv:2508.13547 · cs.CV · August 20, 2025

Open Access

TL;DR

This paper introduces a lightweight dual-mode optimization framework for Generative Face Video Coding (GFVC) that cuts model parameters by 90.4% and computation by 88.9% while maintaining high-quality reconstruction, enabling deployment on resource-constrained devices.

Contribution

The paper proposes a novel dual-mode optimization combining architectural redesign and adaptive channel pruning to create a lightweight GFVC framework with reduced parameters and computation.
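To make "reduced parameters" concrete, the sketch below counts weights for a standard 3 x 3 convolution and for a depthwise separable replacement. The depthwise separable layer is an assumption chosen for illustration; the abstract only says the 3 x 3 convolutions are replaced with "slimmer and more efficient layers" and does not name the layer type here. The channel sizes are hypothetical, biases are ignored, and the resulting ratio is a toy figure, not the paper's reported reduction.

```python
def conv3x3_params(c_in: int, c_out: int) -> int:
    """Weights in a standard 3x3 convolution (bias ignored)."""
    return 3 * 3 * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int) -> int:
    """Weights in a 3x3 depthwise conv followed by a 1x1 pointwise conv.

    Assumption: this stands in for the "slimmer" layer; the source
    does not specify which replacement is used.
    """
    depthwise = 3 * 3 * c_in      # one 3x3 filter per input channel
    pointwise = c_in * c_out      # 1x1 conv that mixes channels
    return depthwise + pointwise


# Hypothetical layer width, chosen only for illustration.
c_in, c_out = 64, 128
standard = conv3x3_params(c_in, c_out)
slim = depthwise_separable_params(c_in, c_out)
reduction = 1 - slim / standard

print(f"standard: {standard}, slim: {slim}, reduction: {reduction:.1%}")
```

For this single layer the saving grows with the output width, since the replacement's cost is dominated by the 1x1 pointwise term rather than the 3 x 3 kernel.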

Findings

01. Achieves 90.4% parameter reduction and 88.9% computation saving.

02. Outperforms state-of-the-art VVC in perceptual quality metrics.

03. Enables efficient GFVC deployment on mobile edge devices.

Abstract

Generative Face Video Coding (GFVC) achieves superior rate-distortion performance by leveraging the strong inference capabilities of deep generative models. However, its practical deployment is hindered by large model parameters and high computational costs. To address this, we propose a lightweight GFVC framework that introduces dual-mode optimization -- combining architectural redesign and operational refinement -- to reduce complexity whilst preserving reconstruction quality. Architecturally, we replace traditional 3 x 3 convolutions with slimmer and more efficient layers, reducing complexity without compromising feature expressiveness. Operationally, we develop a two-stage adaptive channel pruning strategy: (1) soft pruning during training identifies redundant channels via learnable thresholds, and (2) hard pruning permanently eliminates these channels post-training using a derived…
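The two-stage pruning idea described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the sigmoid gate, the `temperature` parameter, the per-channel `importance` scores, and the fixed cutoff used for hard pruning are all hypothetical. In particular, the abstract is truncated before it states what the hard-pruning criterion is derived from, so a plain threshold comparison is swapped in here. During real training the threshold would be a learnable parameter updated by gradient descent; here it is a plain number.

```python
import math


def soft_mask(importance, threshold, temperature=0.1):
    """Stage 1 (soft pruning): a differentiable gate in (0, 1) per channel.

    Channels whose importance is well above the threshold get a gate
    near 1; channels well below it get a gate near 0. Assumption: a
    sigmoid gate is one common way to keep the threshold trainable.
    """
    return [
        1.0 / (1.0 + math.exp(-(score - threshold) / temperature))
        for score in importance
    ]


def hard_prune(channels, importance, threshold):
    """Stage 2 (hard pruning): permanently drop channels below the cutoff.

    Assumption: a fixed cutoff stands in for the paper's derived criterion.
    """
    return [ch for ch, score in zip(channels, importance) if score >= threshold]


# Hypothetical per-channel importance scores and threshold.
channels = ["c0", "c1", "c2", "c3"]
importance = [0.90, 0.05, 0.40, 0.01]
threshold = 0.20

gates = soft_mask(importance, threshold)            # used during training
kept = hard_prune(channels, importance, threshold)  # applied after training

print([round(g, 3) for g in gates])
print(kept)
```

The soft gate leaves every channel in the network during training, so a channel can recover if its importance rises; only the post-training step actually removes channels and shrinks the model.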

Taxonomy

Topics: Advanced Data Compression Techniques · Video Coding and Compression Technologies · Image and Video Stabilization