DiffFAE: Advancing High-fidelity One-shot Facial Appearance Editing with Space-sensitive Customization and Semantic Preservation

Qilin Wang; Jiangning Zhang; Chengming Xu; Weijian Cao; Ying Tai; Yue Han; Yanhao Ge; Hong Gu; Chengjie Wang; Yanwei Fu

arXiv:2403.17664·cs.CV·March 27, 2024·1 cites

Open Access

TL;DR

DiffFAE is a one-stage diffusion-based framework for high-fidelity facial appearance editing. It targets three weaknesses of prior work: low generation fidelity, poor attribute preservation, and inefficient inference.

Contribution

It introduces Space-sensitive Physical Customization (SPC) for transferring query attributes and Region-responsive Semantic Composition (RSC) for preserving source attributes, along with a regularization technique for enhanced controllability.

Findings

01. Achieves state-of-the-art results in facial appearance editing.

02. Demonstrates high fidelity and attribute preservation.

03. Ensures efficient inference in the editing process.

Abstract

Facial Appearance Editing (FAE) aims to modify physical attributes of human facial images, such as pose, expression, and lighting, while preserving attributes like identity and background, a capability of great importance in photography. Despite substantial progress in this area, current research generally faces three challenges: low generation fidelity, poor attribute preservation, and inefficient inference. To overcome these challenges, this paper presents DiffFAE, a one-stage, highly efficient diffusion-based framework tailored for high-fidelity FAE. For high-fidelity transfer of query attributes, we adopt Space-sensitive Physical Customization (SPC), which ensures fidelity and generalization ability by utilizing rendering textures derived from a 3D Morphable Model (3DMM). To preserve source attributes, we introduce Region-responsive Semantic Composition (RSC). This module is…

Taxonomy

Topics: Face recognition and analysis · Generative Adversarial Networks and Image Synthesis

Methods: Diffusion