Towards Flexible, Scalable, and Adaptive Multi-Modal Conditioned Face Synthesis

Jingjing Ren; Cheng Xu; Haoyu Chen; Xinran Qin; Lei Zhu

arXiv:2312.16274 · cs.CV · March 22, 2024 · 2 cites

Open Access

TL;DR

This paper introduces a multi-modal conditioned face synthesis method that improves flexibility, scalability, and control by combining uni-modal training with modal surrogates and entropy-aware modal-adaptive modulation, yielding higher image quality and fidelity than existing methods.

Contribution

The paper proposes a uni-modal training approach with modal surrogates and entropy-aware modulation for improved multi-modal face synthesis.

Findings

01. Outperforms existing methods in image quality and fidelity

02. Supports flexible and scalable multi-modal control

03. Achieves high-fidelity face synthesis results

Abstract

Recent progress in multi-modal conditioned face synthesis has enabled the creation of visually striking and accurately aligned facial images. Yet current methods still suffer from poor scalability, limited flexibility, and a one-size-fits-all control strength that ignores the differing levels of conditional entropy (a measure of the unpredictability of data given some condition) across modalities. To address these challenges, we introduce a novel uni-modal training approach with modal surrogates, coupled with an entropy-aware modal-adaptive modulation, to support flexible, scalable, and adaptive multi-modal conditioned face synthesis. Our uni-modal training with modal surrogates, which leverages only uni-modal data and uses the modal surrogates to decorate each condition with modal-specific characteristics and to serve as a linker for inter-modal collaboration, fully learns each…
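The abstract's notion of conditional entropy can be made concrete with a small sketch. The code below computes the textbook quantity H(X | C) = -Σ p(x, c) log₂ p(x | c) for a discrete joint distribution. It illustrates only the general definition, not the paper's entropy-aware modulation; the function name `conditional_entropy` and the two toy distributions are assumptions made up for illustration.

```python
import math


def conditional_entropy(joint):
    """Return H(X | C) in bits for a joint distribution {(x, c): p(x, c)}."""
    # Marginal p(c), obtained by summing the joint over x.
    p_c = {}
    for (_, c), p in joint.items():
        p_c[c] = p_c.get(c, 0.0) + p
    # H(X | C) = -sum over (x, c) of p(x, c) * log2(p(x, c) / p(c)).
    return -sum(
        p * math.log2(p / p_c[c]) for (_, c), p in joint.items() if p > 0
    )


# Toy distributions (hypothetical, not taken from the paper).
# "tight": the condition c fully determines the outcome x.
tight = {("a", 0): 0.5, ("b", 1): 0.5}
# "loose": the condition c says nothing about the outcome x.
loose = {("a", 0): 0.25, ("b", 0): 0.25, ("a", 1): 0.25, ("b", 1): 0.25}

print(conditional_entropy(tight), conditional_entropy(loose))
```

In this toy setting, a condition that pins the outcome down completely has zero conditional entropy, while an uninformative condition leaves a full bit of uncertainty. This is the kind of difference across modalities that a single fixed control strength would not account for.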

Taxonomy

Topics: Face recognition and analysis · Generative Adversarial Networks and Image Synthesis

Methods: Diffusion