Face-MakeUp: Multimodal Facial Prompts for Text-to-Image Generation

Dawei Dai; Mingming Jia; Yinxiu Zhou; Hang Xing; Chenghang Li

arXiv:2501.02523·cs.CV·January 7, 2025

Open Access · 1 Repo · 2 Models · 1 Dataset

TL;DR

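This paper introduces Face-MakeUp, which conditions a text-to-image diffusion model on a reference facial image in addition to the text prompt. It is trained on FaceCaptionHQ-4M, a dataset of 4 million face image-text pairs, and integrates multi-scale content features and pose features from the reference image to preserve facial identity.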

Contribution

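The paper presents FaceCaptionHQ-4M, a dataset of 4 million high-quality face image-text pairs built from LAION-Face, and a method that integrates multi-scale content features and pose features of a reference face into a diffusion model to better preserve facial identity.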

Findings

01. Face-MakeUp outperforms existing methods on face-related datasets.
02. The model effectively preserves facial identity features.
03. The approach demonstrates strong performance in generating desired facial images.

Abstract

Facial images have extensive practical applications. Although current large-scale text-to-image diffusion models exhibit strong generation capabilities, it is challenging to generate the desired facial images using only a text prompt. Image prompts are a logical choice. However, current methods of this type generally focus on the general domain. In this paper, we aim to optimize image makeup techniques to generate the desired facial images. Specifically, (1) we built a dataset of 4 million high-quality face image-text pairs (FaceCaptionHQ-4M) based on LAION-Face to train our Face-MakeUp model; (2) to maintain consistency with the reference facial image, we extract/learn multi-scale content features and pose features for the facial image and integrate them into the diffusion model to better preserve facial identity. Validation on two face-related test…
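
To make the phrase "multi-scale content features" concrete, here is a minimal pure-Python sketch of one common way to build such features: pool a single feature map at several grid sizes and flatten the results into a token sequence that is appended to the text tokens. This is an illustration under stated assumptions, not the authors' architecture. The function names (`average_pool`, `multi_scale_tokens`, `build_condition`), the scales `(1, 2, 4)`, and the single-channel toy feature map are all hypothetical; the abstract does not specify how Face-MakeUp extracts or injects its features.

```python
from typing import List, Sequence

Grid = List[List[float]]


def average_pool(feature_map: Grid, out_size: int) -> Grid:
    """Average-pool a square H x H map down to out_size x out_size.

    Assumes H is divisible by out_size (hypothetical simplification).
    """
    h = len(feature_map)
    if h % out_size != 0:
        raise ValueError("map size must be divisible by out_size")
    step = h // out_size
    pooled = []
    for i in range(out_size):
        row = []
        for j in range(out_size):
            # Collect the values of one non-overlapping step x step block.
            block = [
                feature_map[y][x]
                for y in range(i * step, (i + 1) * step)
                for x in range(j * step, (j + 1) * step)
            ]
            row.append(sum(block) / len(block))
        pooled.append(row)
    return pooled


def multi_scale_tokens(feature_map: Grid, scales: Sequence[int] = (1, 2, 4)) -> List[float]:
    """Flatten pooled maps at each scale into one sequence, coarse to fine."""
    tokens: List[float] = []
    for s in scales:
        for row in average_pool(feature_map, s):
            tokens.extend(row)
    return tokens


def build_condition(text_tokens: Sequence[float], image_tokens: Sequence[float]) -> List[float]:
    """Concatenate text and image tokens into one conditioning sequence.

    Assumption: a real model would project both to a shared embedding
    dimension before feeding them to cross-attention; that step is omitted.
    """
    return list(text_tokens) + list(image_tokens)


# Toy 4 x 4 single-channel feature map standing in for an image encoder output.
toy_map = [[float(4 * r + c) for c in range(4)] for r in range(4)]
image_tokens = multi_scale_tokens(toy_map)       # 1 + 4 + 16 = 21 values
condition = build_condition([0.1, 0.2], image_tokens)
```

The coarse scale summarizes the whole face while the finer scales retain local detail, which is the general motivation for combining several scales when the goal is to preserve identity.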

Code & Models

Repositories

ddw2aigroup2cqupt/face-makeup
PyTorch · Official

Datasets

OpenFace-CQUPT/FaceCaptionHQ-4M
dataset · 48 downloads

Taxonomy

Topics: Face recognition and analysis · Subtitles and Audiovisual Media · Digital Media and Visual Art

Methods: Diffusion · Focus