CoIE: Chain-of-Instruct Editing for Multi-Attribute Face Manipulation

Zhenduo Zhang; Bo-Wen Zhang; Guang Liu

arXiv:2312.07879 · cs.CV · December 21, 2023 · 2 cites

TL;DR

This paper introduces Chain-of-Instruct Editing (CoIE), a step-by-step multi-attribute face editing method that leverages large language models and fine-tuning to improve precision and control in text-guided face manipulation.

Contribution

The paper proposes CoIE, combining LLM-generated instruction sequences, fine-tuning on a new dataset, and a super-resolution module to enhance multi-attribute face editing capabilities.

Findings

01. CLIPSim and Coverage metrics improved by 17.86% and 85.45%, respectively.

02. Preserve L1 and Quality metrics improved by 11.58% and 4.93%, respectively.

03. The method gives a significant boost in multi-attribute facial image manipulation.

Abstract

Current text-to-image editing models often encounter challenges with smoothly manipulating multiple attributes using a single instruction. Taking inspiration from the Chain-of-Thought prompting technique utilized in language models, we present an innovative concept known as Chain-of-Instruct Editing (CoIE), which enhances the capabilities of these models through step-by-step editing using a series of instructions. In particular, in the context of face manipulation, we leverage the contextual learning abilities of a pretrained Large Language Model (LLM), such as GPT-4, to generate a sequence of instructions from the original input, utilizing a purpose-designed 1-shot template. To further improve the precision of each editing step, we conduct fine-tuning on the editing models using our self-constructed instruction-guided face editing dataset, Instruct-CelebA. And additionally, we…
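The step-by-step idea in the abstract can be illustrated with a minimal sketch: a multi-attribute instruction is split into single-attribute instructions by a language model, and an editor applies them one at a time. Everything named here is an assumption made for illustration, not the authors' code: `build_one_shot_prompt`, `parse_steps`, `chain_of_instruct_edit`, the wording of the 1-shot example, and the stub `fake_llm` and `fake_editor`, which stand in for a real LLM call and a real fine-tuned editing model. The "image" is a plain tuple of strings so the example runs without any model.

```python
from typing import Callable, List, Tuple

# Toy stand-in for an image: the tuple records which edits were applied.
Image = Tuple[str, ...]


def build_one_shot_prompt(instruction: str) -> str:
    """Build a 1-shot prompt asking an LLM to split an instruction into steps.

    The example pair is hypothetical; it is not the template from the paper.
    """
    example = (
        "Instruction: make her smile and add glasses\n"
        "Steps:\n1. make her smile\n2. add glasses\n"
    )
    return f"{example}\nInstruction: {instruction}\nSteps:\n"


def parse_steps(llm_output: str) -> List[str]:
    """Collect the text of numbered lines such as '1. add a beard'."""
    steps = []
    for line in llm_output.splitlines():
        head, sep, rest = line.strip().partition(". ")
        if sep and head.isdigit() and rest.strip():
            steps.append(rest.strip())
    return steps


def chain_of_instruct_edit(
    image: Image,
    instruction: str,
    llm: Callable[[str], str],
    editor: Callable[[Image, str], Image],
) -> Tuple[Image, List[str]]:
    """Split the instruction with the LLM, then apply each step in order."""
    steps = parse_steps(llm(build_one_shot_prompt(instruction)))
    for step in steps:
        # Each step edits the output of the previous step.
        image = editor(image, step)
    return image, steps


def fake_llm(prompt: str) -> str:
    """Stub LLM: returns a fixed decomposition regardless of the prompt."""
    return "1. add a beard\n2. turn the hair blond"


def fake_editor(image: Image, step: str) -> Image:
    """Stub editor: appends the instruction instead of editing pixels."""
    return image + (step,)


edited, steps = chain_of_instruct_edit(
    ("original",), "add a beard and turn the hair blond", fake_llm, fake_editor
)
print(steps)
print(edited)
```

In a real pipeline the two stubs would be replaced by an actual LLM request and an instruction-guided image editing model; the loop itself stays the same.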


Taxonomy

Topics: Facial Nerve Paralysis Treatment and Research · Herpesvirus Infections and Treatments · Face recognition and analysis

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Byte Pair Encoding · Layer Normalization · Residual Connection · Dropout · Dense Connections · Position-Wise Feed-Forward Layer · Absolute Position Encodings