Training-Free Style Consistent Image Synthesis with Condition and Mask Guidance in E-Commerce

Guandong Li

arXiv:2409.04750 · cs.CV · September 10, 2024

Open Access

TL;DR

This paper presents a training-free method for style-consistent image synthesis in e-commerce. It modifies attention maps and uses mask guidance to produce high-quality images.

Contribution

It introduces a training-free approach that applies condition and mask guidance to attention maps to generate style-consistent e-commerce images.

Findings

01. Effective style consistency in generated images

02. No additional training required for the method

03. Promising results in practical e-commerce applications

Abstract

Generating style-consistent images is a common task in the e-commerce field, and current methods are largely based on diffusion models, which have achieved excellent results. This paper introduces the concept of the QKV (query/key/value) level, which refers to modifying the attention maps (self-attention and cross-attention) when the UNet is combined with image conditions. Without disrupting the product's main composition in e-commerce images, we aim to use a training-free method guided by pre-set conditions. This involves sharing KV to enhance similarity in cross-attention and generating mask guidance from the attention map to direct the generation of style-consistent images. Our method has shown promising results in practical applications.
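The two ideas named in the abstract, shared KV and mask guidance derived from an attention map, can be illustrated with a small NumPy sketch. This is not the paper's implementation: the function names, the concatenation of reference keys/values, the min-max normalization, the 0.5 threshold, and the final blending step are all assumptions made for illustration, and a real pipeline would apply these operations inside the attention layers of a diffusion UNet.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # Scaled dot-product attention; returns the output and the attention map.
    d = q.shape[-1]
    weights = softmax(q @ k.T / np.sqrt(d))
    return weights @ v, weights

def shared_kv_attention(q_target, k_target, v_target, k_ref, v_ref):
    # Assumption: "shared KV" is modeled by letting the target queries attend
    # over the target's own keys/values plus those of a reference image.
    k = np.concatenate([k_target, k_ref], axis=0)
    v = np.concatenate([v_target, v_ref], axis=0)
    return attention(q_target, k, v)

def mask_from_attention(weights, token_index, threshold=0.5):
    # Assumption: the mask is the min-max normalized attention column of one
    # condition token (e.g. the product), thresholded to a boolean mask.
    col = weights[:, token_index]
    col = (col - col.min()) / (col.max() - col.min() + 1e-8)
    return col >= threshold

def masked_blend(original, stylized, mask):
    # Keep original features where the mask is True (the product region),
    # and use the style-shared features elsewhere.
    return np.where(mask[:, None], original, stylized)

# Toy usage with random features: 16 spatial positions, 8 channels, 4 tokens.
rng = np.random.default_rng(0)
q = rng.normal(size=(16, 8))
k_t, v_t = rng.normal(size=(16, 8)), rng.normal(size=(16, 8))
k_r, v_r = rng.normal(size=(16, 8)), rng.normal(size=(16, 8))
k_txt, v_txt = rng.normal(size=(4, 8)), rng.normal(size=(4, 8))

stylized, _ = shared_kv_attention(q, k_t, v_t, k_r, v_r)
_, cross_weights = attention(q, k_txt, v_txt)
product_mask = mask_from_attention(cross_weights, token_index=0)
blended = masked_blend(v_t, stylized, product_mask)
```

In this sketch the mask protects positions that attend strongly to the chosen token, which mirrors the abstract's goal of leaving the product's main composition undisturbed while the remaining positions draw on the reference image.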

Taxonomy

Topics: Face recognition and analysis · Generative Adversarial Networks and Image Synthesis

Methods: Softmax · Attention Is All You Need · Diffusion