Patch-enhanced Mask Encoder Prompt Image Generation
Shusong Xu, Peiye Liu

TL;DR
This paper introduces a patch-enhanced mask encoder method for AI-generated image content that improves product description accuracy and background diversity, outperforming existing techniques in visual quality and FID scores.
Contribution
It proposes a novel patch-enhanced mask encoder approach with three components to better control content and backgrounds in AI-generated images for advertising.
Findings
Achieves better visual quality and FID scores than previous methods.
Ensures more accurate product descriptions in generated images.
Preserves diverse backgrounds effectively.
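For readers unfamiliar with the FID metric cited above: the Fréchet Inception Distance compares the mean and covariance of feature embeddings from real and generated images, and a lower value indicates a closer match. The sketch below is a minimal illustration of the standard formula and is not the authors' evaluation code. The function names are hypothetical, and the random arrays are stand-ins for the Inception-network features that would be used in practice.

```python
import numpy as np

def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Frechet distance between Gaussians N(mu1, sigma1) and N(mu2, sigma2)."""
    diff = mu1 - mu2
    # Tr((sigma1 @ sigma2)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of sigma1 @ sigma2. For positive semi-definite covariances
    # these are real and non-negative up to numerical noise, hence the clip.
    eigvals = np.linalg.eigvals(sigma1 @ sigma2)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    return float(diff @ diff + np.trace(sigma1) + np.trace(sigma2) - 2.0 * tr_sqrt)

def fid_from_features(feats_real, feats_gen):
    """FID from two (n_samples, n_features) arrays of image embeddings."""
    mu_r, mu_g = feats_real.mean(axis=0), feats_gen.mean(axis=0)
    sigma_r = np.cov(feats_real, rowvar=False)
    sigma_g = np.cov(feats_gen, rowvar=False)
    return frechet_distance(mu_r, sigma_r, mu_g, sigma_g)

# Synthetic stand-ins for embeddings (assumption: real use needs Inception features).
rng = np.random.default_rng(0)
real = rng.normal(0.0, 1.0, size=(500, 8))
close = rng.normal(0.1, 1.0, size=(500, 8))  # distribution near the "real" one
far = rng.normal(1.0, 1.0, size=(500, 8))    # distribution shifted further away

fid_close = fid_from_features(real, close)
fid_far = fid_from_features(real, far)
```

In this toy setup the shifted set `far` receives a larger distance than `close`, which is the sense in which a method "improves" FID by lowering it.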
Abstract
Artificial Intelligence Generated Content (AIGC), known for its superior visual results, represents a promising mitigation method for high-cost advertising applications. Numerous approaches have been developed to manipulate generated content under different conditions. However, a crucial limitation lies in the accurate description of products in advertising applications. Applying previous methods directly may lead to considerable distortion and deformation of advertised products, primarily due to oversimplified content control conditions. Hence, in this work, we propose a patch-enhanced mask encoder approach to ensure accurate product descriptions while preserving diverse backgrounds. Our approach consists of three components: Patch Flexible Visibility, Mask Encoder Prompt Adapter, and an image Foundation Model. Patch Flexible Visibility is used for generating a more reasonable background…
Taxonomy
Topics: Advancements in Photolithography Techniques · Manufacturing Process and Optimization
Methods: Adapter
