TL;DR
GIST is a training-free image compositor that plugs into existing design pipelines and harmonizes input elements drawn from disparate sources.
Contribution
We introduce GIST, a novel identity-preserving image compositor that seamlessly integrates into existing design pipelines to enhance visual harmony and aesthetic quality.
Findings
GIST significantly improves visual harmony in design pipelines.
Integration of GIST with existing methods enhances aesthetic quality.
These improvements are validated by ratings and preferences from LLaVA-OV and GPT-4V.
Abstract
Graphic design creation involves harmoniously assembling multimodal components, such as images, text, logos, and other visual assets collected from diverse sources, into a visually appealing and cohesive design. Recent methods have largely focused on layout prediction or complementary element generation while retaining input elements exactly, implicitly assuming that the provided components are already stylistically harmonious. In practice, inputs often come from disparate sources and exhibit visual mismatch, making this assumption limiting. We argue that identity-preserving stylization and compositing of input elements is a critical missing ingredient for truly harmonized components-to-design pipelines. To this end, we propose GIST, a training-free, identity-preserving image compositor that sits between layout prediction and typography generation, and can be plugged into any existing…
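The placement described in the abstract, with the compositor sitting between layout prediction and typography generation, can be illustrated with a minimal Python sketch. Every name here (`Design`, `make_stage`, `build_pipeline`) is hypothetical and is not taken from the GIST code; the stages are stand-ins that only record the order in which they run, and no actual harmonization is performed.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Design:
    """Hypothetical container: input elements plus a log of stages applied."""
    elements: List[str]
    stages: List[str] = field(default_factory=list)


Stage = Callable[[Design], Design]


def make_stage(name: str) -> Stage:
    """Build a placeholder stage that only records its name (assumption:
    real stages would predict a layout, composite images, or render text)."""
    def run(design: Design) -> Design:
        return Design(design.elements, design.stages + [name])
    return run


def build_pipeline(layout: Stage, typography: Stage,
                   compositor: Optional[Stage] = None) -> Stage:
    """Chain layout -> (optional compositor) -> typography."""
    stages = [layout]
    if compositor is not None:
        stages.append(compositor)  # the compositor slots in between the two
    stages.append(typography)

    def run(design: Design) -> Design:
        for stage in stages:
            design = stage(design)
        return design
    return run


inputs = Design(elements=["product photo", "logo", "headline text"])
layout = make_stage("layout")
typography = make_stage("typography")

baseline = build_pipeline(layout, typography)(inputs)
with_compositor = build_pipeline(layout, typography,
                                 compositor=make_stage("compositor"))(inputs)
```

The optional `compositor` argument reflects the plug-in framing: the surrounding stages are unchanged whether or not the extra step is present.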
