Graphic Design with Large Multimodal Model
Yutao Cheng, Zhao Zhang, Maoke Yang, Hui Nie, Chunyuan Li, Xinglong Wu, Jie Shao

TL;DR
This paper introduces Graphist, a novel large multimodal model for flexible graphic layout generation from unordered design elements, enhancing creativity and efficiency in graphic design automation.
Contribution
It presents Hierarchical Layout Generation (HLG) as a flexible approach and develops Graphist, the first model to reframe layout creation as a sequence generation task from unordered inputs.
Findings
Graphist outperforms previous methods in layout quality.
New evaluation metrics effectively assess HLG performance.
Graphist establishes a strong baseline for future research in graphic layout generation.
Abstract
In the field of graphic design, automating the integration of design elements into a cohesive multi-layered artwork not only boosts productivity but also paves the way for the democratization of graphic design. One existing practice is Graphic Layout Generation (GLG), which aims to lay out sequential design elements. It is constrained by the need for a predefined, correct sequence of layers, which limits creative potential and increases user workload. In this paper, we present Hierarchical Layout Generation (HLG) as a more flexible and pragmatic setup, which creates graphic compositions from unordered sets of design elements. To tackle the HLG task, we introduce Graphist, the first layout generation model based on large multimodal models. Graphist efficiently reframes HLG as a sequence generation problem, taking RGB-A images as input and outputting a JSON draft protocol,…
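As a rough illustration of what consuming a JSON layout draft might look like, the sketch below parses a draft, checks that each element fits on the canvas, and recovers a stacking order for elements that arrive unordered. The schema is an assumption made for this example: the field names `canvas`, `layers`, `index`, `box`, and `z`, and the helper `parse_draft`, are hypothetical and are not taken from Graphist.

```python
import json

# Hypothetical draft. The field names ("canvas", "layers", "index", "box", "z")
# are assumptions for illustration, not the protocol defined by the paper.
DRAFT = """
{
  "canvas": {"width": 800, "height": 600},
  "layers": [
    {"index": 2, "box": [40, 60, 300, 80], "z": 2},
    {"index": 0, "box": [0, 0, 800, 600], "z": 0},
    {"index": 1, "box": [500, 400, 200, 150], "z": 1}
  ]
}
"""


def parse_draft(text):
    """Validate a draft and return its layers sorted bottom-to-top."""
    draft = json.loads(text)
    width = draft["canvas"]["width"]
    height = draft["canvas"]["height"]
    for layer in draft["layers"]:
        # Each box is assumed to be [x, y, box_width, box_height] in pixels.
        x, y, box_w, box_h = layer["box"]
        inside = x >= 0 and y >= 0 and x + box_w <= width and y + box_h <= height
        if box_w <= 0 or box_h <= 0 or not inside:
            raise ValueError(f"layer {layer['index']} does not fit the canvas")
    # The input elements are unordered, so the stacking order comes from "z".
    return sorted(draft["layers"], key=lambda layer: layer["z"])


stacking = [layer["index"] for layer in parse_draft(DRAFT)]
```

Here `stacking` lists the input element indices from the bottom layer to the top, which is the order a renderer would composite the RGB-A images in.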
Taxonomy
Topics: Digital Media and Visual Art
