Multimodal Color Recommendation in Vector Graphic Documents
Qianru Qiu, Xueting Wang, Mayu Otani

TL;DR
This paper introduces a multimodal masked color model that combines color and textual contexts to improve color recommendation in graphic documents, enhancing palette completion and generation accuracy.
Contribution
It presents a novel multimodal model integrating self-attention and cross-attention with CLIP-based text representations for improved color recommendation tasks.
Findings
Outperforms previous methods in palette completion accuracy
Achieves higher color diversity and similarity in palette generation
Enhances user experience in color recommendation tasks
Abstract
Color selection plays a critical role in graphic document design and requires sufficient consideration of various contexts. However, recommending appropriate colors that harmonize with the other colors and textual contexts in documents is a challenging task, even for experienced designers. In this study, we propose a multimodal masked color model that integrates both color and textual contexts to provide text-aware color recommendation for graphic documents. Our proposed model comprises self-attention networks to capture the relationships between colors in multiple palettes, and cross-attention networks that incorporate both color and CLIP-based text representations. Our proposed method primarily focuses on color palette completion, which recommends colors based on the given colors and text. Additionally, it is applicable to another color recommendation task, full palette generation,…
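The two ideas named in the abstract, masking a palette color and letting color tokens attend to text features, can be illustrated with a toy sketch. This is not the authors' implementation: the helper names (`mask_palette`, `embed_color`, `cross_attention`), the zero vector used for the masked slot, and the hand-written 3-dimensional "text" vectors are all assumptions made for illustration. In the paper the text features are CLIP-based and the attention layers are learned; here there are no learned weights at all.

```python
import math

MASK = None  # hypothetical marker for a hidden color slot


def mask_palette(palette, mask_index):
    """Hide one color; return the masked palette and the hidden target color."""
    masked = list(palette)
    target = masked[mask_index]
    masked[mask_index] = MASK
    return masked, target


def embed_color(color):
    """Toy embedding: RGB scaled to [0, 1]; a masked slot becomes a zero vector."""
    if color is MASK:
        return [0.0, 0.0, 0.0]
    return [channel / 255.0 for channel in color]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: each query attends over all keys/values."""
    dim = len(keys[0])
    outputs = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]
        )
    return outputs


# Palette completion setup: hide the second color of a three-color palette.
palette = [(255, 0, 0), (0, 128, 255), (255, 255, 255)]
masked, target = mask_palette(palette, 1)

# Color tokens act as queries; the text vectors below are made-up stand-ins
# for CLIP-based text features and serve as both keys and values.
color_tokens = [embed_color(c) for c in masked]
text_tokens = [[0.1, 0.9, 0.3], [0.7, 0.2, 0.5]]
fused = cross_attention(color_tokens, text_tokens, text_tokens)
```

Each row of `fused` is a text-conditioned representation of one palette slot. Because the masked slot is embedded as a zero vector in this sketch, its attention weights are uniform and its row is simply the mean of the text vectors; a trained model would instead use a learned mask embedding and learned projections before predicting the hidden color.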
Taxonomy
Topics: Color perception and design · Color Science and Applications
