NumColor: Precise Numeric Color Control in Text-to-Image Generation
Muhammad Atif Butt, Diego Hernandez, Alexandra Gomez-Villa, Kai Wang, Javier Vazquez-Corral, Joost Van De Weijer

TL;DR
NumColor introduces a novel approach for precise numerical color control in text-to-image diffusion models, overcoming tokenization limitations and enabling accurate, smooth color interpolation across multiple architectures.
Contribution
NumColor contributes a Color Token Aggregator and a learnable ColorBook, trained with two auxiliary losses, that together enable accurate numerical color control and transfer across diffusion models.
Findings
Improves numerical color accuracy by 4-9x across models.
Enhances color harmony scores by 10-30x on GenColorBench.
Enables zero-shot transfer to multiple diffusion architectures.
Abstract
Text-to-image diffusion models excel at generating images from natural language descriptions, yet fail to interpret numerical colors such as hex codes (#FF5733) and RGB values (rgb(255,87,51)). This limitation stems from subword tokenization, which fragments color codes into semantically meaningless tokens that text encoders cannot map to coherent color representations. We present NumColor, which enables precise numerical color control across multiple diffusion architectures. NumColor comprises two components: a Color Token Aggregator that detects color specifications regardless of tokenization, and a ColorBook containing 6,707 learnable embeddings that map colors in perceptually uniform CIE Lab space to the embedding space of the text encoder. We introduce two auxiliary losses, directional alignment and interpolation consistency, to enforce geometric correspondence between Lab and embedding…
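To make the two ideas in the abstract concrete, here is a minimal sketch of finding numeric color specifications in a raw prompt string and converting them to CIE Lab. This is not the paper's Color Token Aggregator, which works on tokenizer output, and it does not touch the ColorBook or the auxiliary losses. The names `find_colors` and `rgb_to_lab` are hypothetical, and the conversion assumes standard sRGB with a D65 white point.

```python
import re

# Hypothetical pattern: matches #RRGGBB hex codes and rgb(r, g, b) triples.
COLOR_PATTERN = re.compile(
    r"#(?P<hex>[0-9a-fA-F]{6})\b"
    r"|rgb\(\s*(?P<r>\d{1,3})\s*,\s*(?P<g>\d{1,3})\s*,\s*(?P<b>\d{1,3})\s*\)"
)


def find_colors(prompt):
    """Return (matched_text, (r, g, b)) for each color spec found in the prompt."""
    found = []
    for m in COLOR_PATTERN.finditer(prompt):
        if m.group("hex"):
            h = m.group("hex")
            rgb = tuple(int(h[i:i + 2], 16) for i in (0, 2, 4))
        else:
            rgb = tuple(int(m.group(k)) for k in ("r", "g", "b"))
            if max(rgb) > 255:
                continue  # skip out-of-range channel values
        found.append((m.group(0), rgb))
    return found


def _srgb_to_linear(channel):
    """Undo the sRGB transfer curve for one 0-255 channel value."""
    c = channel / 255.0
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def _lab_f(t):
    """Piecewise cube-root function from the CIE Lab definition."""
    delta = 6 / 29
    return t ** (1 / 3) if t > delta ** 3 else t / (3 * delta ** 2) + 4 / 29


def rgb_to_lab(rgb):
    """Convert an 8-bit sRGB triple to CIE Lab (assumes D65 white point)."""
    r, g, b = (_srgb_to_linear(v) for v in rgb)
    # Linear sRGB -> CIE XYZ (D65).
    x = 0.4124564 * r + 0.3575761 * g + 0.1804375 * b
    y = 0.2126729 * r + 0.7151522 * g + 0.0721750 * b
    z = 0.0193339 * r + 0.1191920 * g + 0.9503041 * b
    # Normalize by the D65 reference white, then apply the Lab formulas.
    fx, fy, fz = _lab_f(x / 0.95047), _lab_f(y / 1.0), _lab_f(z / 1.08883)
    return (116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz))


if __name__ == "__main__":
    prompt = "a sports car painted #FF5733 next to a wall in rgb(255, 87, 51)"
    for text, rgb in find_colors(prompt):
        print(text, rgb, rgb_to_lab(rgb))
```

Both specifications in the example prompt resolve to the same RGB triple and therefore to the same Lab coordinate. This is the kind of representation-independent color identity that subword tokenization of the raw strings does not provide.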
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Digital Humanities and Scholarship · Computer Graphics and Visualization Techniques
