DEsignBench: Exploring and Benchmarking DALL-E 3 for Imagining Visual Design
Kevin Lin, Zhengyuan Yang, Linjie Li, Jianfeng Wang, Lijuan Wang

TL;DR
DEsignBench is a new benchmark for evaluating text-to-image models like DALL-E 3 in authentic visual design scenarios, combining human and GPT-4V evaluations to assess technical and creative design capabilities.
Contribution
We introduce DEsignBench, a comprehensive benchmark for assessing T2I models in design contexts, including a novel automatic evaluation method using GPT-4V.
Findings
DALL-E 3 performs well in image-text alignment and aesthetic quality.
GPT-4V-based evaluation correlates strongly with human judgments.
Design-specific capabilities like layout and color harmony are effectively assessed.
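As a rough illustration of how agreement between an automatic judge (such as GPT-4V) and human raters could be quantified, the sketch below computes a raw agreement rate and Cohen's kappa over pairwise preference votes. This is not the paper's evaluation code: the function names, the "A"/"B" vote encoding, and the toy votes are all assumptions made for the example, and no model is called.

```python
from collections import Counter
from typing import Sequence


def agreement_rate(human: Sequence[str], auto: Sequence[str]) -> float:
    """Fraction of comparisons where both raters picked the same image."""
    if len(human) != len(auto) or not human:
        raise ValueError("vote lists must be non-empty and of equal length")
    return sum(h == a for h, a in zip(human, auto)) / len(human)


def cohens_kappa(human: Sequence[str], auto: Sequence[str]) -> float:
    """Agreement corrected for the agreement expected by chance."""
    n = len(human)
    observed = agreement_rate(human, auto)
    human_counts, auto_counts = Counter(human), Counter(auto)
    labels = set(human_counts) | set(auto_counts)
    # Chance agreement: probability both raters pick the same label independently.
    expected = sum(human_counts[k] * auto_counts[k] for k in labels) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters always give the same single label
    return (observed - expected) / (1.0 - expected)


# Hypothetical toy votes (made up for illustration, not results from the paper).
# Each entry says which of two candidate images, "A" or "B", the rater preferred.
human_votes = ["A", "A", "B", "A", "B", "B"]
auto_votes = ["A", "B", "B", "A", "B", "A"]

rate = agreement_rate(human_votes, auto_votes)
kappa = cohens_kappa(human_votes, auto_votes)
print(f"agreement rate: {rate:.2f}, Cohen's kappa: {kappa:.2f}")
```

Kappa is included because raw agreement can look high purely by chance when one option dominates the votes; a real study would also need many more comparisons than this toy list.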
Abstract
We introduce DEsignBench, a text-to-image (T2I) generation benchmark tailored for visual design scenarios. Recent T2I models, including DALL-E 3 and others, have demonstrated remarkable capabilities in generating photorealistic images that align closely with textual inputs. While the allure of creating visually captivating images is undeniable, our emphasis extends beyond mere aesthetic pleasure. We aim to investigate the potential of using these powerful models in authentic design contexts. In pursuit of this goal, we develop DEsignBench, which incorporates test samples designed to assess T2I models on both "design technical capability" and "design application scenario." Each of these two dimensions is supported by a diverse set of specific design categories. We explore DALL-E 3 together with other leading T2I models on DEsignBench, resulting in a comprehensive visual gallery for…
Taxonomy
Topics: Aesthetic Perception and Analysis · Visual Attention and Saliency Detection · Digital Media and Visual Art
Methods: Sparse Evolutionary Training · ALIGN
