Can GPTs Evaluate Graphic Design Based on Design Principles?
Daichi Haraguchi, Naoto Inoue, Wataru Shimoda, Hayato Mitani, Seiichi Uchida, Kota Yamaguchi

TL;DR
This paper investigates whether GPT models can reliably evaluate graphic design quality by comparing their assessments to human judgments and heuristic metrics based on fundamental design principles.
Contribution
It shows that GPT-based evaluation, although unable to distinguish small details, correlates reasonably well with human annotations and behaves similarly to heuristic metrics based on design principles when assessing graphic design quality.
Findings
GPTs correlate well with human annotations
GPTs exhibit similar tendencies to heuristic metrics
GPTs cannot detect small detail differences
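As a rough illustration of what "correlate well with human annotations" could mean in practice, the sketch below computes a rank correlation between per-design human ratings and model scores. The summary does not say which correlation statistic the authors use, so Spearman's rank correlation is an assumption here, and the score lists are invented placeholder values, not data from the paper.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical scores for five designs (illustrative values only).
human_scores = [4.2, 3.1, 2.5, 4.8, 1.9]
gpt_scores = [4.0, 3.5, 2.0, 5.0, 2.2]

rho = spearman(human_scores, gpt_scores)
print(f"Spearman correlation: {rho:.2f}")
```

A rank-based statistic is used in this sketch because human ratings and model scores need not share a scale; only their ordering of the designs is compared.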
Abstract
Recent advancements in foundation models show promising capability in graphic design generation. Several studies have started employing Large Multimodal Models (LMMs) to evaluate graphic designs, assuming that LMMs can properly assess their quality, but it is unclear if the evaluation is reliable. One way to evaluate the quality of graphic design is to assess whether the design adheres to fundamental graphic design principles, which are the designer's common practice. In this paper, we compare the behavior of GPT-based evaluation and heuristic evaluation based on design principles using human annotations collected from 60 subjects. Our experiments reveal that, while GPTs cannot distinguish small details, they have a reasonably good correlation with human annotation and exhibit a similar tendency to heuristic metrics based on design principles, suggesting that they are indeed capable of…
Taxonomy
Topics: Intelligent Tutoring Systems and Adaptive Learning
