UICrit: Enhancing Automated Design Evaluation with a UICritique Dataset

Peitong Duan; Chin-yi Chen; Gang Li; Bjoern Hartmann; Yang Li

arXiv:2407.08850·cs.HC·August 15, 2024

TL;DR

This paper introduces UICrit, a dataset of 3,059 design critiques and quality ratings for 983 mobile UIs, and uses it to improve LLM-generated UI feedback. The authors also note potential applications in training reward models and fine-tuning multi-modal LLMs.

Contribution

The paper presents a new dataset of UI critiques and shows that applying it yields a 55% performance gain in LLM-generated UI feedback.

Findings

1. 55% performance improvement in LLM-generated UI feedback
2. Dataset contains 3,059 critiques for 983 mobile UIs
3. Potential for training reward models and fine-tuning multi-modal LLMs

Abstract

Automated UI evaluation can be beneficial for the design process; for example, to compare different UI designs, or conduct automated heuristic evaluation. LLM-based UI evaluation, in particular, holds the promise of generalizability to a wide variety of UI types and evaluation tasks. However, current LLM-based techniques do not yet match the performance of human evaluators. We hypothesize that automatic evaluation can be improved by collecting a targeted UI feedback dataset and then using this dataset to enhance the performance of general-purpose LLMs. We present a targeted dataset of 3,059 design critiques and quality ratings for 983 mobile UIs, collected from seven experienced designers. We carried out an in-depth analysis to characterize the dataset's features. We then applied this dataset to achieve a 55% performance gain in LLM-generated UI feedback via various few-shot and visual…

Code & Models

Repositories

google-research-datasets/uicrit (official)

Taxonomy

Topics: Software Reliability and Analysis Research · Embedded Systems Design Techniques · Adversarial Robustness in Machine Learning