Qualitative Evaluation of LLM-Designed GUI

Bartosz Sawicki; Tomasz Les; Dariusz Parzych; Aleksandra Wycisk-Ficek; Pawel Trebacz; Pawel Zawadzki

arXiv:2601.22759 · cs.HC · February 2, 2026

Open Access

TL;DR

This study qualitatively assesses how well large language models design user interfaces. It finds strengths in layout generation and limitations in accessibility and interactivity, and it emphasizes the need for human oversight.

Contribution

The paper evaluates LLM-generated GUIs across three models and three interface types, showing both their potential and their current limitations in usability and adaptability.

Findings

01 LLMs effectively generate structured layouts.

02 LLMs face challenges in meeting accessibility standards.

03 LLMs can only partially tailor interfaces to user personas.

Abstract

As generative artificial intelligence advances, Large Language Models (LLMs) are being explored for automated graphical user interface (GUI) design. This study investigates the usability and adaptability of LLM-generated interfaces by analysing their ability to meet diverse user needs. The experiments used three models that were state of the art in January 2025 (OpenAI GPT o3-mini-high, DeepSeek R1, and Anthropic Claude 3.5 Sonnet) to generate mockups for three interface types: a chat system, a technical team panel, and a manager dashboard. Expert evaluations revealed that while LLMs are effective at creating structured layouts, they face challenges in meeting accessibility standards and providing interactive functionality. Further testing showed that LLMs could partially tailor interfaces for different user personas but lacked deeper contextual understanding. The results…

Taxonomy

Topics: Persona Design and Applications · AI in Service Interactions · Technology Use by Older Adults