Widget2Code: From Visual Widgets to UI Code via Multimodal LLMs
Houston H. Zhang, Tao Zhang, Baoze Lin, Yuanqi Xue, Yincheng Zhu, Huan Liu, Li Gu, Linfeng Ye, Ziqiang Wang, Xinxin Zuo, Yang Wang, Yuanhao Yu, Zhixiang Chi

TL;DR
This paper introduces Widget2Code, a benchmark and multimodal approach for converting compact, context-free UI widgets into executable code. It targets two obstacles specific to widgets: dense layouts under strict spatial constraints, and proprietary designs that lack accessible markup.
Contribution
It formalizes the Widget2Code task, creates a widget-specific benchmark, and develops a baseline system with a domain-specific language and compiler for multi-platform code generation.
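To make the idea of a domain-specific language plus a multi-platform compiler concrete, here is a minimal sketch. It is not the paper's DSL or compiler: the node types `Text` and `Stack`, the functions `to_html` and `to_swiftui`, and the sample widget are all hypothetical names chosen for illustration. The sketch only shows the general pattern of describing a widget once as a small tree and emitting code for more than one target from that tree.

```python
from dataclasses import dataclass, field
from html import escape


# Hypothetical DSL nodes (assumptions, not the paper's actual DSL).
@dataclass
class Text:
    content: str


@dataclass
class Stack:
    direction: str  # "row" or "column"
    children: list = field(default_factory=list)


def to_html(node):
    """Compile a DSL tree to an HTML string using flexbox containers."""
    if isinstance(node, Text):
        return f"<span>{escape(node.content)}</span>"
    inner = "".join(to_html(child) for child in node.children)
    return f'<div style="display:flex;flex-direction:{node.direction}">{inner}</div>'


def to_swiftui(node, indent=0):
    """Compile the same DSL tree to SwiftUI-style source text."""
    pad = "    " * indent
    if isinstance(node, Text):
        # Escape backslashes and double quotes for a Swift string literal.
        text = node.content.replace("\\", "\\\\").replace('"', '\\"')
        return f'{pad}Text("{text}")'
    container = "HStack" if node.direction == "row" else "VStack"
    body = "\n".join(to_swiftui(child, indent + 1) for child in node.children)
    return f"{pad}{container} {{\n{body}\n{pad}}}"


# A made-up widget: a title above a row holding two labels.
widget = Stack("column", [Text("Weather"), Stack("row", [Text("72°"), Text("Sunny")])])

if __name__ == "__main__":
    print(to_html(widget))
    print(to_swiftui(widget))
```

Keeping the tree independent of any one target means a model only has to predict the intermediate description, and each backend stays a small, deterministic function. Whether the paper's system is organized this way is an assumption of this sketch.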
Findings
Multimodal LLMs (MLLMs) outperform specialized UI2Code methods but still produce unreliable output.
The baseline improves visual fidelity and code accuracy.
The infrastructure supports multi-platform UI code generation.
Abstract
User interface to code (UI2Code) aims to generate executable code that can faithfully reconstruct a given input UI. Prior work focuses largely on web pages and mobile screens, leaving app widgets underexplored. Unlike web or mobile UIs with rich hierarchical context, widgets are compact, context-free micro-interfaces that summarize key information through dense layouts and iconography under strict spatial constraints. Moreover, while (image, code) pairs are widely available for web or mobile UIs, widget designs are proprietary and lack accessible markup. We formalize this setting as the Widget-to-Code (Widget2Code) task and introduce an image-only widget benchmark with fine-grained, multi-dimensional evaluation metrics. Benchmarking shows that although generalized multimodal large language models (MLLMs) outperform specialized UI2Code methods, they still produce unreliable and visually…
Taxonomy
Topics: Data Visualization and Analytics · Interactive and Immersive Displays · Usability and User Interface Design
