Interaction2Code: Benchmarking MLLM-based Interactive Webpage Code Generation from Interactive Prototyping

Jingyu Xiao; Yuxuan Wan; Yintong Huo; Zixin Wang; Xinyi Xu; Wenxuan Wang; Zhiyao Xu; Yuhang Wang; Michael R. Lyu

arXiv:2411.03292 · cs.SE · March 3, 2026


TL;DR

This paper introduces the Interaction2Code benchmark to evaluate multimodal large language models on generating interactive web pages from prototypes, highlighting current limitations and proposing strategies for improvement.

Contribution

It formulates the Interaction-to-Code task, creates a comprehensive benchmark with diverse interactions, and proposes enhancement strategies to improve MLLMs' performance on interactive webpage generation.

Findings

01. MLLMs struggle more with generating the interactive parts of a page than with generating the full page.

02. Ten failure types are identified in current models.

03. The proposed strategies improve interaction understanding and generation.

Abstract

Multimodal Large Language Models (MLLMs) have demonstrated remarkable performance on the design-to-code task, i.e., generating UI code from UI mock-ups. However, existing benchmarks contain only static web pages for evaluation and ignore dynamic interactions, limiting the practicality, usability, and user engagement of the generated webpages. To bridge these gaps, we present the first systematic investigation of MLLMs in generating interactive webpages. Specifically, we formulate the Interaction-to-Code task and establish the Interaction2Code benchmark, encompassing 127 unique webpages and 374 distinct interactions across 15 webpage types and 31 interaction categories. Through comprehensive experiments utilizing state-of-the-art (SOTA) MLLMs, evaluated via both automatic metrics and human assessments, we identify four critical limitations of MLLMs on the Interaction-to-Code task: (1)…


Code & Models

Repositories

webpai/interaction2code (Official)
