GuideWeb: A Benchmark for Automatic In-App Guide Generation on Real-World Web UIs
Chengguang Gan, Yoshihiro Tsujii, Yunhao Liang, Tatsunori Mori, Shiwen Ni, Hiroki Itoh

TL;DR
GuideWeb introduces a benchmark for automatic generation of in-app guides on real-world web pages, addressing the challenge of maintaining guidance as website layouts evolve, and evaluates the performance of a new guide generation model.
Contribution
This work presents GuideWeb, a novel benchmark and evaluation suite for automatic in-app guide generation on real-world web UIs, including a new model called GuideWeb Agent.
Findings
GuideWeb Agent achieves 30.79% accuracy in guide target element prediction.
GuideWeb Agent reaches BLEU scores of 44.94 for intent generation and 21.34 for guide-text generation.
Existing methods perform significantly worse, indicating the task's difficulty.
Abstract
Digital Adoption Platforms (DAPs) provide web-based overlays that deliver operation guidance and contextual hints to help users navigate complex websites. Although modern DAP tools enable non-experts to author such guidance, maintaining these guides remains labor-intensive because website layouts and functionalities evolve continuously, which requires repeated manual updates and re-annotation. In this work, we introduce GuideWeb, a new benchmark for automatic in-app guide generation on real-world web UIs. GuideWeb formulates the task as producing page-level guidance by selecting guide target elements grounded in the webpage and generating concise guide text aligned with user intent. We also propose a comprehensive evaluation suite that jointly measures the accuracy of guide target element selection and the quality of generated intents and guide texts. Experiments show…
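To make the task formulation concrete, the sketch below shows one possible way to represent page-level guidance and to score guide target element selection. It is an illustration only: the `Guide` record, its field names, and the `target_match_rate` function are hypothetical, and the exact-match criterion on an element identifier is an assumption, since this summary does not state how GuideWeb defines a correct prediction.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Guide:
    """One guide entry for a page. All field names are hypothetical."""
    element_id: str  # identifier of the guide target element on the page
    intent: str      # short description of the user intent
    text: str        # concise guide text shown to the user


def target_match_rate(predicted: list[Guide], reference: list[Guide]) -> float:
    """Fraction of reference target elements that also appear in the predictions.

    Assumes exact match on element_id; the benchmark's own matching
    rule may differ.
    """
    if not reference:
        return 0.0
    predicted_ids = {g.element_id for g in predicted}
    hits = sum(1 for g in reference if g.element_id in predicted_ids)
    return hits / len(reference)
```

Intent and guide-text quality would be scored separately with a text-overlap metric such as BLEU, which is omitted here to keep the sketch short.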
Taxonomy
Topics: Web Data Mining and Analysis · Recommender Systems and Techniques · Spreadsheets and End-User Computing
