HelpViz: Automatic Generation of Contextual Visual Mobile Tutorials from Text-Based Instructions
Mingyuan Zhong, Gang Li, Peggy Chi, Yang Li

TL;DR
HelpViz automatically converts text instructions from the web into interactive visual mobile tutorials by parsing actions, executing them on Android emulators, and synthesizing multimedia tutorials that adapt to the user's progress.
Contribution
This paper introduces HelpViz, a novel system that automates the creation of contextual visual tutorials from text instructions using parsing, simulation, and synthesis techniques.
Findings
Improved robustness in tutorial execution.
Participants preferred HelpViz-generated tutorials.
Cost-effective, scalable tutorial generation.
Abstract
We present HelpViz, a tool for generating contextual visual mobile tutorials from text-based instructions that are abundant on the web. HelpViz transforms text instructions to graphical tutorials in batch, by extracting a sequence of actions from each text instruction through an instruction parsing model, and executing the extracted actions on a simulation infrastructure that manages an array of Android emulators. The automatic execution of each instruction produces a set of graphical and structural assets, including images, videos, and metadata such as clicked elements for each step. HelpViz then synthesizes a tutorial by combining parsed text instructions with the generated assets, and contextualizes the tutorial to user interaction by tracking the user's progress and highlighting the next step. Our experiments with HelpViz indicate that our pipeline improved tutorial execution…
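The last two stages described in the abstract, synthesizing a tutorial from parsed steps plus generated assets and then tracking the user's progress to highlight the next step, can be illustrated with a minimal sketch. This is not the authors' implementation: the names `Step`, `Tutorial`, `synthesize`, `observe`, and `next_step`, the action labels, the element ids, and the screenshot paths are all hypothetical, and matching on an exact (action, target) pair is an assumption made only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Step:
    """One tutorial step: parsed text plus assets from automatic execution."""
    text: str             # instruction text for this step
    action: str           # hypothetical action label, e.g. "click"
    target: str           # hypothetical id of the UI element acted on
    screenshot: str = ""  # hypothetical path to a captured image


@dataclass
class Tutorial:
    steps: List[Step]
    position: int = 0     # index of the next step the user should perform

    def observe(self, action: str, target: str) -> bool:
        """Advance only when the observed user action matches the expected step."""
        if self.position >= len(self.steps):
            return False
        expected = self.steps[self.position]
        if (action, target) == (expected.action, expected.target):
            self.position += 1
            return True
        return False

    def next_step(self) -> Optional[Step]:
        """Return the step to highlight, or None when the tutorial is finished."""
        if self.position < len(self.steps):
            return self.steps[self.position]
        return None


def synthesize(parsed: List[Tuple[str, str, str]], screenshots: List[str]) -> Tutorial:
    """Pair each parsed (text, action, target) triple with its captured asset."""
    steps = [
        Step(text=text, action=action, target=target, screenshot=shot)
        for (text, action, target), shot in zip(parsed, screenshots)
    ]
    return Tutorial(steps)


# Hypothetical two-step instruction and assets, for illustration only.
tutorial = synthesize(
    [("Open Settings", "click", "settings_icon"),
     ("Tap Wi-Fi", "click", "wifi_row")],
    ["step0.png", "step1.png"],
)
tutorial.observe("click", "settings_icon")  # user completes the first step
highlight = tutorial.next_step()            # the step the tutorial would highlight next
```

In this sketch an action that does not match the expected step leaves the position unchanged, so the same step stays highlighted until the user performs it.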
Taxonomy
Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Mobile Learning in Education
