Predicting and Explaining Mobile UI Tappability with Vision Modeling and Saliency Analysis
Eldon Schoop, Xin Zhou, Gang Li, Zhourong Chen, Björn Hartmann, Yang Li

TL;DR
This paper introduces a deep learning approach that predicts tappability in mobile UI screenshots using only pixel data, and employs interpretability techniques to explain predictions and aid designers.
Contribution
The paper presents a pixel-based tappability prediction model and pairs it with two interpretability methods, XRAI saliency maps and k-Nearest Neighbors retrieval, so that designers can understand the model's predictions and get actionable design feedback.
Findings
High accuracy in tappability prediction from pixels
Effective visualization of influential regions via XRAI
Identification of similar UIs with contrasting tappability perceptions
Abstract
We use a deep learning based approach to predict whether a selected element in a mobile UI screenshot will be perceived by users as tappable, based on pixels only instead of view hierarchies required by previous work. To help designers better understand model predictions and to provide more actionable design feedback than predictions alone, we additionally use ML interpretability techniques to help explain the output of our model. We use XRAI to highlight areas in the input screenshot that most strongly influence the tappability prediction for the selected region, and use k-Nearest Neighbors to present the most similar mobile UIs from the dataset with opposing influences on tappability perception.
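The nearest-neighbor step described in the abstract can be illustrated with a minimal sketch. It assumes each UI element is already represented by a fixed-length feature vector and a binary tappability label. The abstract does not specify the embedding, the distance metric, or the value of k, so the Euclidean distance and the function name `nearest_opposing_examples` below are hypothetical choices for illustration, not the authors' implementation.

```python
import math


def nearest_opposing_examples(query_vec, query_label, embeddings, labels, k=3):
    """Return indices of the k items closest to the query whose label differs.

    Assumption: Euclidean distance over precomputed feature vectors; the
    paper's actual feature space and metric may differ.
    """
    candidates = [
        (math.dist(query_vec, vec), idx)
        for idx, (vec, label) in enumerate(zip(embeddings, labels))
        if label != query_label  # keep only examples with the opposing label
    ]
    candidates.sort()  # ascending distance; ties broken by dataset index
    return [idx for _, idx in candidates[:k]]


# Toy data: 2-D vectors standing in for learned UI embeddings (hypothetical).
embeddings = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [5.0, 5.0], [0.5, 0.0]]
labels = [1, 0, 0, 0, 1]  # 1 = perceived tappable, 0 = perceived not tappable

# Query: an element perceived as tappable; retrieve the closest "not tappable" items.
print(nearest_opposing_examples([0.0, 0.0], 1, embeddings, labels, k=2))  # [1, 2]
```

Filtering to the opposing label before ranking means a designer sees the most visually similar examples that users perceived differently, which is the contrast the abstract describes.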
Taxonomy
Topics: Image and Video Quality Assessment · Innovative Human-Technology Interaction · Green IT and Sustainability
