Morae: Proactively Pausing UI Agents for User Choices
Yi-Hao Peng, Dingzeyu Li, Jeffrey P. Bigham, Amy Pavel

TL;DR
Morae is a proactive UI agent that uses multimodal models to identify decision points during tasks and pauses to involve blind and low-vision (BLV) users in choices, improving task success and preference alignment.
Contribution
This paper introduces Morae, a novel UI agent that proactively pauses for user choices using multimodal models, enhancing user agency and task outcomes for BLV users.
Findings
Morae increased task completion rates for BLV users compared with baseline agents.
Morae helped users select options that better matched their preferences.
Participants found Morae more helpful than baseline agents.
Abstract
User interface (UI) agents promise to make inaccessible or complex UIs easier to access for blind and low-vision (BLV) users. However, current UI agents typically perform tasks end-to-end without involving users in critical choices or making them aware of important contextual information, thus reducing user agency. For example, in our field study, a BLV participant asked to buy the cheapest available sparkling water, and the agent automatically chose one from several equally priced options, without mentioning alternative products with different flavors or better ratings. To address this problem, we introduce Morae, a UI agent that automatically identifies decision points during task execution and pauses so that users can make choices. Morae uses large multimodal models to interpret user queries alongside UI code and screenshots, and prompt users for clarification when there is a choice…
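The pause-at-a-decision-point idea can be illustrated with a minimal sketch based on the sparkling water example. This is not Morae's implementation: Morae relies on large multimodal models over UI code and screenshots, whereas the sketch substitutes a simple tie check on price. The `Option` type, the `cheapest_candidates` and `choose` functions, the `ask_user` callback, and all product names, prices, and ratings are hypothetical and invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Option:
    # Hypothetical product record; the fields are illustrative assumptions.
    name: str
    price: float
    rating: float


def cheapest_candidates(options: List[Option]) -> List[Option]:
    """Return every option tied for the lowest price."""
    lowest = min(o.price for o in options)
    return [o for o in options if o.price == lowest]


def choose(options: List[Option],
           ask_user: Callable[[List[Option]], Option]) -> Option:
    """Act automatically only when the query fully determines the choice.

    If several options satisfy "cheapest" equally well, that is treated as
    a decision point: the agent pauses and hands the choice to the user
    instead of silently picking one.
    """
    candidates = cheapest_candidates(options)
    if len(candidates) == 1:
        return candidates[0]
    return ask_user(candidates)


# Invented data: two equally priced options and one pricier option.
options = [
    Option("Lime sparkling water", 3.99, 4.1),
    Option("Plain sparkling water", 3.99, 4.6),
    Option("Berry sparkling water", 4.49, 4.8),
]

# Stand-in for a real prompt: this "user" prefers the better-rated option.
picked = choose(options, ask_user=lambda tied: max(tied, key=lambda o: o.rating))
print(picked.name)
```

In a real agent, `ask_user` would describe the tied options (for example, their flavors and ratings) through the user's screen reader and wait for a reply. The callback is kept separate here so that the decision-point check stays independent of how the user is prompted.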
Taxonomy
Topics: Tactile and Sensory Interactions · Digital Accessibility for Disabilities · Multimodal Machine Learning Applications
