Emerging Practices for Large Multimodal Model (LMM) Assistance for People with Visual Impairments: Implications for Design
Jingyi Xie, Rui Yu, He Zhang, Sooyeon Lee, Syed Masum Billah, John M. Carroll

TL;DR
This study explores how large multimodal models assist visually impaired users, revealing that users adapt the tools to their own needs and treat them as an extension of their cognition, and discusses design implications for more effective assistive AI tools.
Contribution
It provides empirical insights into user interactions with LMM-based assistive tools and suggests design directions for goal-oriented, real-time AI assistance.
Findings
Users adapt tools for complex social and spatial tasks.
Tools extend users' cognition by distributing visual understanding.
Users accept current limitations and utilize broad capabilities.
Abstract
People with visual impairments perceive their environment non-visually and often use AI-powered assistive tools to obtain textual descriptions of visual information. Recent large vision-language model-based AI-powered tools like Be My AI are more capable of understanding users' inquiries in natural language and describing the scene in audible text; however, the extent to which these tools are useful to visually impaired users is currently understudied. This paper aims to fill this gap. Our study with 14 visually impaired users reveals that they are adapting these tools organically -- not only can these tools facilitate complex interactions in household, spatial, and social contexts, but they also act as an extension of users' cognition, as if the cognition were distributed in the visual information. We also found that although the tools are currently not goal-oriented, users accommodate…
Taxonomy
Topics: Digital Accessibility for Disabilities · Tactile and Sensory Interactions
