VRSight: An AI-Driven Scene Description System to Improve Virtual Reality Accessibility for Blind People
Daniel Killough, Justin Feng, Zheng Xue "ZX" Ching, Daniel Wang, Rithvik Dyava, Yapeng Tian, Yuhang Zhao

TL;DR
VRSight is an AI-driven system that enhances VR accessibility for blind users by recognizing scenes and providing audio feedback without requiring developer modifications.
Contribution
The paper introduces VRSight, an end-to-end VR scene recognition system, and DISCOVR, a VR-specific dataset, to improve accessibility for blind users.
Findings
Participants successfully used VRSight for social VR tasks.
VRSight effectively recognizes virtual objects and scenes.
The system improves VR accessibility for blind users.
Abstract
Virtual Reality (VR) is inaccessible to blind people. While research has investigated many techniques to enhance VR accessibility, they require additional developer effort to integrate. As such, most mainstream VR apps remain inaccessible as the industry de-prioritizes accessibility. We present VRSight, an end-to-end system that recognizes VR scenes post hoc through a set of AI models (e.g., object detection, depth estimation, LLM-based atmosphere interpretation) and generates tone-based, spatial audio feedback, empowering blind users to interact in VR without developer intervention. To enable virtual element detection, we further contribute DISCOVR, a VR dataset consisting of 30 virtual object classes from 17 social VR apps, substituting real-world datasets that remain not applicable to VR contexts. Nine participants used VRSight to explore an off-the-shelf VR app (Rec Room),…
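To make the detection-to-audio idea in the abstract concrete, here is a minimal sketch of how one detected object with an estimated depth might be turned into left/right channel gains. It is an illustration under stated assumptions, not VRSight's actual implementation: the `Detection` fields, the `spatial_cue` function, the constant-power panning law, the linear distance falloff, and the 10 m cutoff are all hypothetical choices made for this example.

```python
import math
from dataclasses import dataclass


@dataclass
class Detection:
    """Hypothetical detector output for one virtual object (assumed fields)."""
    label: str
    x_center: float  # horizontal box center, normalized to [0, 1] across the frame
    depth_m: float   # estimated distance to the object, in meters


def spatial_cue(det: Detection, max_depth_m: float = 10.0) -> tuple[float, float]:
    """Return (left_gain, right_gain) for a tone representing the detection.

    Assumptions: constant-power panning from the horizontal position and a
    linear falloff with distance that reaches silence at max_depth_m.
    """
    # Map the horizontal position to a pan value in [-1, 1] (left to right).
    pan = max(-1.0, min(1.0, 2.0 * det.x_center - 1.0))
    # Constant-power panning: the angle sweeps from 0 (left) to pi/2 (right).
    angle = (pan + 1.0) * math.pi / 4.0
    # Closer objects are louder; clamp depth into [0, max_depth_m] first.
    depth = min(max(det.depth_m, 0.0), max_depth_m)
    closeness = 1.0 - depth / max_depth_m
    return math.cos(angle) * closeness, math.sin(angle) * closeness


if __name__ == "__main__":
    # A hypothetical object slightly right of center, 2.5 m away.
    left, right = spatial_cue(Detection("avatar", x_center=0.8, depth_m=2.5))
    print(f"left={left:.3f} right={right:.3f}")
```

In this sketch an object on the right side of the frame produces a louder right channel, and more distant objects are quieter overall. A real system would also need to handle elevation, multiple simultaneous objects, and per-class tones, which are left out here.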
