VRSight: An AI-Driven Scene Description System to Improve Virtual Reality Accessibility for Blind People

Daniel Killough; Justin Feng; Zheng Xue "ZX" Ching; Daniel Wang; Rithvik Dyava; Yapeng Tian; Yuhang Zhao

arXiv:2508.02958·cs.HC·August 6, 2025

TL;DR

VRSight is an AI-driven system that enhances VR accessibility for blind users by recognizing scenes and providing audio feedback without requiring developer modifications.

Contribution

The paper introduces VRSight, an end-to-end VR scene recognition system, and DISCOVR, a VR-specific dataset, to improve VR accessibility for blind users.

Findings

01 Participants successfully used VRSight for social VR tasks

02 VRSight effectively recognizes virtual objects and scenes

03 The system improves VR accessibility for blind users

Abstract

Virtual Reality (VR) is inaccessible to blind people. While research has investigated many techniques to enhance VR accessibility, they require additional developer effort to integrate. As such, most mainstream VR apps remain inaccessible as the industry de-prioritizes accessibility. We present VRSight, an end-to-end system that recognizes VR scenes post hoc through a set of AI models (e.g., object detection, depth estimation, LLM-based atmosphere interpretation) and generates tone-based, spatial audio feedback, empowering blind users to interact in VR without developer intervention. To enable virtual element detection, we further contribute DISCOVR, a VR dataset consisting of 30 virtual object classes from 17 social VR apps, replacing real-world datasets that do not apply to VR contexts. Nine participants used VRSight to explore an off-the-shelf VR app (Rec Room),…
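
The abstract names the pipeline stages (object detection, depth estimation, spatial audio feedback) but does not say how a detection becomes a sound. The sketch below is only an illustration of one plausible mapping, not the authors' implementation: it turns a detected object's horizontal position and estimated depth into left and right channel gains using standard constant-power panning and a linear distance falloff. The `Detection` type, the `spatial_cue` function, the 90-degree field of view, and the 10-metre range are all hypothetical choices made for this example.

```python
import math
from dataclasses import dataclass


@dataclass
class Detection:
    """Hypothetical detector output for one virtual object."""
    label: str
    x_center: float  # horizontal centre of the bounding box, 0.0 (left) to 1.0 (right)
    depth_m: float   # estimated distance to the object (unit assumed to be metres)


def spatial_cue(det, horizontal_fov_deg=90.0, max_range_m=10.0):
    """Map a detection to stereo gains (assumed mapping, for illustration only)."""
    # Approximate azimuth by linear interpolation across the field of view.
    azimuth_deg = (det.x_center - 0.5) * horizontal_fov_deg

    # Normalise to a pan value in [-1, 1], then apply constant-power panning.
    pan = max(-1.0, min(1.0, azimuth_deg / (horizontal_fov_deg / 2.0)))
    theta = (pan + 1.0) * math.pi / 4.0  # 0 (hard left) to pi/2 (hard right)
    left, right = math.cos(theta), math.sin(theta)

    # Linear distance attenuation, clamped so far objects are silent, not negative.
    gain = max(0.0, 1.0 - det.depth_m / max_range_m)

    return {
        "label": det.label,
        "azimuth_deg": azimuth_deg,
        "left_gain": left * gain,
        "right_gain": right * gain,
    }


if __name__ == "__main__":
    print(spatial_cue(Detection("door", x_center=0.5, depth_m=0.0)))
    print(spatial_cue(Detection("avatar", x_center=1.0, depth_m=5.0)))
```

Under these assumptions, an object at the centre of the view plays equally in both ears, and an object at the right edge plays almost entirely in the right ear, at reduced volume as its estimated depth grows.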
