ViDDAR: Vision Language Model-Based Task-Detrimental Content Detection for Augmented Reality
Yanming Xiu, Tim Scargill, Maria Gorlatova

TL;DR
This paper introduces ViDDAR, a vision-language-model-based system that detects task-detrimental virtual content in augmented reality (AR). It targets two attack types, obstruction and information manipulation, with the goal of protecting user task performance.
Contribution
ViDDAR is the first system to utilize vision-language models for detecting harmful virtual content in AR, combining deep learning with a user-edge-cloud architecture for effective real-time monitoring.
Findings
Detects obstruction attacks with 92.15% accuracy
Detects information manipulation attacks with 82.46% accuracy
Detects obstruction attacks with a latency of 533 ms
Abstract
In Augmented Reality (AR), virtual content enhances user experience by providing additional information. However, improperly positioned or designed virtual content can be detrimental to task performance, as it can impair users' ability to accurately interpret real-world information. In this paper, we examine two types of task-detrimental virtual content: obstruction attacks, in which virtual content prevents users from seeing real-world objects, and information manipulation attacks, in which virtual content interferes with users' ability to accurately interpret real-world information. We provide a mathematical framework to characterize these attacks and create a custom open-source dataset for attack evaluation. To address these attacks, we introduce ViDDAR (Vision language model-based Task-Detrimental content Detector for Augmented Reality), a comprehensive full-reference system that…
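As a rough illustration of what an obstruction attack means geometrically, the sketch below measures how much of a real-world object's bounding box is covered by a piece of virtual content. This is not ViDDAR's method or the paper's mathematical framework, neither of which is given in this summary. The `Box` type, the function names, and the 0.5 threshold are all hypothetical choices made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in pixel coordinates (hypothetical helper)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def area(self) -> float:
        # Clamp to zero so inverted (non-overlapping) boxes have no area.
        width = max(0.0, self.x_max - self.x_min)
        height = max(0.0, self.y_max - self.y_min)
        return width * height


def occluded_fraction(real_object: Box, virtual_content: Box) -> float:
    """Fraction of the real object's box covered by the virtual content's box."""
    overlap = Box(
        max(real_object.x_min, virtual_content.x_min),
        max(real_object.y_min, virtual_content.y_min),
        min(real_object.x_max, virtual_content.x_max),
        min(real_object.y_max, virtual_content.y_max),
    )
    object_area = real_object.area()
    return overlap.area() / object_area if object_area > 0 else 0.0


def is_obstruction(real_object: Box, virtual_content: Box,
                   threshold: float = 0.5) -> bool:
    """Flag an obstruction when coverage reaches the (assumed) threshold."""
    return occluded_fraction(real_object, virtual_content) >= threshold


# Example: an overlay covering the right half of a 100x100 px object.
stop_sign = Box(100, 100, 200, 200)
overlay = Box(150, 100, 250, 200)
far_away = Box(300, 300, 400, 400)
```

In this example, `occluded_fraction(stop_sign, overlay)` evaluates to 0.5, so the overlay is flagged at the default threshold, while `far_away` does not overlap the object at all. A box-overlap test like this says nothing about whether the hidden object matters for the user's task, which is the kind of judgment the summary attributes to the vision-language model.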
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Robotics and Automated Systems · Augmented Reality Applications
