EmBARDiment: an Embodied AI Agent for Productivity in XR
Riccardo Bovo, Steven Abreu, Karan Ahuja, Eric J Gonzalez, Li-Te Cheng, Mar Gonzalez-Franco

TL;DR
This paper introduces EmBARDiment, an embodied AI agent for XR that uses implicit contextual cues like user actions and eye-gaze to enable more natural and productive interactions without relying heavily on explicit prompts.
Contribution
It presents a novel attention framework that leverages implicit user context in XR, reducing the need for explicit prompts and enhancing interaction intuitiveness.
Findings
Improved natural interaction in XR environments.
Reduced reliance on explicit prompts for AI agents.
Enhanced contextual understanding through implicit cues.
Abstract
XR devices running chat-bots powered by Large Language Models (LLMs) have the potential to become always-on agents that enable much better productivity scenarios. Current screen-based chat-bots do not take advantage of the full suite of natural inputs available in XR, including inward-facing sensor data. Instead, they over-rely on explicit voice or text prompts, sometimes paired with multi-modal data dropped as part of the query. We propose a solution that leverages an attention framework that derives context implicitly from user actions, eye-gaze, and contextual memory within the XR environment. Our work minimizes the need for engineered explicit prompts, fostering grounded and intuitive interactions that glean user insights for the chat-bot.
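The abstract does not give implementation details, so the following is only a minimal sketch of how gaze-derived context might be kept in a bounded memory and prepended to a query. Every name here (`GazeSample`, `ContextMemory`, the 200 ms dwell threshold, the buffer size, the prompt layout) is a hypothetical choice for illustration, not the authors' design.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class GazeSample:
    """One hypothetical eye-tracking sample (assumed structure, not from the paper)."""
    text: str         # text fragment under the gaze point
    dwell_ms: float   # how long the gaze rested on that fragment


class ContextMemory:
    """Bounded buffer of recently attended text (illustrative sketch only)."""

    def __init__(self, min_dwell_ms: float = 200.0, max_items: int = 5):
        self.min_dwell_ms = min_dwell_ms      # assumed threshold for "was read"
        self.items = deque(maxlen=max_items)  # oldest entries drop off automatically

    def observe(self, sample: GazeSample) -> None:
        # Ignore brief glances; keep only text the user plausibly read.
        if sample.dwell_ms < self.min_dwell_ms:
            return
        # Move re-read text to the most recent position instead of duplicating it.
        if sample.text in self.items:
            self.items.remove(sample.text)
        self.items.append(sample.text)

    def build_prompt(self, user_query: str) -> str:
        # Prepend the implicit context so the query itself can stay short.
        context = "\n".join(f"- {t}" for t in self.items)
        return f"Recently attended content:\n{context}\n\nUser: {user_query}"
```

With a memory like this, a user could ask "summarize that" and the prompt sent to the chat-bot would already carry the text they had been looking at, which is the kind of reduction in explicit prompting the abstract describes.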
Taxonomy
Topics: Image Processing and 3D Reconstruction · Scheduling and Optimization Algorithms · 3D Surveying and Cultural Heritage
