Revisiting put-that-there, context aware window interactions via LLMs
Riccardo Bovo, Daniele Giunchi, Pasquale Cascarano, Eric J. Gonzalez, Mar Gonzalez-Franco

TL;DR
This paper enhances window interaction in XR environments by integrating LLMs with sensor data to enable intuitive, goal-driven, and context-aware window management through natural language and gestures.
Contribution
It introduces a novel LLM-based system that fuses environment data, application metadata, and user cues for dynamic, intent-driven window placement in XR.
Findings
Supports natural language and gesture commands for window management
Enables goal-centric reasoning for application and layout inference
Facilitates seamless, intent-driven interaction in immersive XR environments
Abstract
We revisit Bolt's classic "Put-That-There" concept for modern head-mounted displays by pairing Large Language Models (LLMs) with the XR sensor and tech stack. The agent fuses (i) a semantically segmented 3-D environment, (ii) live application metadata, and (iii) users' verbal, pointing, and head-gaze cues to issue JSON window-placement actions. As a result, users can manage a panoramic workspace through: (1) explicit commands ("Place Google Maps on the coffee table"), (2) deictic speech plus gestures ("Put that there"), or (3) high-level goals ("I need to send a message"). Unlike traditional explicit interfaces, our system supports one-to-many action mappings and goal-centric reasoning, allowing the LLM to dynamically infer relevant applications and layout decisions, including interrelationships across tools. This enables seamless, intent-driven interaction without manual window juggling in…
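The abstract says the agent issues JSON window-placement actions but does not give their schema. The following is a minimal sketch of how such output might be validated before it reaches a window manager, assuming a hypothetical format: the field names (`actions`, `verb`, `app`, `anchor`), the verb set, and the example apps and surface labels are illustrative assumptions, not the authors' design. The example response shows one high-level goal mapping to more than one action.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class WindowAction:
    """One window-placement action (hypothetical schema, not from the paper)."""
    verb: str    # "place" or "close" (assumed verb set)
    app: str     # application name taken from live app metadata
    anchor: str  # label of a segmented surface; empty for "close"


ALLOWED_VERBS = {"place", "close"}  # assumption for this sketch


def parse_actions(raw, known_apps, known_anchors):
    """Parse an LLM reply and reject actions that reference unknown
    applications or surfaces that are not in the segmented scene."""
    payload = json.loads(raw)
    actions = []
    for item in payload["actions"]:
        verb = item["verb"]
        app = item["app"]
        anchor = item.get("anchor", "")
        if verb not in ALLOWED_VERBS:
            raise ValueError(f"unsupported verb: {verb!r}")
        if app not in known_apps:
            raise ValueError(f"unknown application: {app!r}")
        if verb == "place" and anchor not in known_anchors:
            raise ValueError(f"unknown surface: {anchor!r}")
        actions.append(WindowAction(verb=verb, app=app, anchor=anchor))
    return actions


# Hypothetical scene and app inventory for illustration only.
KNOWN_APPS = {"Messages", "Contacts", "Google Maps"}
KNOWN_ANCHORS = {"coffee table", "desk", "wall"}

# A made-up reply to the goal "I need to send a message":
# one goal, two placement actions.
EXAMPLE_RESPONSE = """
{"actions": [
  {"verb": "place", "app": "Messages", "anchor": "desk"},
  {"verb": "place", "app": "Contacts", "anchor": "wall"}
]}
"""

if __name__ == "__main__":
    for action in parse_actions(EXAMPLE_RESPONSE, KNOWN_APPS, KNOWN_ANCHORS):
        print(action)
```

Validating against the current app list and surface labels is one way to keep a language model from placing a window on an object that was never segmented; whether the described system performs such a check is not stated in the abstract.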
Taxonomy
Topics: Interactive and Immersive Displays · Speech and dialogue systems · Augmented Reality Applications
