Let's Give a Voice to Conversational Agents in Virtual Reality
Michele Yin, Gabriel Roccabruna, Abhinav Azad, Giuseppe Riccardi

TL;DR
This paper introduces an open-source architecture that simplifies building voice-enabled conversational agents for virtual reality, so that agents from different domains can be given immersive, spoken interaction.
Contribution
It provides a flexible, plug-and-play framework for developing multimodal conversational agents in virtual environments, supporting custom speech models and domain adaptation.
Findings
Developed two digital health conversational prototypes in Unity
Enabled voice-based interaction on both VR headsets and non-immersive displays
Facilitated easy integration of different domain-specific agents
Abstract
The dialogue experience with conversational agents can be greatly enhanced with multimodal and immersive interactions in virtual reality. In this work, we present an open-source architecture with the goal of simplifying the development of conversational agents operating in virtual environments. The architecture offers the possibility of plugging in conversational agents of different domains and adding custom or cloud-based Speech-To-Text and Text-To-Speech models to make the interaction voice-based. Using this architecture, we present two conversational prototypes operating in the digital health domain developed in Unity for both non-immersive displays and VR headsets.
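The plug-in idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (their prototypes are built in Unity); it is a hypothetical Python outline that assumes three swappable components, one each for Speech-To-Text, the domain agent, and Text-To-Speech. All class and method names here (SpeechToText, DialogueAgent, TextToSpeech, VoicePipeline, handle_turn, and the stand-in classes) are invented for illustration.

```python
from dataclasses import dataclass
from typing import Protocol


class SpeechToText(Protocol):
    """Hypothetical interface: any custom or cloud-based STT model."""
    def transcribe(self, audio: bytes) -> str: ...


class DialogueAgent(Protocol):
    """Hypothetical interface: a conversational agent for some domain."""
    def respond(self, user_text: str) -> str: ...


class TextToSpeech(Protocol):
    """Hypothetical interface: any custom or cloud-based TTS model."""
    def synthesize(self, text: str) -> bytes: ...


@dataclass
class VoicePipeline:
    """Chains the three components for one voice-based dialogue turn."""
    stt: SpeechToText
    agent: DialogueAgent
    tts: TextToSpeech

    def handle_turn(self, audio_in: bytes) -> bytes:
        user_text = self.stt.transcribe(audio_in)   # speech -> text
        reply_text = self.agent.respond(user_text)  # text -> agent reply
        return self.tts.synthesize(reply_text)      # reply -> speech


# Stand-in components so the sketch runs without real speech models.
class FakeSTT:
    def transcribe(self, audio: bytes) -> str:
        return audio.decode("utf-8")  # pretend the audio bytes are the transcript


class FakeTTS:
    def synthesize(self, text: str) -> bytes:
        return text.encode("utf-8")  # pretend the encoded text is audio


class EchoHealthAgent:
    def respond(self, user_text: str) -> str:
        return f"You said: {user_text}"


class GreetingAgent:
    def respond(self, user_text: str) -> str:
        return "Hello! How are you feeling today?"


pipeline = VoicePipeline(FakeSTT(), EchoHealthAgent(), FakeTTS())
reply_audio = pipeline.handle_turn(b"I slept badly")

# Swapping the domain agent leaves the speech components untouched.
other_pipeline = VoicePipeline(FakeSTT(), GreetingAgent(), FakeTTS())
other_reply = other_pipeline.handle_turn(b"hi")
```

Because each component is referenced only through its interface, replacing a cloud STT service with a local model, or one domain agent with another, would change a single constructor argument under this design.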
Taxonomy
Topics: Speech and dialogue systems · AI in Service Interactions
