Vocal Sandbox: Continual Learning and Adaptation for Situated Human-Robot Collaboration
Jennifer Grannen, Siddharth Karamcheti, Suvir Mirchandani, Percy Liang, Dorsa Sadigh

TL;DR
Vocal Sandbox is a framework that enables robots to learn and adapt continually through multi-modal human teaching, reducing the supervision users must provide and enabling more complex collaborative behaviors in situated environments.
Contribution
The paper introduces a novel, interpretable learning framework allowing real-time multi-level human-robot teaching and adaptation in situated tasks.
Findings
Reduced active supervision by 22.1% compared to baselines
Achieved 19.7% more complex autonomous behaviors
Users rated the system's ease of use 20.6% higher
Abstract
We introduce Vocal Sandbox, a framework for enabling seamless human-robot collaboration in situated environments. Systems in our framework are characterized by their ability to adapt and continually learn at multiple levels of abstraction from diverse teaching modalities such as spoken dialogue, object keypoints, and kinesthetic demonstrations. To enable such adaptation, we design lightweight and interpretable learning algorithms that allow users to build an understanding and co-adapt to a robot's capabilities in real-time, as they teach new behaviors. For example, after demonstrating a new low-level skill for "tracking around" an object, users are provided with trajectory visualizations of the robot's intended motion when asked to track a new object. Similarly, users teach high-level planning behaviors through spoken dialogue, using pretrained language models to synthesize behaviors…
Taxonomy
Topics: Modular Robots and Swarm Intelligence · Social Robot Interaction and HRI · Reinforcement Learning in Robotics
