A Comparison of Visualisation Methods for Disambiguating Verbal Requests in Human-Robot Interaction
Elena Sibirtseva, Dimosthenis Kontogiorgos, Olov Nykvist, Hakan Karaoguz, Iolanda Leite, Joakim Gustafson, Danica Kragic

TL;DR
This study compares three visualisation methods for disambiguating verbal requests in human-robot interaction, focusing on accuracy, engagement, and user preference in real-time workspace augmentation.
Contribution
It introduces and empirically evaluates three visualisation techniques for reference disambiguation in a human-robot interaction context.
Findings
Significant differences in accuracy and engagement across conditions
No significant difference in task completion time
Participants preferred augmented reality over other methods
Abstract
Picking up objects requested by a human user is a common task in human-robot interaction. When multiple objects match the user's verbal description, the robot needs to clarify which object the user is referring to before executing the action. Previous research has focused on perceiving the user's multimodal behaviour to complement verbal commands or on minimising the number of follow-up questions to reduce task time. In this paper, we propose a system for reference disambiguation based on visualisation and compare three methods to disambiguate natural language instructions. In a controlled experiment with a YuMi robot, we investigated real-time augmentations of the workspace in three conditions -- mixed reality, augmented reality, and a monitor as the baseline -- using objective measures such as time and accuracy, and subjective measures like engagement, immersion, and display interference.…
