NatSGLD: A Dataset with Speech, Gesture, Logic, and Demonstration for Robot Learning in Natural Human-Robot Interaction
Snehesh Shrestha, Yantian Zha, Saketh Banagiri, Ge Gao, Yiannis Aloimonos, Cornelia Fermüller

TL;DR
The paper introduces NatSGLD, a comprehensive multimodal dataset with speech, gestures, demonstrations, and logical annotations, designed to enhance complex human-robot interaction research.
Contribution
It provides a novel dataset with multimodal commands, demonstrations, and formal task annotations, addressing limitations of existing simpler datasets for advanced HRI tasks.
Findings
Enables research in multimodal instruction following
Supports plan recognition and reinforcement learning
Provides detailed annotations for complex HRI tasks
Abstract
Recent advances in multimodal Human-Robot Interaction (HRI) datasets emphasize the integration of speech and gestures, allowing robots to absorb explicit knowledge and tacit understanding. However, existing datasets primarily focus on elementary tasks like object pointing and pushing, limiting their applicability to complex domains. They prioritize simpler human command data but place less emphasis on training robots to correctly interpret tasks and respond appropriately. To address these gaps, we present the NatSGLD dataset, which was collected using a Wizard of Oz (WoZ) method, where participants interacted with a robot they believed to be autonomous. NatSGLD records humans' multimodal commands (speech and gestures), each paired with a demonstration trajectory and a Linear Temporal Logic (LTL) formula that provides a ground-truth interpretation of the commanded tasks. This dataset…
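To make the pairing of a demonstration trajectory with an LTL formula concrete, here is a minimal sketch of checking a formula against a finite trace. It is not the dataset's tooling or file format: the tuple-based formula encoding, the `holds` function, and the proposition names `pot_on_stove` and `stove_on` are all hypothetical, chosen only for illustration. The sketch uses finite-trace semantics (the trace must be non-empty, and "next" is false at the last step).

```python
from typing import List, Set

# Hypothetical encoding: a formula is a nested tuple such as
# ("until", ("not", ("ap", "stove_on")), ("ap", "pot_on_stove")).
# A trace is a non-empty list of sets; each set holds the atomic
# propositions that are true at that time step.
Trace = List[Set[str]]


def holds(formula: tuple, trace: Trace, i: int = 0) -> bool:
    """Return True if `formula` holds on `trace` starting at step `i`."""
    op = formula[0]
    if op == "ap":  # atomic proposition
        return formula[1] in trace[i]
    if op == "not":
        return not holds(formula[1], trace, i)
    if op == "and":
        return holds(formula[1], trace, i) and holds(formula[2], trace, i)
    if op == "or":
        return holds(formula[1], trace, i) or holds(formula[2], trace, i)
    if op == "next":  # strong next: false when there is no next step
        return i + 1 < len(trace) and holds(formula[1], trace, i + 1)
    if op == "eventually":
        return any(holds(formula[1], trace, j) for j in range(i, len(trace)))
    if op == "always":
        return all(holds(formula[1], trace, j) for j in range(i, len(trace)))
    if op == "until":  # formula[1] must hold until formula[2] becomes true
        for j in range(i, len(trace)):
            if holds(formula[2], trace, j):
                return True
            if not holds(formula[1], trace, j):
                return False
        return False
    raise ValueError(f"unknown operator: {op!r}")


# Hypothetical task: "do not turn the stove on until the pot is on it,
# and eventually turn the stove on".
spec = (
    "and",
    ("until", ("not", ("ap", "stove_on")), ("ap", "pot_on_stove")),
    ("eventually", ("ap", "stove_on")),
)

good_trace = [set(), {"pot_on_stove"}, {"pot_on_stove", "stove_on"}]
bad_trace = [{"stove_on"}, {"pot_on_stove", "stove_on"}]

print(holds(spec, good_trace))  # True
print(holds(spec, bad_trace))   # False
```

In this sketch the formula plays the role of the ground-truth task interpretation, and a demonstration counts as consistent with the command when `holds` returns True on its trace.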
Taxonomy
Topics: Natural Language Processing Techniques · Robotics and Automated Systems
Methods: Wizard: Unsupervised goats tracking algorithm · Focus
