Understanding Acoustic Patterns of Human Teachers Demonstrating Manipulation Tasks to Robots

Akanksha Saran; Kush Desai; Mai Lee Chang; Rudolf Lioutikov; Andrea Thomaz; Scott Niekum

arXiv:2211.00352·cs.RO·November 2, 2022

TL;DR

This paper investigates the acoustic signals from human teachers demonstrating manipulation tasks to robots, analyzing speech features to understand the conveyed information and its potential to enhance robot learning from demonstrations.

Contribution

It characterizes the acoustic features of human demonstrations, revealing how speech conveys semantic content and expressive cues across different teaching conditions.

Findings

01. Teachers convey similar semantic concepts across conditions.

02. Speech duration and expressiveness vary with demonstration context.

03. Audio signals contain rich information beneficial for robot learning.

Abstract

Humans use audio signals in the form of spoken language or verbal reactions effectively when teaching new skills or tasks to other humans. While demonstrations allow humans to teach robots in a natural way, learning from trajectories alone does not leverage other available modalities including audio from human teachers. To effectively utilize audio cues accompanying human demonstrations, first it is important to understand what kind of information is present and conveyed by such cues. This work characterizes audio from human teachers demonstrating multi-step manipulation tasks to a situated Sawyer robot using three feature types: (1) duration of speech used, (2) expressiveness in speech or prosody, and (3) semantic content of speech. We analyze these features along four dimensions and find that teachers convey similar semantic concepts via spoken words for different conditions of (1)…
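As a rough illustration of the first feature type (duration of speech used), the sketch below estimates how many seconds of a recording contain sound above an energy threshold. This is not the authors' method, which the excerpt above does not describe; the function name `speech_duration`, the 25 ms frame length, and the RMS threshold are all assumptions chosen for the example, and the input is a synthetic tone rather than real speech.

```python
import math


def speech_duration(samples, sample_rate, frame_ms=25, threshold=0.02):
    """Return the seconds of audio whose per-frame RMS energy exceeds `threshold`.

    Hypothetical helper: frame length and threshold are illustrative
    assumptions, not values taken from the paper.
    """
    frame_len = max(1, int(sample_rate * frame_ms / 1000))
    active_frames = 0
    # Walk over non-overlapping frames and count those with enough energy.
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        rms = math.sqrt(sum(x * x for x in frame) / frame_len)
        if rms > threshold:
            active_frames += 1
    return active_frames * frame_len / sample_rate


# Synthetic input: 1 s of silence followed by 1 s of a 220 Hz tone.
sample_rate = 8000
silence = [0.0] * sample_rate
tone = [0.5 * math.sin(2 * math.pi * 220 * n / sample_rate)
        for n in range(sample_rate)]
duration = speech_duration(silence + tone, sample_rate)
print(f"estimated active duration: {duration:.2f} s")
```

A simple energy threshold like this would also count non-speech noise (for example, robot motor sounds), so a real analysis would likely need a proper voice activity detector.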

Taxonomy

Topics: Speech and dialogue systems