That Sounds Right: Auditory Self-Supervision for Dynamic Robot Manipulation

Abitha Thankaraj; Lerrel Pinto

arXiv:2210.01116 · cs.RO · October 4, 2022 · 1 citation

TL;DR

This paper uses sound, recorded with commodity contact microphones, as the sensing modality for dynamic robot manipulation. Self-supervised pretraining on audio yields a 34.5% lower behavior-prediction MSE than plain supervised learning and a 54.3% lower MSE than visual training.

Contribution

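The work proposes sound, an often ignored source of information, as the data source for dynamic manipulation. It collects a dataset of 25k interaction-sound pairs across five dynamic tasks and shows that self-supervised pretraining on audio improves behavior prediction.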

Findings

01. Self-supervised pretraining yields a 34.5% lower MSE than plain supervised learning.

02. Self-supervised audio models reach a 54.3% lower MSE than visual training.

03. Robots achieve 11.5% better performance in dynamic tasks using sound-driven models.
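
The MSE figures in findings 01 and 02 are relative reductions: the gap between a baseline's error and the model's error, divided by the baseline's error. The sketch below illustrates that arithmetic only. The function names and all numbers in it are made up for illustration and are not taken from the paper or its repository.

```python
def mse(predictions, targets):
    """Mean squared error between two equal-length, non-empty sequences."""
    if len(predictions) != len(targets) or not targets:
        raise ValueError("need two non-empty sequences of equal length")
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


def relative_reduction(baseline_mse, model_mse):
    """Percentage by which model_mse is lower than baseline_mse."""
    if baseline_mse <= 0:
        raise ValueError("baseline_mse must be positive")
    return 100.0 * (baseline_mse - model_mse) / baseline_mse


# Hypothetical values, chosen only so the arithmetic is easy to follow.
targets = [0.0, 1.0, 2.0, 3.0]
baseline_preds = [1.0, 2.0, 3.0, 4.0]  # every prediction is off by 1.0
model_preds = [0.5, 1.5, 2.5, 3.5]     # every prediction is off by 0.5

baseline = mse(baseline_preds, targets)           # 1.0
model = mse(model_preds, targets)                 # 0.25
reduction = relative_reduction(baseline, model)   # 75.0 (percent)
```

Read this way, a 34.5% lower MSE means the model's error is about 0.655 times the baseline's error, whatever the absolute scale of the errors.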

Abstract

Learning to produce contact-rich, dynamic behaviors from raw sensory data has been a longstanding challenge in robotics. Prominent approaches primarily focus on using visual or tactile sensing, where unfortunately one fails to capture high-frequency interaction, while the other can be too delicate for large-scale data collection. In this work, we propose a data-centric approach to dynamic manipulation that uses an often ignored source of information: sound. We first collect a dataset of 25k interaction-sound pairs across five dynamic tasks using commodity contact microphones. Then, given this data, we leverage self-supervised learning to accelerate behavior prediction from sound. Our experiments indicate that this self-supervised 'pretraining' is crucial to achieving high performance, with a 34.5% lower MSE than plain supervised learning and a 54.3% lower MSE over visual training.…

Code & Models

Repositories

abitha-thankaraj/audio-robot-learning (PyTorch, official)

Taxonomy

Topics: Music Technology and Sound Studies · Music and Audio Processing · Speech and Audio Processing