Self-Supervised Visuo-Tactile Pretraining to Locate and Follow Garment Features

Justin Kerr; Huang Huang; Albert Wilcox; Ryan Hoque; Jeffrey Ichnowski; Roberto Calandra; Ken Goldberg

arXiv:2209.13042·cs.RO·August 1, 2023

Open Access

TL;DR

This paper introduces a self-supervised framework for learning visuo-tactile representations that let robots perform five garment perception and manipulation tasks with 73-100% success rates, reducing reliance on human-labeled datasets.

Contribution

The authors propose Self-Supervised Visuo-Tactile Pretraining (SSVTP), a pretraining method that embeds spatially aligned visual and tactile images in a shared latent space to improve deformable object manipulation.

Findings

01 Achieved 73-100% success rate across five tasks

02 Enabled cross-modal perception without fine-tuning

03 Demonstrated effective feature localization and anomaly detection

Abstract

Humans make extensive use of vision and touch as complementary senses, with vision providing global information about the scene and touch measuring local information during manipulation without suffering from occlusions. While prior work demonstrates the efficacy of tactile sensing for precise manipulation of deformables, it typically relies on supervised, human-labeled datasets. We propose Self-Supervised Visuo-Tactile Pretraining (SSVTP), a framework for learning multi-task visuo-tactile representations in a self-supervised manner through cross-modal supervision. We design a mechanism that enables a robot to autonomously collect precisely spatially-aligned visual and tactile image pairs, then train visual and tactile encoders to embed these pairs into a shared latent space using a cross-modal contrastive loss. We apply this latent space to downstream perception and control of deformable…
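To make the idea of a cross-modal contrastive loss over a shared latent space concrete, here is a minimal NumPy sketch. It assumes a symmetric InfoNCE-style objective, which is one common choice and is not confirmed by this summary as the exact loss used in the paper. The function names, the `temperature` value, and the random stand-in embeddings are all hypothetical; in practice the embeddings would come from the visual and tactile encoders.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)

def log_softmax(logits, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))

def cross_modal_contrastive_loss(visual_emb, tactile_emb, temperature=0.1):
    """Symmetric InfoNCE-style loss (an assumed form, not the paper's confirmed loss).

    Row i of visual_emb and row i of tactile_emb are treated as a spatially
    aligned positive pair; every other row in the batch acts as a negative.
    """
    v = l2_normalize(visual_emb)
    t = l2_normalize(tactile_emb)
    logits = v @ t.T / temperature            # (batch, batch) similarity matrix
    idx = np.arange(len(v))
    # Visual-to-tactile: each visual row should pick out its own tactile column.
    loss_v2t = -log_softmax(logits, axis=1)[idx, idx].mean()
    # Tactile-to-visual: each tactile column should pick out its own visual row.
    loss_t2v = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_v2t + loss_t2v)

# Hypothetical stand-in embeddings: 8 pairs in a 16-dimensional latent space.
rng = np.random.default_rng(0)
visual = rng.normal(size=(8, 16))
aligned_tactile = visual + 0.01 * rng.normal(size=(8, 16))   # nearly matching pairs
unrelated_tactile = rng.normal(size=(8, 16))                 # no pairing structure

aligned_loss = cross_modal_contrastive_loss(visual, aligned_tactile)
unrelated_loss = cross_modal_contrastive_loss(visual, unrelated_tactile)
```

In this sketch, pairs that already sit close together in the latent space yield a lower loss than unrelated pairs, which is the signal that would push two encoders toward a shared embedding during training.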

Taxonomy

Topics: Tactile and Sensory Interactions · Advanced Sensor and Energy Harvesting Materials · Interactive and Immersive Displays