Visual-Tactile Cross-Modal Data Generation using Residue-Fusion GAN with Feature-Matching and Perceptual Losses

Shaoyu Cai; Kening Zhu; Yuki Ban; Takuji Narumi

arXiv:2107.05468·cs.CV·July 13, 2021

TL;DR

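This paper proposes a conditional GAN with a residue-fusion (RF) module, trained with additional feature-matching (FM) and perceptual losses, for cross-modal visual-tactile data generation. The visual data are images of material surfaces and the tactile data are accelerometer signals induced by sliding a pen across those surfaces. The authors note that this kind of translation could be important for robotic operation.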

Contribution

It proposes a Residue-Fusion GAN: a conditional-GAN structure extended with a residue-fusion (RF) module and trained with additional feature-matching (FM) and perceptual losses for cross-modal visual-tactile data translation.

Findings

1. Improved classification accuracy with generated tactile data.
2. Enhanced visual similarity between generated and real data.
3. Significant performance gains from the residue-fusion (RF) module and the feature-matching (FM) and perceptual losses.

Abstract

Existing psychophysical studies have revealed that the cross-modal visual-tactile perception is common for humans performing daily activities. However, it is still challenging to build the algorithmic mapping from one modality space to another, namely the cross-modal visual-tactile data translation/generation, which could be potentially important for robotic operation. In this paper, we propose a deep-learning-based approach for cross-modal visual-tactile data generation by leveraging the framework of the generative adversarial networks (GANs). Our approach takes the visual image of a material surface as the visual data, and the accelerometer signal induced by the pen-sliding movement on the surface as the tactile data. We adopt the conditional-GAN (cGAN) structure together with the residue-fusion (RF) module, and train the model with the additional feature-matching (FM) and perceptual…
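The loss combination named in the abstract can be illustrated with a small sketch. The truncated abstract does not give the exact formulation, so everything below is an assumption: the feature-matching term is written in its common form (mean L1 distance between intermediate discriminator features of real and generated samples), the perceptual term is treated as an already computed scalar, and the names `feature_matching_loss`, `generator_objective`, `lambda_fm`, and `lambda_perc` as well as the default weights are hypothetical, not taken from the paper or its repository.

```python
import numpy as np


def feature_matching_loss(real_feats, fake_feats):
    """Mean L1 distance between paired feature maps, averaged over layers.

    Assumption: real_feats and fake_feats are lists of arrays holding
    intermediate discriminator activations for real and generated samples.
    """
    if len(real_feats) != len(fake_feats):
        raise ValueError("real_feats and fake_feats must have the same number of layers")
    per_layer = [float(np.mean(np.abs(r - f))) for r, f in zip(real_feats, fake_feats)]
    return sum(per_layer) / len(per_layer)


def generator_objective(adv_loss, fm_loss, perceptual_loss,
                        lambda_fm=10.0, lambda_perc=10.0):
    """Weighted sum of the three generator terms.

    The weights are hypothetical placeholders, not values from the paper.
    """
    return adv_loss + lambda_fm * fm_loss + lambda_perc * perceptual_loss


# Toy usage with random arrays standing in for discriminator features.
rng = np.random.default_rng(0)
real = [rng.normal(size=(4, 8)), rng.normal(size=(4, 16))]
fake = [rng.normal(size=(4, 8)), rng.normal(size=(4, 16))]
fm = feature_matching_loss(real, fake)
total = generator_objective(adv_loss=0.7, fm_loss=fm, perceptual_loss=0.3)
```

A perceptual loss is often computed the same way as the feature-matching term, except that the features come from a fixed pretrained network rather than from the discriminator. Whether the paper follows that convention is not stated in the text available here.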

Code & Models

Repositories

shaoyuca/Visual-Tactile-Data-Generation (tf, official)