Variable-Speed Teaching-Playback as Real-World Data Augmentation for Imitation Learning

Nozomu Masuya; Hiroshi Sato; Koki Yamane; Takuya Kusume; Sho Sakaino; Toshiaki Tsuji

arXiv:2412.03252 · cs.RO · May 7, 2025

TL;DR

This paper introduces a novel real-world data augmentation method using variable-speed teaching-playback for imitation learning, enhancing data diversity and success rates in robot manipulation tasks involving force control.

Contribution

It proposes a new data augmentation technique applicable to force control that preserves real-world data advantages, improving imitation learning performance.

Findings

01. 55% increase in success rate with variable-speed augmentation

02. Improved accuracy in environmental reactions at different speeds

03. Enhanced data quantity and quality for imitation learning

Abstract

Because imitation learning relies on human demonstrations in hard-to-simulate settings, the inclusion of force control in this method has resulted in a shortage of training data, even with a simple change in speed. Although the field of data augmentation has addressed the lack of data, conventional methods of data augmentation for robot manipulation are limited to simulation-based methods or downsampling for position control. This paper proposes a novel method of data augmentation that is applicable to force control and preserves the advantages of real-world datasets. We applied teaching-playback at variable speeds as real-world data augmentation to increase both the quantity and quality of environmental reactions at variable speeds. An experiment was conducted on bilateral control-based imitation learning using a method of imitation learning equipped with position-force control. We…
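The playback step described in the abstract can be illustrated with a minimal sketch, assuming the recorded demonstration is stored as uniformly sampled position commands. The function name `retime_commands` and its parameters are hypothetical and do not come from the paper. The sketch only re-times the commands sent to the robot; in keeping with the abstract's emphasis on real-world data, the responses and forces for the augmented episode would be recorded from the physical robot during playback rather than interpolated.

```python
import numpy as np

def retime_commands(t, q_cmd, speed):
    """Re-time recorded position commands for playback at `speed` times the original rate.

    t      : (N,) uniformly sampled timestamps of the demonstration (assumption)
    q_cmd  : (N, D) recorded position commands
    speed  : playback speed factor (2.0 = twice as fast, 0.5 = half speed)

    Returns the new timestamps and the re-timed commands, sampled at the
    original control period. Hypothetical helper, not the authors' code.
    """
    if speed <= 0:
        raise ValueError("speed must be positive")
    t = np.asarray(t, dtype=float)
    q_cmd = np.asarray(q_cmd, dtype=float)

    dt = t[1] - t[0]                       # control period of the recording
    duration = (t[-1] - t[0]) / speed      # playback duration at the new speed
    n = int(round(duration / dt)) + 1      # number of playback samples
    t_new = t[0] + dt * np.arange(n)

    # Map each playback instant back onto the original demonstration clock.
    t_src = np.clip(t[0] + (t_new - t[0]) * speed, t[0], t[-1])

    # Linearly interpolate each command dimension at the mapped instants.
    q_new = np.column_stack(
        [np.interp(t_src, t, q_cmd[:, j]) for j in range(q_cmd.shape[1])]
    )
    return t_new, q_new
```

Calling `retime_commands(t, q_cmd, 2.0)` on a one-second recording yields a half-second command sequence that ends at the same final pose. Each such sequence would be played back on the robot, and the measured reactions logged as a new training episode.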

Taxonomy

Topics: Human Pose and Action Recognition · Neural Networks and Applications · Model Reduction and Neural Networks

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings