Variable-Speed Teaching-Playback as Real-World Data Augmentation for Imitation Learning
Nozomu Masuya, Hiroshi Sato, Koki Yamane, Takuya Kusume, Sho Sakaino, Toshiaki Tsuji

TL;DR
This paper introduces a real-world data augmentation method for imitation learning: human demonstrations are replayed on the robot at variable speeds (teaching-playback), which increases data diversity and raises success rates in manipulation tasks that involve force control.
Contribution
It proposes a new data augmentation technique applicable to force control that preserves real-world data advantages, improving imitation learning performance.
Findings
55% increase in success rate with variable-speed augmentation
Improved accuracy in environmental reactions at different speeds
Enhanced data quantity and quality for imitation learning
Abstract
Because imitation learning relies on human demonstrations in hard-to-simulate settings, the inclusion of force control in this method has resulted in a shortage of training data, even with a simple change in speed. Although the field of data augmentation has addressed the lack of data, conventional methods of data augmentation for robot manipulation are limited to simulation-based methods or downsampling for position control. This paper proposes a novel method of data augmentation that is applicable to force control and preserves the advantages of real-world datasets. We applied teaching-playback at variable speeds as real-world data augmentation to increase both the quantity and quality of environmental reactions at variable speeds. An experiment was conducted on bilateral control-based imitation learning using a method of imitation learning equipped with position-force control. We…
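The command side of variable-speed playback can be sketched as a time-scaling of a recorded demonstration. This is only an illustrative sketch, not the authors' implementation: the function name `time_scale_trajectory`, the uniform sampling period, and the use of linear interpolation are all assumptions made here. In the method described above, the augmented data come from actually replaying such commands on the robot and recording the real environmental reactions, which this snippet does not simulate.

```python
import numpy as np


def time_scale_trajectory(times, values, speed):
    """Resample a recorded trajectory so it plays back `speed` times faster.

    Hypothetical helper for illustration only.
    times  : (N,) uniformly sampled timestamps of the demonstration [s]
    values : (N, D) recorded command values (e.g. joint angles)
    speed  : playback speed factor (> 1 is faster, < 1 is slower)
    """
    times = np.asarray(times, dtype=float)
    values = np.asarray(values, dtype=float)
    if speed <= 0:
        raise ValueError("speed must be positive")

    dt = times[1] - times[0]                      # assumed uniform sampling period
    new_duration = (times[-1] - times[0]) / speed
    n_new = int(round(new_duration / dt)) + 1     # keep the controller's sample rate
    new_times = times[0] + dt * np.arange(n_new)

    # Map each playback instant back onto the original demonstration timeline.
    source_times = times[0] + (new_times - times[0]) * speed
    source_times = np.clip(source_times, times[0], times[-1])

    # Linearly interpolate each channel at the mapped instants.
    scaled = np.column_stack(
        [np.interp(source_times, times, values[:, j]) for j in range(values.shape[1])]
    )
    return new_times, scaled


# Example: one fixed-speed demonstration expanded into several playback commands.
demo_t = np.linspace(0.0, 1.0, 101)               # 1 s recorded at 100 Hz
demo_q = np.column_stack([np.sin(demo_t), np.cos(demo_t)])
playback_commands = {s: time_scale_trajectory(demo_t, demo_q, s) for s in (0.5, 1.0, 2.0)}
```

Each entry in `playback_commands` would serve only as the reference sent to the robot; the training samples are the responses measured during that real playback.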
Taxonomy
Topics: Human Pose and Action Recognition · Neural Networks and Applications · Model Reduction and Neural Networks
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
