Learning Bidirectional Translation between Descriptions and Actions with Small Paired Data

Minori Toyoda; Kanata Suzuki; Yoshihiko Hayashi; Tetsuya Ogata

arXiv:2203.04218·cs.RO·September 27, 2022

TL;DR

This paper presents a two-stage training approach for bidirectional translation between descriptions and actions when paired data are limited: the model is pre-trained on large non-paired datasets and then fine-tuned on a small paired dataset.

Contribution

The paper introduces a two-stage training method for bidirectional translation: pre-training on large non-paired data, followed by fine-tuning on small paired data.

Findings

01. Effective bidirectional translation with limited paired data

02. Intermediate representations cluster similar actions and descriptions

03. Model performs well even with small amounts of paired data

Abstract

This study achieved bidirectional translation between descriptions and actions using small paired data from different modalities. The ability to mutually generate descriptions and actions is essential for robots to collaborate with humans in their daily lives, which generally requires a large dataset that maintains comprehensive pairs of both modality data. However, a paired dataset is expensive to construct and difficult to collect. To address this issue, this study proposes a two-stage training method for bidirectional translation. In the proposed method, we train recurrent autoencoders (RAEs) for descriptions and actions with a large amount of non-paired data. Then, we finetune the entire model to bind their intermediate representations using small paired data. Because the data used for pre-training do not require pairing, behavior-only data or a large language corpus can be used. We…
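The two training stages described in the abstract can be sketched as two loss functions. This is a minimal illustration, not the authors' implementation: the use of mean squared error for reconstruction, the squared-distance term that binds the two intermediate representations, the weight `alpha`, and all function names are assumptions made for the example. The abstract states only that the autoencoders are pre-trained on non-paired data and that fine-tuning binds their intermediate representations using paired data.

```python
from typing import Sequence

Vector = Sequence[float]


def mse(a: Vector, b: Vector) -> float:
    """Mean squared error between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def pretrain_loss(x: Vector, x_recon: Vector) -> float:
    """Stage 1 (pre-training): each autoencoder is trained on its own
    modality, so only its own reconstruction error is needed and no
    description/action pairing is required."""
    return mse(x, x_recon)


def finetune_loss(
    desc: Vector,
    desc_recon: Vector,
    act: Vector,
    act_recon: Vector,
    z_desc: Vector,
    z_act: Vector,
    alpha: float = 1.0,  # hypothetical weight on the binding term
) -> float:
    """Stage 2 (fine-tuning on paired data): keep both reconstruction
    terms and add a term that pulls the two intermediate representations
    of a paired description and action together.

    The squared-distance form of the binding term is an assumption.
    """
    reconstruction = mse(desc, desc_recon) + mse(act, act_recon)
    binding = mse(z_desc, z_act)
    return reconstruction + alpha * binding


# Toy paired example: perfect reconstructions, but the two latent
# vectors still disagree, so only the binding term contributes.
example = finetune_loss(
    desc=[1.0, 0.0], desc_recon=[1.0, 0.0],
    act=[0.5, 0.5], act_recon=[0.5, 0.5],
    z_desc=[0.0, 0.0], z_act=[1.0, 1.0],
    alpha=0.5,
)
print(example)  # 0.5
```

In this sketch the binding term is what would make translation possible in either direction: once the latent vectors of a paired description and action agree, one modality's encoder output could be passed to the other modality's decoder.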

Taxonomy

Methods: Regularized Autoencoders