Learning Flexible Translation between Robot Actions and Language Descriptions
Ozan \"Ozdemir, Matthias Kerzel, Cornelius Weber, Jae Hee Lee, Stefan, Wermter

TL;DR
This paper introduces a novel paired gated autoencoder model enabling flexible, bidirectional translation between robot actions and natural language descriptions, supporting generalization and imitation in tabletop manipulation tasks.
Contribution
The work presents a unified end-to-end model that performs flexible, bidirectional translation between actions and language without changing architecture per task, and can incorporate pretrained language models.
Findings
Effective bidirectional translation demonstrated.
Model generalizes to unseen actions and language inputs.
Supports imitation of other agents' actions.
Abstract
Handling various robot action-language translation tasks flexibly is an essential requirement for natural interaction between a robot and a human. Previous approaches require change in the configuration of the model architecture per task during inference, which undermines the premise of multi-task learning. In this work, we propose the paired gated autoencoders (PGAE) for flexible translation between robot actions and language descriptions in a tabletop object manipulation scenario. We train our model in an end-to-end fashion by pairing each action with appropriate descriptions that contain a signal informing about the translation direction. During inference, our model can flexibly translate from action to language and vice versa according to the given language signal. Moreover, with the option to use a pretrained language model as the language encoder, our model has the potential to…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsMultimodal Machine Learning Applications · Natural Language Processing Techniques · Topic Modeling
