Learning to Solve Voxel Building Embodied Tasks from Pixels and Natural Language Instructions

Alexey Skrynnik; Zoya Volovikova; Marc-Alexandre Côté; Anton Voronov; Artem Zholus; Negar Arabzadeh; Shrestha Mohanty; Milagro Teruel; Ahmed Awadallah; Aleksandr Panov; Mikhail Burtsev; Julia Kiseleva

arXiv:2211.00688·cs.AI·November 3, 2022·1 cites

TL;DR

This paper combines a language model with reinforcement learning so that embodied agents can build objects in a Minecraft-like environment from natural language instructions, addressing the need to verify that actions are feasible and relevant to the goal.

Contribution

It presents a method that generates achievable sub-goals from instructions and completes the associated sub-tasks with a pre-trained RL policy.

Findings

01. Formed the RL baseline at the IGLU 2022 competition
02. Generates achievable sub-goals from natural language instructions
03. Completes the associated sub-tasks with a pre-trained RL policy

Abstract

The adoption of pre-trained language models to generate action plans for embodied agents is a promising research strategy. However, execution of instructions in real or simulated environments requires verification of the feasibility of actions as well as their relevance to the completion of a goal. We propose a new method that combines a language model and reinforcement learning for the task of building objects in a Minecraft-like environment according to natural language instructions. Our method first generates a set of consistently achievable sub-goals from the instructions and then completes the associated sub-tasks with a pre-trained RL policy. The proposed method formed the RL baseline at the IGLU 2022 competition.
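The two-stage structure described in the abstract (generate sub-goals, then hand each one to a policy) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper or its repository: the `SubGoal` representation, the `is_feasible` rule (a block must be inside a small build zone, unoccupied, and supported from below), and the two stand-ins `toy_subgoal_generator` and `toy_policy`, which replace the language model and the pre-trained RL policy with trivial hand-written functions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Position = Tuple[int, int, int]
Grid = Dict[Position, str]  # maps (x, y, z) -> block color


@dataclass(frozen=True)
class SubGoal:
    """Hypothetical sub-goal: place one block of a given color at a position."""
    position: Position
    color: str


def is_feasible(goal: SubGoal, grid: Grid, size: int = 5, height: int = 5) -> bool:
    """Illustrative feasibility rule (an assumption, not the paper's check)."""
    x, y, z = goal.position
    in_bounds = 0 <= x < size and 0 <= z < size and 0 <= y < height
    if not in_bounds or goal.position in grid:
        return False
    # The block must rest on the ground (y == 0) or on an existing block.
    return y == 0 or (x, y - 1, z) in grid


def toy_subgoal_generator(instruction: str) -> List[SubGoal]:
    """Stand-in for the language model: parses 'stack N COLOR blocks'."""
    words = instruction.lower().split()
    count, color = int(words[1]), words[2]
    return [SubGoal((0, y, 0), color) for y in range(count)]


def toy_policy(goal: SubGoal, grid: Grid) -> None:
    """Stand-in for the pre-trained RL policy: places the block directly."""
    grid[goal.position] = goal.color


def build(
    instruction: str,
    generate: Callable[[str], List[SubGoal]],
    policy: Callable[[SubGoal, Grid], None],
) -> Tuple[Grid, List[SubGoal]]:
    """Run the pipeline: generate sub-goals, filter infeasible ones, execute the rest."""
    grid: Grid = {}
    skipped: List[SubGoal] = []
    for goal in generate(instruction):
        if is_feasible(goal, grid):
            policy(goal, grid)
        else:
            skipped.append(goal)
    return grid, skipped


if __name__ == "__main__":
    grid, skipped = build("stack 3 red blocks", toy_subgoal_generator, toy_policy)
    print(sorted(grid.items()), skipped)
```

In this sketch the policy only ever receives sub-goals that pass the feasibility check, which mirrors the separation the abstract describes between planning from language and executing with a learned controller. A real agent would act from pixels over many environment steps rather than writing to the grid directly.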

Code & Models

Repositories

iglu-contest/nlp-baselines-2022 (PyTorch)

Taxonomy

Topics: Multimodal Machine Learning Applications · Human Pose and Action Recognition · Reinforcement Learning in Robotics