Instruction Following with Goal-Conditioned Reinforcement Learning in Virtual Environments
Zoya Volovikova, Alexey Skrynnik, Petr Kuderov, Aleksandr I. Panov

TL;DR
This paper presents a hierarchical framework combining large language models and reinforcement learning to enable virtual agents to understand and execute complex language instructions across different virtual environments.
Contribution
We introduce a novel hierarchical approach that integrates language comprehension with reinforcement learning for goal-conditioned instruction following in virtual settings.
Findings
Effective in the IGLU environment for structure building
Successful task execution in the Crafter environment
Demonstrates generalizability across environments
Abstract
In this study, we address the problem of enabling an artificial intelligence agent to execute complex language instructions within virtual environments. We assume that these instructions involve intricate linguistic structures and multiple interdependent tasks, all of which must be completed to achieve the desired outcome. To manage this complexity, we propose a hierarchical framework that combines the deep language comprehension of large language models (LLMs) with the adaptive action-execution capabilities of reinforcement learning agents. The language module (based on an LLM) translates the language instruction into a high-level action plan, which is then executed by a pre-trained reinforcement learning agent. We demonstrate the effectiveness of our approach in two different environments: in IGLU, where agents are instructed to build structures, and in…
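The two-level structure described in the abstract (a language module that produces a high-level plan, and a goal-conditioned agent that executes each step) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: `stub_language_module` is a rule-based stand-in for the LLM, `stub_policy` is a hand-written stand-in for the pre-trained reinforcement learning agent, and the 2D grid with "go to x,y" instructions is a toy setting unrelated to IGLU or Crafter.

```python
from dataclasses import dataclass
from typing import List, Tuple

Position = Tuple[int, int]


@dataclass(frozen=True)
class Subgoal:
    """One high-level plan step: reach a target cell on a toy grid."""
    target: Position


def stub_language_module(instruction: str) -> List[Subgoal]:
    """Hypothetical stand-in for the LLM-based language module.

    Parses instructions of the form 'go to x,y then go to x,y'
    into an ordered list of subgoals.
    """
    subgoals = []
    for clause in instruction.lower().split("then"):
        coords = clause.replace("go to", "").strip()
        x, y = (int(v) for v in coords.split(","))
        subgoals.append(Subgoal((x, y)))
    return subgoals


def stub_policy(position: Position, subgoal: Subgoal) -> Position:
    """Hypothetical stand-in for the pre-trained goal-conditioned policy.

    Returns a unit move (dx, dy) toward the subgoal, x axis first.
    """
    dx = subgoal.target[0] - position[0]
    dy = subgoal.target[1] - position[1]
    if dx != 0:
        return (1 if dx > 0 else -1, 0)
    if dy != 0:
        return (0, 1 if dy > 0 else -1)
    return (0, 0)


def execute(instruction: str, start: Position,
            max_steps_per_subgoal: int = 50) -> List[Position]:
    """Run the hierarchy: plan once, then execute each subgoal in order."""
    position = start
    trajectory = [position]
    for subgoal in stub_language_module(instruction):
        for _ in range(max_steps_per_subgoal):
            if position == subgoal.target:
                break  # subgoal reached, hand control back to the plan
            dx, dy = stub_policy(position, subgoal)
            position = (position[0] + dx, position[1] + dy)
            trajectory.append(position)
    return trajectory


if __name__ == "__main__":
    print(execute("go to 2,0 then go to 2,1", start=(0, 0)))
```

The point of the split is that the low-level policy only ever sees one subgoal at a time, so the same policy can be reused under different planners or instruction formats; in the sketch, swapping `stub_language_module` for a different parser would leave `stub_policy` and `execute` unchanged.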
Taxonomy
Topics: Virtual Reality Applications and Impacts
