EAGER: Asking and Answering Questions for Automatic Reward Shaping in Language-guided RL

Thomas Carta; Pierre-Yves Oudeyer; Olivier Sigaud; Sylvain Lamprier

arXiv:2206.09674·cs.CL·October 14, 2022·5 cites

TL;DR

This paper introduces EAGER, a method that uses question generation and question answering to shape rewards automatically in language-guided reinforcement learning, improving sample efficiency without manually designed auxiliary objectives.

Contribution

The paper presents an automated reward shaping technique that uses question generation (QG) and question answering (QA) systems to extract auxiliary objectives from language goals in RL.

Findings

1. Improves sample efficiency in language-conditioned RL tasks.
2. Does not require manual design of auxiliary objectives.
3. Enhances exploration by guiding the agent with intrinsic rewards.

Abstract

Reinforcement learning (RL) in long-horizon, sparse-reward tasks is notoriously difficult and requires many training steps. A standard way to speed up the process is to leverage additional reward signals, shaping the reward to better guide learning. In the context of language-conditioned RL, the abstraction and generalisation properties of the language input provide opportunities for more efficient ways of shaping the reward. In this paper, we leverage this idea and propose an automated reward shaping method where the agent extracts auxiliary objectives from the general language goal. These auxiliary objectives use a question generation (QG) and question answering (QA) system: they consist of questions leading the agent to try to reconstruct partial information about the global goal using its own trajectory. When it succeeds, it receives an intrinsic reward proportional…
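
The loop described in the abstract (generate questions from the goal, try to answer them from the trajectory, reward success) can be illustrated with a minimal toy sketch. This is not the authors' implementation: the learned QG and QA systems are replaced here by simple word masking and string matching, and every name (`VOCAB`, `Question`, `generate_questions`, `answer_question`, `ShapedReward`) is hypothetical. The abstract is truncated before it says what the intrinsic reward is proportional to, so the sketch assumes a fixed `bonus` per answered question instead.

```python
from dataclasses import dataclass, field

# Hypothetical toy vocabulary: goal word -> answer category.
VOCAB = {"red": "colour", "blue": "colour", "ball": "object", "key": "object"}


@dataclass
class Question:
    text: str      # the goal with one word masked
    category: str  # category the answer must belong to
    answer: str    # the masked word, i.e. the ground truth from the goal


def generate_questions(goal, vocab):
    """Toy QG stand-in: mask each vocabulary word found in the goal."""
    words = goal.split()
    questions = []
    for i, word in enumerate(words):
        if word in vocab:
            masked = words[:i] + ["<mask>"] + words[i + 1:]
            questions.append(Question(" ".join(masked), vocab[word], word))
    return questions


def answer_question(question, trajectory, vocab):
    """Toy QA stand-in: return the most recently observed word of the
    question's category, or None if the trajectory contains none."""
    for observation in reversed(trajectory):
        for word in observation.split():
            if vocab.get(word) == question.category:
                return word
    return None


@dataclass
class ShapedReward:
    questions: list
    vocab: dict
    bonus: float = 0.1  # assumed constant; not taken from the paper
    answered: set = field(default_factory=set)

    def step(self, trajectory):
        """Return the intrinsic reward earned by the trajectory so far.
        Each question is rewarded at most once."""
        reward = 0.0
        for idx, question in enumerate(self.questions):
            if idx in self.answered:
                continue
            predicted = answer_question(question, trajectory, self.vocab)
            if predicted == question.answer:
                self.answered.add(idx)
                reward += self.bonus
        return reward


questions = generate_questions("pick up the red ball", VOCAB)
shaper = ShapedReward(questions, VOCAB)
trajectory = ["you see a blue key"]
first = shaper.step(trajectory)       # no question answered correctly yet
trajectory.append("you see a red ball")
second = shaper.step(trajectory)      # both questions now answered
print(first, second)
```

In this sketch the QA stand-in never sees the ground-truth answer; it only reads the trajectory, so a reward is earned only when the agent's own observations carry enough information to reconstruct the masked part of the goal.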

Code & Models

Repositories

flowersteam/eager (PyTorch, official)

Videos

EAGER: Asking and Answering Questions for Automatic Reward Shaping in Language-guided RL · SlidesLive

Taxonomy

Topics: Topic Modeling · Speech and dialogue systems · Natural Language Processing Techniques

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings