Actively Learning Costly Reward Functions for Reinforcement Learning

André Eberhard, Houssam Metni, Georg Fahland, Alexander Stroh, Pascal Friederich

arXiv:2211.13260 · cs.LG · November 28, 2022

TL;DR

This paper introduces ACRL, an active learning approach that models costly rewards with neural networks to significantly speed up reinforcement learning in real-world applications where reward evaluation is expensive.

Contribution

The paper presents ACRL, a novel method that replaces costly reward evaluations with learned models, enabling faster training in complex real-world environments.
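
The general idea of standing in for an expensive reward with a cheaper learned model can be illustrated with a minimal sketch. This is not the paper's algorithm: ACRL uses neural networks, whereas the sketch below swaps in a nearest-neighbour surrogate to stay dependency-free. The names `costly_reward`, `SurrogateReward`, `novelty_threshold`, and `run` are hypothetical, and the "query the oracle when the state is novel" rule is an assumed active learning criterion chosen for illustration.

```python
import math
import random


def costly_reward(state):
    """Hypothetical stand-in for an expensive simulation or experiment."""
    x, y = state
    return -((x - 0.5) ** 2 + (y + 0.25) ** 2)


class SurrogateReward:
    """Nearest-neighbour surrogate (assumption: replaces a neural network).

    The oracle is queried only for states that are far from every state
    labelled so far; otherwise the nearest labelled reward is reused.
    """

    def __init__(self, oracle, novelty_threshold=0.1):
        self.oracle = oracle
        self.novelty_threshold = novelty_threshold
        self.states = []       # states labelled by the oracle
        self.rewards = []      # oracle rewards for those states
        self.oracle_calls = 0  # number of expensive evaluations paid for

    def _nearest(self, state):
        # Return (distance, index) of the closest labelled state.
        best = (math.inf, -1)
        for i, labelled in enumerate(self.states):
            d = math.dist(labelled, state)
            if d < best[0]:
                best = (d, i)
        return best

    def __call__(self, state):
        dist, idx = self._nearest(state)
        if dist > self.novelty_threshold:
            # Novel state: pay for the true reward and store the label.
            reward = self.oracle(state)
            self.oracle_calls += 1
            self.states.append(tuple(state))
            self.rewards.append(reward)
            return reward
        # Familiar state: reuse the cheap surrogate prediction.
        return self.rewards[idx]


def run(num_steps=2000, seed=0):
    """Feed random states to the surrogate and report its cost and error."""
    rng = random.Random(seed)
    model = SurrogateReward(costly_reward)
    max_error = 0.0
    for _ in range(num_steps):
        state = (rng.uniform(-1, 1), rng.uniform(-1, 1))
        predicted = model(state)
        # The oracle is called here only to measure the surrogate's error.
        max_error = max(max_error, abs(predicted - costly_reward(state)))
    return model.oracle_calls, max_error


if __name__ == "__main__":
    calls, error = run()
    print(f"oracle calls: {calls} of 2000, max surrogate error: {error:.3f}")
```

In a reinforcement learning loop, the agent would call the surrogate in place of the reward function at every step. Lowering `novelty_threshold` trades more oracle calls for a smaller surrogate error.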

Findings

01 ACRL accelerates training by orders of magnitude in real-world tasks.

02 It enables reinforcement learning in domains with expensive reward evaluations.

03 ACRL finds non-trivial solutions in chemistry, materials science, and engineering.

Abstract

Transfer of recent advances in deep reinforcement learning to real-world applications is hindered by high data demands and thus low efficiency and scalability. Through independent improvements of components such as replay buffers or more stable learning algorithms, and through massively distributed systems, training time could be reduced from several days to several hours for standard benchmark tasks. However, while rewards in simulated environments are well-defined and easy to compute, reward evaluation becomes the bottleneck in many real-world environments, e.g., in molecular optimization tasks, where computationally demanding simulations or even experiments are required to evaluate states and to quantify rewards. Therefore, training might become prohibitively expensive without an extensive amount of computational resources and time. We propose to alleviate this problem by replacing…

Code & Models

Repositories

32af3611/ai4mat-neurips-workshop-2022 (PyTorch, official)

Taxonomy

Topics: Process Optimization and Integration · Data Stream Mining Techniques · Software Engineering Research