Actively Learning Costly Reward Functions for Reinforcement Learning
André Eberhard, Houssam Metni, Georg Fahland, Alexander Stroh, Pascal Friederich

TL;DR
This paper introduces ACRL, an active learning approach that models costly rewards with neural networks to significantly speed up reinforcement learning in real-world applications where reward evaluation is expensive.
Contribution
The paper presents ACRL, a novel method that replaces costly reward evaluations with learned models, enabling faster training in complex real-world environments.
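The idea of swapping a costly reward for a learned model can be illustrated with a toy loop. The sketch below is not the authors' implementation. It assumes a one-dimensional state, a hypothetical `costly_reward` function standing in for an expensive simulation, a nearest-neighbour surrogate in place of a neural network, a simple hill climber in place of a reinforcement learning agent, and a distance-based query rule that the source does not specify. All names are illustrative.

```python
import random


def costly_reward(x: float) -> float:
    # Hypothetical stand-in for an expensive simulation or experiment.
    return -(x - 0.7) ** 2


class NearestNeighbourReward:
    """Toy surrogate: predicts the reward of the closest labelled state."""

    def __init__(self):
        self.data = []  # list of (state, true_reward) pairs

    def add(self, x, r):
        self.data.append((x, r))

    def predict(self, x):
        nearest = min(self.data, key=lambda p: abs(p[0] - x))
        return nearest[1]

    def uncertainty(self, x):
        # Assumed proxy for uncertainty: distance to the closest labelled state.
        return min(abs(px - x) for px, _ in self.data)


def run_surrogate_loop(rounds=20, steps_per_round=50, queries_per_round=3, seed=0):
    rng = random.Random(seed)
    model = NearestNeighbourReward()
    oracle_calls = 0

    # Label a few initial states with the costly reward.
    for x in (0.0, 0.5, 1.0):
        model.add(x, costly_reward(x))
        oracle_calls += 1

    state = 0.0
    for _ in range(rounds):
        visited = []
        # The agent (here a hill climber) only ever sees the cheap surrogate.
        for _ in range(steps_per_round):
            candidate = min(1.0, max(0.0, state + rng.uniform(-0.1, 0.1)))
            if model.predict(candidate) >= model.predict(state):
                state = candidate
            visited.append(candidate)

        # Active learning step: label the visited states the model knows least about.
        visited.sort(key=model.uncertainty, reverse=True)
        for x in visited[:queries_per_round]:
            model.add(x, costly_reward(x))
            oracle_calls += 1

    return state, oracle_calls, rounds * steps_per_round


if __name__ == "__main__":
    final_state, oracle_calls, agent_steps = run_surrogate_loop()
    print(f"final state: {final_state:.3f}")
    print(f"costly evaluations: {oracle_calls} for {agent_steps} agent steps")
```

With the default arguments the agent takes 1000 steps while the costly function is called only 63 times (3 initial labels plus 3 per round), which is the kind of saving a learned reward model is meant to provide. How well the surrogate tracks the true reward depends entirely on the model and query rule chosen, both of which are placeholders here.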
Findings
ACRL accelerates training by orders of magnitude in real-world tasks.
It enables reinforcement learning in domains with expensive reward evaluations.
ACRL finds non-trivial solutions in chemistry, materials science, and engineering.
Abstract
Transfer of recent advances in deep reinforcement learning to real-world applications is hindered by high data demands and thus low efficiency and scalability. Through independent improvements of components such as replay buffers or more stable learning algorithms, and through massively distributed systems, training time could be reduced from several days to several hours for standard benchmark tasks. However, while rewards in simulated environments are well-defined and easy to compute, reward evaluation becomes the bottleneck in many real-world environments, e.g., in molecular optimization tasks, where computationally demanding simulations or even experiments are required to evaluate states and to quantify rewards. Therefore, training might become prohibitively expensive without an extensive amount of computational resources and time. We propose to alleviate this problem by replacing…
Taxonomy
Topics: Process Optimization and Integration · Data Stream Mining Techniques · Software Engineering Research
