Maximum Reward Formulation In Reinforcement Learning

Sai Krishna Gottipati; Yashaswi Pathak; Rohan Nuttall; Sahir; Raviteja Chunduru; Ahmed Touati; Sriram Ganapathi Subramanian; Matthew E. Taylor; Sarath Chandar

arXiv:2010.03744·cs.LG·December 20, 2023·6 cites

TL;DR

This paper introduces a new reinforcement learning framework focused on maximizing the maximum reward in a trajectory, which is more suitable for applications like drug discovery where the best outcome matters more than the average.
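To make the distinction concrete, the following minimal sketch compares the two trajectory-level objectives on hand-made reward sequences. The function names, the example rewards, and the choice to discount the maximum by `gamma ** t` are illustrative assumptions, not taken from the paper.

```python
from typing import Sequence


def discounted_return(rewards: Sequence[float], gamma: float = 1.0) -> float:
    """Standard RL objective: sum over t of gamma**t * r_t."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))


def max_reward(rewards: Sequence[float], gamma: float = 1.0) -> float:
    """Max-reward objective (assumed form): max over t of gamma**t * r_t."""
    return max((gamma ** t) * r for t, r in enumerate(rewards))


# Hypothetical trajectories: one with steady mediocre rewards,
# one that reaches a single high-reward state (e.g. one good molecule).
steady = [0.5, 0.5, 0.5, 0.5]
spiky = [0.0, 0.0, 0.9, 0.0]

for name, traj in [("steady", steady), ("spiky", spiky)]:
    print(name, discounted_return(traj), max_reward(traj))
```

Under the cumulative objective the steady trajectory scores higher, while under the max-reward objective the spiky trajectory does, which is the preference the TL;DR describes.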

Contribution

It formulates a novel objective for RL that maximizes the expected maximum reward, derives a new Bellman equation, and demonstrates state-of-the-art results in molecule generation.

Findings

1. Achieved state-of-the-art results in molecule generation.
2. Proved convergence of the new Bellman operators.
3. Evaluated the method on a molecule generation task that mimics a real-world drug discovery pipeline.

Abstract

Reinforcement learning (RL) algorithms typically deal with maximizing the expected cumulative return (discounted or undiscounted, finite or infinite horizon). However, several crucial applications in the real world, such as drug discovery, do not fit within this framework because an RL agent only needs to identify states (molecules) that achieve the highest reward within a trajectory and does not need to optimize for the expected cumulative return. In this work, we formulate an objective function to maximize the expected maximum reward along a trajectory, derive a novel functional form of the Bellman equation, introduce the corresponding Bellman operators, and provide a proof of convergence. Using this formulation, we achieve state-of-the-art results on the task of molecule generation that mimics a real-world drug discovery pipeline.
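As a rough illustration of what a Bellman backup for this objective might look like, the sketch below runs Q-iteration on a tiny deterministic MDP, replacing the usual `r + gamma * max Q(s', a')` with `max(r, gamma * max Q(s', a'))`. This backup form, the toy MDP, and all names are assumptions made for illustration; the abstract does not spell out the operator. The sketch also assumes non-negative rewards so that a zero-initialized table is a valid starting point.

```python
from typing import List


def max_bellman_q_iteration(
    next_state: List[List[int]],
    reward: List[List[float]],
    gamma: float = 0.9,
    iters: int = 50,
) -> List[List[float]]:
    """Q-iteration with an assumed max-reward backup on a deterministic MDP.

    next_state[s][a] is the successor state and reward[s][a] the immediate
    reward. Rewards are assumed non-negative (zero-initialized Q-table).
    """
    n_states, n_actions = len(reward), len(reward[0])
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(iters):
        # Greedy value of each state under the current estimate.
        v = [max(row) for row in q]
        # Assumed backup: keep the larger of the immediate reward and the
        # discounted best reward reachable from the successor state.
        q = [
            [max(reward[s][a], gamma * v[next_state[s][a]])
             for a in range(n_actions)]
            for s in range(n_states)
        ]
    return q


# Hypothetical 3-state chain; state 2 is absorbing with zero reward.
next_state = [[1, 2], [2, 2], [2, 2]]
reward = [[0.1, 0.3], [1.0, 0.0], [0.0, 0.0]]
q = max_bellman_q_iteration(next_state, reward)
print(q)
```

In this toy problem, action 0 in state 0 has the lower immediate reward but the higher value, because it leads to the state where the single best reward can be collected.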

Code & Models

Repositories

99andBeyond/max-bellman-toy (Official)

Taxonomy

Topics: Gene Regulatory Network Analysis · Computational Drug Discovery Methods · Innovative Microfluidic and Catalytic Techniques Innovation