REINFORCE-ING Chemical Language Models for Drug Discovery

Morgan Thomas; Albert Bou; Jose Carlos Gómez-Tamayo; Gary Tresadern; Mazen Ahmad; Gianni De Fabritiis

arXiv:2501.15971·cs.LG·November 6, 2025


Open Access

TL;DR

This paper studies reinforcement learning for chemical language models in drug discovery, starting from the REINFORCE algorithm. It examines experience replay, hill-climbing, variance-reducing baselines, and reward shaping, and proposes a regularization method and hyperparameter practices aimed at improving learning efficiency.

Contribution

It introduces a regularization method more closely aligned with REINFORCE than current standard practice, shows how RL hyperparameters can be tuned for effectiveness and efficiency, and demonstrates improved learning efficiency when Boltz2 is used as a reward model.

Findings

01. Enhanced learning efficiency on binding affinity models

02. Proposed regularization method improves RL training stability

03. RL hyperparameter tuning boosts drug discovery effectiveness

Abstract

Chemical language models, combined with reinforcement learning (RL), have shown significant promise to efficiently traverse large chemical spaces for drug discovery. However, the performance of various RL algorithms and their best practices for practical drug discovery are still unclear. Here, starting from the principles of the REINFORCE algorithm, we investigate the effect of different components from RL theory including experience replay, hill-climbing, baselines to reduce variance, and alternative reward shaping. We propose a new regularization method more aligned to REINFORCE than current standard practices, and demonstrate how RL hyperparameters can be fine-tuned for effectiveness and efficiency. Lastly, we apply our learnings to practical drug discovery by demonstrating enhanced learning efficiency on frontier binding affinity models by using Boltz2 as a reward model. We share…
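As a rough illustration of two of the components the abstract names (a baseline to reduce variance, and hill-climbing), the sketch below computes a REINFORCE-style loss for one batch of sampled sequences. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `reinforce_loss`, the batch-mean baseline, the `top_k` parameter, and the choice to compute the baseline after top-k selection are all hypothetical. NumPy is used only to show the arithmetic; in actual training the log-likelihoods would come from the language model in an autograd framework.

```python
import numpy as np


def reinforce_loss(log_probs, rewards, use_baseline=True, top_k=None):
    """REINFORCE-style loss for one batch of sampled sequences (illustrative).

    log_probs: summed token log-likelihood of each sampled sequence, shape (B,)
    rewards:   scalar reward for each sequence, shape (B,)
    """
    log_probs = np.asarray(log_probs, dtype=float)
    rewards = np.asarray(rewards, dtype=float)

    if top_k is not None:
        # Hill-climbing (assumed form): keep only the k highest-reward samples.
        keep = np.argsort(rewards)[-top_k:]
        log_probs, rewards = log_probs[keep], rewards[keep]

    if use_baseline:
        # Baseline (assumed form): subtract the batch-mean reward to reduce variance.
        advantages = rewards - rewards.mean()
    else:
        advantages = rewards

    # Negated because optimizers minimize; the gradient of this term with
    # respect to the model parameters is the REINFORCE estimator.
    return float(-(advantages * log_probs).mean())


# Example call with made-up numbers (not data from the paper).
example_loss = reinforce_loss(
    log_probs=[-1.0, -2.0, -3.0, -4.0],
    rewards=[0.1, 0.9, 0.5, 0.3],
    top_k=2,
)
```

With the baseline enabled, sequences scoring below the batch mean receive a negative weight, so their likelihood is pushed down rather than merely increased less.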

Taxonomy

Topics: Computational Drug Discovery Methods

Methods: REINFORCE