Multi-Objective Reinforcement Learning for Generating Covalent Inhibitor Candidates

Renee Gil

arXiv:2604.20019·cs.LG·April 23, 2026

TL;DR

This paper introduces a multi-objective reinforcement learning pipeline for generating covalent inhibitor candidates. Applied to EGFR and ACHE, it rediscovers known covalent inhibitors and proposes warhead motifs that are absent from its training data.

Contribution

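The study presents an RL-based generative approach in which a SMILES-based pretrained LSTM is optimized via policy gradient RL with Pareto crowding distance. The approach balances synthetic accessibility, predicted covalent activity, residue affinity, and an approximated docking score, and it generates covalent warheads not present in the training data.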

Findings

01 Rediscovers known covalent inhibitors at rates of up to 0.50% (EGFR) and 0.74% (ACHE) in 10,000-structure runs.

02 Generates structures with warhead-to-residue distances as short as 3.2 Å.

03 Spontaneously produces novel covalent warhead motifs that have support in the literature.

Abstract

Rational design of covalent inhibitors requires simultaneously optimizing multiple properties, such as binding affinity, target selectivity, or electrophilic reactivity. This presents a multi-objective problem not easily addressed by screening alone. Here we present a machine learning pipeline for generating covalent inhibitor candidates using multi-objective reinforcement learning (RL), applied to two targets: epidermal growth factor receptor (EGFR) and acetylcholinesterase (ACHE). A SMILES-based pretrained LSTM serves as the generative model, optimized via policy gradient RL with Pareto crowding distance to balance competing scoring functions including synthetic accessibility, predicted covalent activity, residue affinity, and an approximated docking score. The pipeline rediscovers known covalent inhibitors at rates of up to 0.50% (EGFR) and 0.74% (ACHE) in 10,000-structure runs, with…
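
The abstract names Pareto crowding distance as the mechanism that balances the competing scoring functions, but does not spell out the formulation. The sketch below shows one common reading, the NSGA-II style crowding distance computed over a non-dominated set; this is an assumption, not the paper's confirmed implementation. The function names (`dominates`, `pareto_front`, `crowding_distance`), the convention that every objective is maximized, and the toy score vectors are all hypothetical.

```python
from typing import List, Sequence

Scores = Sequence[Sequence[float]]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is at least as good as b on every objective and
    strictly better on at least one (all objectives maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores: Scores) -> List[int]:
    """Indices of score vectors that no other vector dominates."""
    return [
        i for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


def crowding_distance(scores: Scores) -> List[float]:
    """NSGA-II style crowding distance for each score vector.

    Boundary points on each objective get infinite distance; interior
    points accumulate the normalized gap between their two neighbors.
    """
    n = len(scores)
    if n == 0:
        return []
    dist = [0.0] * n
    for k in range(len(scores[0])):
        order = sorted(range(n), key=lambda i: scores[i][k])
        lo, hi = scores[order[0]][k], scores[order[-1]][k]
        dist[order[0]] = dist[order[-1]] = float("inf")
        if hi == lo:
            continue  # objective is constant, contributes nothing
        for pos in range(1, n - 1):
            gap = scores[order[pos + 1]][k] - scores[order[pos - 1]][k]
            dist[order[pos]] += gap / (hi - lo)
    return dist


# Toy example: two hypothetical objectives per generated molecule.
toy_scores = [(0.9, 0.1), (0.5, 0.5), (0.1, 0.9), (0.4, 0.4)]
front = pareto_front(toy_scores)
front_dist = crowding_distance([toy_scores[i] for i in front])
```

In a setup like this, molecules in sparsely populated regions of the front receive larger distances, so a reward built on them would favor diverse trade-offs rather than collapsing onto a single objective. How the paper converts these values into a policy gradient reward is not stated in the abstract.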
