Active Learning-Guided Seq2Seq Variational Autoencoder for Multi-target Inhibitor Generation

Júlia Vilalta-Mor; Alexis Molina; Laura Ortega Varga; Isaac Filella-Merce; Victor Guallar

arXiv:2506.15309 · cs.LG · June 24, 2025

TL;DR

This paper introduces an active learning-guided Seq2Seq VAE framework for multi-target drug design. The framework balances chemical diversity, molecular quality, and target affinity to generate structurally diverse pan-inhibitor candidates for three coronavirus main proteases (SARS-CoV-2, SARS-CoV, MERS-CoV).

Contribution

It presents an iterative active learning approach that integrates a Seq2Seq VAE for multi-target molecule generation, improving exploration of complex chemical spaces.

Findings

01

Successfully generated diverse pan-inhibitor candidates for coronavirus proteases.

02

Enhanced chemical space exploration through strategic filtering within the active learning pipeline.

03

Transformed sparse-reward, multi-objective drug design into an efficient computational process.

Abstract

Simultaneously optimizing molecules against multiple therapeutic targets remains a profound challenge in drug discovery, particularly due to sparse rewards and conflicting design constraints. We propose a structured active learning (AL) paradigm integrating a sequence-to-sequence (Seq2Seq) variational autoencoder (VAE) into iterative loops designed to balance chemical diversity, molecular quality, and multi-target affinity. Our method alternates between expanding chemically feasible regions of latent space and progressively constraining molecules based on increasingly stringent multi-target docking thresholds. In a proof-of-concept study targeting three related coronavirus main proteases (SARS-CoV-2, SARS-CoV, MERS-CoV), our approach efficiently generated a structurally diverse set of pan-inhibitor candidates. We demonstrate that careful timing and strategic placement of chemical…
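The loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`threshold_schedule`, `passes_all_targets`, `active_learning_loop`), the linear tightening schedule, and the convention that a lower docking score is better are all assumptions made for illustration. The callables `generate`, `feasible`, and `dock` are hypothetical stand-ins for sampling from the Seq2Seq VAE, applying chemical filters, and docking against each target. The sketch also flattens the alternation between latent-space expansion and docking-based constraint into a single loop.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

Molecule = TypeVar("Molecule")


def threshold_schedule(start: float, stop: float, n_rounds: int) -> List[float]:
    """Linearly tighten a docking threshold from `start` to `stop`.

    Assumption: a linear schedule; the source only says thresholds become
    increasingly stringent.
    """
    if n_rounds < 1:
        raise ValueError("n_rounds must be at least 1")
    if n_rounds == 1:
        return [stop]
    step = (stop - start) / (n_rounds - 1)
    return [start + i * step for i in range(n_rounds)]


def passes_all_targets(scores: Sequence[float], threshold: float) -> bool:
    """Return True only if every per-target score meets the threshold.

    Assumption: lower (more negative) docking scores are better.
    """
    return all(score <= threshold for score in scores)


def active_learning_loop(
    generate: Callable[[List[Molecule], int], List[Molecule]],
    feasible: Callable[[Molecule], bool],
    dock: Callable[[Molecule], Sequence[float]],
    seed_set: Sequence[Molecule],
    thresholds: Sequence[float],
    batch_size: int,
) -> Tuple[List[Molecule], List[Tuple[float, int]]]:
    """Run one generate/filter/augment round per threshold.

    `generate` is a hypothetical stand-in for retraining the generative
    model on the current set and sampling new molecules from it.
    """
    training_set = list(seed_set)
    history = []
    for threshold in thresholds:
        candidates = generate(training_set, batch_size)
        # Chemical feasibility filter first, then the multi-target filter.
        plausible = [m for m in candidates if feasible(m)]
        kept = [m for m in plausible if passes_all_targets(dock(m), threshold)]
        # Accepted molecules feed the next round of generation.
        training_set.extend(kept)
        history.append((threshold, len(kept)))
    return training_set, history
```

Requiring every target to pass the same threshold is what makes the reward sparse: a molecule that docks well against two proteases but poorly against the third is rejected. Starting with a loose threshold and tightening it gives early rounds enough accepted molecules to steer later generation.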

Taxonomy

Methods: Sparse Evolutionary Training