RL Boltzmann Generators for Conformer Generation in Data-Sparse Environments

Yash Patel; Ambuj Tewari

arXiv:2211.10771·q-bio.QM·November 22, 2022·1 cites

TL;DR

This paper explores reinforcement learning Boltzmann generators for conformer generation in data-sparse environments like intrinsically disordered proteins, highlighting challenges in conformer coverage and training strategies.

Contribution

The paper trains an RL-based Boltzmann generator against a Gibbs score and finds that conformer coverage remains limited, which suggests that the shortcomings of energy-only training do not depend on the modeling modality.

Findings

01. Training against the Gibbs score does not improve conformer coverage.

02. Energy-based training alone is insufficient for IDPs.

03. Mode collapse remains a challenge in conformer generation.

Abstract

The generation of conformers has long been of interest to structural chemists and biologists alike. A subset of proteins known as intrinsically disordered proteins (IDPs) do not adopt a fixed structure and must therefore also be studied through conformer generation. Unlike in the small-molecule setting, ground-truth data are sparse in the IDP setting, which undermines the many existing conformer generation methods that rely on such data for training. Boltzmann generators, trained solely on the energy function, serve as an alternative but exhibit mode collapse, which similarly precludes their direct application to IDPs. We investigate the potential of training an RL Boltzmann generator against a closely related "Gibbs score," and demonstrate that conformer coverage does not track well with such training. This suggests that the inadequacy of solely training against the energy…
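
To make the abstract's distinction between an energy-based score and conformer coverage concrete, here is a minimal Python sketch. It is not the authors' implementation. The summary above does not define the Gibbs score, so the sketch assumes a Gibbs-style score equal to the sum of Boltzmann factors over the distinct conformers found, measured relative to a reference energy. The conformer labels, the energies, and the names boltzmann_factor, gibbs_style_score, and coverage are all hypothetical.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def boltzmann_factor(energy, e_ref, temperature=300.0):
    """Unnormalized Boltzmann factor exp(-(E - E_ref) / kT)."""
    return math.exp(-(energy - e_ref) / (K_B * temperature))


def gibbs_style_score(energies, e_ref, temperature=300.0):
    """Assumed score: sum of Boltzmann factors over distinct conformers."""
    return sum(boltzmann_factor(e, e_ref, temperature) for e in energies)


def coverage(found, reference):
    """Fraction of reference conformers that appear in the found set."""
    reference = set(reference)
    return len(set(found) & reference) / len(reference)


# Hypothetical reference ensemble: conformer label -> energy (kcal/mol).
reference = {"A": 0.0, "B": 0.1, "C": 1.5, "D": 2.0}
e_ref = min(reference.values())

# Two hypothetical generator outputs, already deduplicated.
collapsed = ["A", "B"]    # only the two lowest-energy conformers
broad = ["A", "C", "D"]   # more of the ensemble, at higher energies

for name, found in [("collapsed", collapsed), ("broad", broad)]:
    score = gibbs_style_score([reference[c] for c in found], e_ref)
    print(f"{name}: score={score:.3f}, coverage={coverage(found, reference):.2f}")
```

Under these assumed numbers, the collapsed set earns the higher score while covering fewer reference conformers. That is one way a score built from Boltzmann factors can fail to track coverage, in line with the finding reported above.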

Code & Models

Repositories

yashpatel5400/clean_idp_rl
PyTorch · Official

Taxonomy

Topics: Protein Structure and Dynamics · Machine Learning in Materials Science · Gaussian Processes and Bayesian Inference
