Ligand unbinding pathway and mechanism analysis assisted by machine learning and graph methods
Simon Bray, Victor Tänzel, Steffen Wolf

TL;DR
This paper introduces two machine learning and graph-based methods to analyze protein-ligand unbinding pathways from biased simulations, improving path identification and clustering accuracy.
Contribution
It presents novel clustering approaches combining contact PCA with machine learning and graph algorithms to better understand unbinding mechanisms.
Findings
Contact PCA combined with machine learning effectively clusters unbinding trajectories.
The neighbor-net algorithm outperforms dendrogram construction when clustering biased data.
Reaction coordinate detection remains challenging in complex unbinding cases.
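The contact PCA step named in the first finding can be illustrated with a minimal sketch. It assumes the protein-ligand contact distances have already been extracted into a frames × contacts array; the function name `contact_pca`, the toy data, and the choice of two components are hypothetical and not taken from the paper.

```python
import numpy as np

def contact_pca(contact_distances, n_components=2):
    """Project contact-distance frames onto their leading principal components.

    contact_distances: array-like of shape (n_frames, n_contacts).
    Returns (projection, variances, components).
    """
    X = np.asarray(contact_distances, dtype=float)
    centered = X - X.mean(axis=0)           # remove the mean of each contact
    cov = np.cov(centered, rowvar=False)    # (n_contacts, n_contacts) covariance
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_components]
    components = eigvecs[:, order]          # columns are principal directions
    return centered @ components, eigvals[order], components

# Hypothetical toy data: 200 frames, 5 contact distances.
rng = np.random.default_rng(0)
frames = rng.normal(size=(200, 5))
frames[:, 0] *= 5.0  # let one contact dominate the variance

projection, variances, components = contact_pca(frames, n_components=2)
```

The low-dimensional `projection` is the kind of representation in which unbinding paths could then be identified and used as labels for a classifier; that later step is not shown here.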
Abstract
We present two methods to reveal protein-ligand unbinding mechanisms in biased unbinding simulations by clustering trajectories into ensembles representing unbinding paths. The first approach is based on a contact principal component analysis for reducing the dimensionality of the input data, followed by identification of unbinding paths and training a machine learning model for trajectory clustering. The second approach clusters trajectories according to their pairwise mean Euclidean distance, employing the neighbor-net algorithm, which takes into account input data bias in the distance set and is superior to dendrogram construction. Finally, we describe a more complex case where the reaction coordinate relevant for path identification is a single intra-ligand hydrogen bond, highlighting the challenges involved in unbinding path reaction coordinate detection.
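The pairwise mean Euclidean distance used by the second approach can be sketched as below. This is one plausible reading, assuming every trajectory is a sequence of feature vectors with the same number of frames, so that frames can be compared index by index; the function names and toy trajectories are hypothetical.

```python
import math

def mean_euclidean_distance(traj_a, traj_b):
    """Average the frame-by-frame Euclidean distance between two trajectories."""
    if len(traj_a) != len(traj_b):
        raise ValueError("trajectories must have the same number of frames")
    total = sum(math.dist(a, b) for a, b in zip(traj_a, traj_b))
    return total / len(traj_a)

def distance_matrix(trajectories):
    """Build the symmetric matrix of pairwise mean Euclidean distances."""
    n = len(trajectories)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = mean_euclidean_distance(trajectories[i], trajectories[j])
            matrix[i][j] = matrix[j][i] = d
    return matrix

# Hypothetical toy trajectories: two frames each, two features per frame.
trajs = [
    [(0.0, 0.0), (1.0, 0.0)],
    [(0.0, 0.0), (1.0, 0.0)],
    [(3.0, 4.0), (4.0, 4.0)],
]
D = distance_matrix(trajs)
```

A symmetric matrix like `D` is the usual input for distance-based clustering such as neighbor-net or dendrogram construction; the neighbor-net step itself is not shown.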
Taxonomy
Topics: Computational Drug Discovery Methods · Protein Structure and Dynamics · Microbial Metabolic Engineering and Bioproduction
