Fast conformational clustering of extensive molecular dynamics simulation data
Simon Hunkler, Kay Diederichs, Oleksandra Kukharenko, Christine Peter

TL;DR
This paper introduces a fast, unsupervised workflow combining dimensionality reduction and density-based clustering algorithms to efficiently analyze and categorize conformations in extensive molecular dynamics simulation data.
Contribution
The study presents a novel combination of cc_analysis, encodermap, and HDBSCAN for rapid conformational clustering, including the first application of cc_analysis to molecular simulation data.
Findings
Effective clustering of diverse molecular systems
High percentage of frames assigned to meaningful clusters
Demonstrated advantages and limitations across test cases
Abstract
We present an unsupervised data processing workflow that is specifically designed to obtain a fast conformational clustering of long molecular dynamics simulation trajectories. In this approach we combine two dimensionality reduction algorithms (cc_analysis and encodermap) with a density-based spatial clustering algorithm (HDBSCAN). The proposed scheme benefits from the strengths of the three algorithms while avoiding most of the drawbacks of the individual methods. Here the cc_analysis algorithm is for the first time applied to molecular simulation data. Encodermap complements cc_analysis by providing an efficient way to process and assign large amounts of data to clusters. The main goal of the procedure is to maximize the number of assigned frames of a given trajectory, while keeping a clear conformational identity of the clusters that are found. In practice we achieve this by…
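The stated goal, maximizing the number of assigned frames, can be made concrete with a small helper. The following is a minimal sketch and not the authors' code: it assumes per-frame cluster labels are available as a sequence of integers and that unassigned frames carry the label -1, which is the noise convention HDBSCAN implementations use. The function name `summarize_assignment` and the example labels are hypothetical.

```python
from collections import Counter

NOISE_LABEL = -1  # assumed noise convention: frames not assigned to any cluster


def summarize_assignment(labels):
    """Return (assigned_fraction, cluster_populations) for per-frame labels."""
    labels = list(labels)
    if not labels:
        raise ValueError("labels must contain at least one frame")
    # Count frames per cluster, skipping frames labeled as noise.
    populations = Counter(label for label in labels if label != NOISE_LABEL)
    assigned = sum(populations.values())
    return assigned / len(labels), dict(populations)


# Hypothetical labels for a 10-frame trajectory: two clusters plus noise.
example_labels = [0, 0, 1, -1, 1, 1, 0, -1, 0, 1]
fraction, populations = summarize_assignment(example_labels)
print(f"assigned fraction: {fraction:.2f}")
print(f"cluster populations: {populations}")
```

A high assigned fraction alone does not indicate a good clustering; the abstract pairs it with the requirement that each cluster keeps a clear conformational identity, which has to be checked separately.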
Taxonomy
Topics: Protein Structure and Dynamics · Enzyme Structure and Function · RNA and protein synthesis mechanisms
Methods: Test
