TL;DR
This paper introduces a correlation-based feature selection method that combines linear correlation with Leiden clustering to identify functional dynamics in proteins. By discarding uncorrelated and weakly correlated coordinates, it makes molecular dynamics simulations easier to interpret.
Contribution
It presents an approach that combines linear correlation with Leiden clustering to select the features relevant for analyzing protein dynamics.
Findings
Linear correlation and Leiden clustering outperform other methods in identifying functional motions.
The method successfully distinguishes collective motions in T4 lysozyme.
It provides physical interpretation of correlated motions in protein folding.
Abstract
To interpret molecular dynamics simulations of biomolecular systems, systematic dimensionality reduction methods are commonly employed. Among others, this includes principal component analysis (PCA) and time-lagged independent component analysis (TICA), which aim to maximize the variance and the timescale of the first components, respectively. A crucial first step of such an analysis is the identification of suitable and relevant input coordinates (the so-called features), such as backbone dihedral angles and interresidue distances. As typically only a small subset of those coordinates is involved in a specific biomolecular process, it is important to discard the remaining uncorrelated motions or weakly correlated noise coordinates. This is because they may exhibit large amplitudes or long timescales and therefore will erroneously be considered important by PCA and TICA,…
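The argument above can be illustrated with a small, purely synthetic sketch. It is not the authors' implementation. The feature names (`dist_a`, `noise_big`, and so on), the 0.5 threshold, and the toy trajectory are all hypothetical. For brevity, Leiden clustering is replaced here by a much simpler stand-in: features are grouped into connected components of the graph whose edges link pairs with absolute Pearson correlation above the threshold. The point is only that a large-amplitude but uncorrelated coordinate can carry the highest variance (which is what PCA rewards) while being excluded by a correlation-based filter.

```python
import math
import random


def pearson(x, y):
    """Linear (Pearson) correlation coefficient of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def variance(x):
    """Population variance of a series."""
    m = sum(x) / len(x)
    return sum((a - m) ** 2 for a in x) / len(x)


def correlated_clusters(features, threshold=0.5):
    """Group features linked by |r| >= threshold (connected components).

    NOTE: a simple stand-in for Leiden clustering, used for illustration only.
    """
    names = list(features)
    parent = {name: name for name in names}

    def find(name):
        # Union-find root lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if abs(pearson(features[a], features[b])) >= threshold:
                parent[find(a)] = find(b)

    clusters = {}
    for name in names:
        clusters.setdefault(find(name), []).append(name)
    return sorted(clusters.values(), key=len, reverse=True)


rng = random.Random(0)  # fixed seed so the toy data are reproducible
n_frames = 2000
# Hypothetical collective motion shared by three coordinates.
motion = [math.sin(2 * math.pi * t / 500) for t in range(n_frames)]
features = {
    "dist_a": [m + rng.gauss(0, 0.2) for m in motion],
    "dist_b": [-0.8 * m + rng.gauss(0, 0.2) for m in motion],
    "dihedral_c": [0.5 * m + rng.gauss(0, 0.1) for m in motion],
    "noise_big": [rng.gauss(0, 3.0) for _ in range(n_frames)],    # large amplitude, uncorrelated
    "noise_small": [rng.gauss(0, 0.1) for _ in range(n_frames)],  # small amplitude, uncorrelated
}

clusters = correlated_clusters(features, threshold=0.5)
selected = [c for c in clusters if len(c) > 1]  # discard isolated coordinates
highest_variance = max(features, key=lambda k: variance(features[k]))

print("selected clusters:", selected)
print("highest-variance feature:", highest_variance)
```

In this toy setting the three coordinates that follow the shared motion end up in one cluster, whereas the feature with the largest variance is the uncorrelated noise coordinate, which a variance-based ranking would have placed first.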
Taxonomy
Methods: Principal Components Analysis
