Protein folding classes -- High-dimensional geometry of amino acid composition space revisited
Boryeu Mao

TL;DR
This paper models protein structure classes using precise high-dimensional geometric constructs, providing a detailed, non-statistical understanding of amino acid composition distributions to aid in protein classification and prediction.
Contribution
It introduces a novel geometric modeling approach for protein classes that is exact, comprehensive, and decoupled from predictive models, improving understanding of amino acid composition space.
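As an illustration of what a point in amino acid composition space is, the sketch below maps a sequence to its 20 residue fractions. The function name `composition`, the residue ordering in `AMINO_ACIDS`, and the choice to skip non-standard letters are assumptions made for this example, not details taken from the paper.

```python
from collections import Counter

# Assumed ordering: the 20 standard residues by one-letter code, alphabetical.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def composition(sequence):
    """Return the 20 residue fractions of a sequence, ordered as in AMINO_ACIDS."""
    # Count only standard residues; other letters (e.g. X) are skipped (assumption).
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard residues")
    return [counts[aa] / total for aa in AMINO_ACIDS]

# Toy sequence, not a real protein.
vec = composition("ACDA")
```

Because the fractions are non-negative and sum to 1, every composition vector lies on a 19-dimensional simplex inside the 20-dimensional space.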
Findings
Exact geometric models of protein classes in high-dimensional space
Complete data analysis without training set biases
Potential applications in validating structure predictions
Abstract
In this study, the distributions of protein structure classes (or folding types) of experimentally determined structures from a legacy dataset and a comprehensive database (SCOP) are modeled precisely with geometric constructs such as convex polytopes in high-dimensional amino acid composition space. This follows up a previous non-statistical, geometry-motivated modeling of protein classes with ellipsoidal models, which the present work supersedes in three important respects: (1) as a paradigm shift, a descriptive 'distribution model' of experimental data is decoupled from, and serves as the basis for, a possible future predictive 'domain model' generalizable to proteins in the same class for which 3D structures have yet to be determined experimentally, (2) the geometric and analytic characteristics of class distributions are obtained via exact computational geometry calculations,…
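One way to picture a convex-polytope class model is as the convex hull of the class's composition vectors, with membership of a new point decided by a linear feasibility problem. The sketch below is a floating-point illustration under that assumption. It uses NumPy and SciPy's `linprog`, does not reproduce the exact computational geometry calculations the abstract refers to, and the helper name `in_convex_hull` and the toy 3-dimensional points are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog

def in_convex_hull(points, x):
    """Check numerically whether x is a convex combination of the rows of points.

    Solves the feasibility problem: find lam >= 0 with sum(lam) == 1
    and points.T @ lam == x. The objective is zero; only feasibility matters.
    """
    points = np.asarray(points, dtype=float)  # shape (n_points, dim)
    x = np.asarray(x, dtype=float)            # shape (dim,)
    n = points.shape[0]
    A_eq = np.vstack([points.T, np.ones((1, n))])
    b_eq = np.append(x, 1.0)
    res = linprog(c=np.zeros(n), A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
    return res.status == 0  # status 0: a feasible (optimal) solution was found

# Toy data in 3 dimensions (not real compositions): the corners of a simplex.
class_points = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
inside = in_convex_hull(class_points, [0.2, 0.3, 0.5])
outside = in_convex_hull(class_points, [0.6, 0.6, -0.2])
```

Formulating membership as a linear program avoids enumerating the facets of the hull, which becomes impractical as the dimension grows toward the 20 of composition space.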
Taxonomy
Topics: Protein Structure and Dynamics · Machine Learning in Bioinformatics · Enzyme Structure and Function
Methods: Sparse Evolutionary Training
