A review of mathematical representations of biomolecules
Duc D Nguyen, Zixuan Cang, and Guo-Wei Wei

TL;DR
This review covers recent mathematical methods for representing biomolecules, including algebraic topology, differential geometry, and graph theory. The goal of these representations is to improve machine learning predictions in computational biology.
Contribution
It surveys the low-dimensional, scalable mathematical representations of biomolecules developed in the authors' laboratory, covering both their development and their application in ML-based predictions.
Findings
Mathematical representations that simplify structural complexity and reduce dimensionality make ML on biomolecular data more efficient.
Algebraic topology, differential geometry, and graph theory are key approaches.
These methods enhance protein-ligand binding prediction accuracy.
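To make the idea of a low-dimensional representation concrete, the following is a minimal sketch of one graph-style descriptor. It is not the authors' method. It treats protein and ligand atoms as the two sides of a bipartite graph, connects pairs within a distance cutoff, and sums an exponential edge weight per element pair. The function name `pairwise_graph_features`, the `cutoff` and `eta` parameters, and the toy coordinates are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def pairwise_graph_features(protein_atoms, ligand_atoms, cutoff=12.0, eta=3.0):
    """Sum exponential edge weights per (protein element, ligand element) pair.

    Each atom is a tuple (element, (x, y, z)). The cutoff and the decay
    length eta are illustrative values, not taken from the review.
    """
    features = defaultdict(float)
    for p_elem, p_xyz in protein_atoms:
        for l_elem, l_xyz in ligand_atoms:
            d = math.dist(p_xyz, l_xyz)  # Euclidean distance between the two atoms
            if d <= cutoff:  # an edge exists only within the cutoff
                features[(p_elem, l_elem)] += math.exp(-d / eta)
    return dict(features)


# Hypothetical toy coordinates, for illustration only.
protein = [("C", (0.0, 0.0, 0.0)), ("N", (1.5, 0.0, 0.0)), ("O", (30.0, 0.0, 0.0))]
ligand = [("C", (0.0, 3.0, 0.0)), ("O", (1.5, 4.0, 0.0))]

feats = pairwise_graph_features(protein, ligand)
for pair, value in sorted(feats.items()):
    print(pair, round(value, 4))
```

The resulting feature vector has one entry per element pair, so its length depends on the number of element types rather than the number of atoms. That is the sense in which such a descriptor is low-dimensional and scalable.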
Abstract
Recently, machine learning (ML) has established itself in various worldwide benchmarking competitions in computational biology, including Critical Assessment of Structure Prediction (CASP) and Drug Design Data Resource (D3R) Grand Challenges. However, the intricate structural complexity and high ML dimensionality of biomolecular datasets obstruct the efficient application of ML algorithms in the field. In addition to data and algorithm, an efficient ML machinery for biomolecular predictions must include structural representation as an indispensable component. Mathematical representations that simplify the biomolecular structural complexity and reduce ML dimensionality have emerged as a prime winner in D3R Grand Challenges. This review is devoted to the recent advances in developing low-dimensional and scalable mathematical representations of biomolecules in our laboratory. We discuss…
