Random Functions as Data Compressors for Machine Learning of Molecular Processes
Jayashrita Debnath, Gerhard Hummer

TL;DR
This paper shows that random nonlinear projections can compress large feature spaces from molecular simulations without a substantial loss of information, which speeds up machine learning analysis.
Contribution
The novel contribution is using random nonlinear projections as efficient data compressors for molecular ML tasks.
Findings
Random projections retain core static and dynamic information in high-dimensional molecular data.
Compression improves trajectory analysis robustness for protein folding simulations.
Abstract
Machine learning (ML) is rapidly transforming the way molecular dynamics simulations are performed and analyzed, from materials modeling to studies of protein folding and function. ML algorithms are often employed to learn low-dimensional representations of conformational landscapes and to cluster trajectories into relevant metastable states. Most of these algorithms require the selection of a small number of features that describe the problem of interest. Although deep neural networks can tackle large numbers of input features, the training costs increase with input size, which makes the selection of a subset of features mandatory for most problems of practical interest. Here, we show that random nonlinear projections can be used to compress large feature spaces and make computations faster without a substantial loss of information. We describe an efficient way to produce random…
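As a rough illustration of the idea of compressing a large feature space with a random nonlinear projection, the sketch below passes each trajectory frame through a fixed, untrained random layer. The abstract is truncated before the construction is described, so everything here is an assumption rather than the authors' method: the tanh nonlinearity, the Gaussian weights, the per-feature standardization, the function name random_nonlinear_projection, and the synthetic input data are all hypothetical choices made for illustration.

```python
import numpy as np


def random_nonlinear_projection(features, n_out, seed=0):
    """Compress (n_frames, n_features) data to (n_frames, n_out).

    Illustrative assumption, not the paper's construction: a fixed random
    affine map followed by tanh, with no training involved.
    """
    rng = np.random.default_rng(seed)
    n_in = features.shape[1]

    # Standardize each input feature so no single feature dominates the sum.
    mean = features.mean(axis=0)
    std = features.std(axis=0)
    std[std == 0] = 1.0  # guard against constant features
    scaled = (features - mean) / std

    # Random weights scaled by 1/sqrt(n_in) keep pre-activations of order one.
    weights = rng.normal(0.0, 1.0 / np.sqrt(n_in), size=(n_in, n_out))
    bias = rng.uniform(-1.0, 1.0, size=n_out)

    return np.tanh(scaled @ weights + bias)


# Synthetic stand-in for a trajectory: 500 frames, 2000 input features.
data_rng = np.random.default_rng(1)
trajectory = data_rng.normal(size=(500, 2000))

compressed = random_nonlinear_projection(trajectory, n_out=32)
print(compressed.shape)
```

Because the seed fixes the weights, the same projection can be reapplied to new frames, and the compressed array can then be handed to whichever downstream dimensionality-reduction or clustering method is in use.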
Taxonomy
Topics: Machine Learning in Materials Science · Protein Structure and Dynamics · Advanced Electron Microscopy Techniques and Applications
