Estimating Boltzmann Averages for Protein Structural Quantities Using Sequential Monte Carlo
Zhaoran Hou, Samuel W.K. Wong

TL;DR
This paper proposes a Sequential Monte Carlo (SMC) method that propagates multiple descendants per particle and then resamples, mitigating particle degeneracy when sampling protein structures from the Boltzmann distribution, a highly constrained and multimodal target.
Contribution
The paper presents a general SMC approach in which each particle propagates multiple descendants, followed by resampling to maintain the desired number of particles, and applies it to sampling protein structures from the Boltzmann distribution.
Findings
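Simulation studies demonstrate the method's efficacy for the protein sampling problem.
The method is used to estimate the number of atomic contacts for a key segment of the SARS-CoV-2 viral spike protein.
The approach targets settings where particle degeneracy hinders SMC: highly constrained or multimodal target distributions.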
Abstract
Sequential Monte Carlo (SMC) methods are widely used to draw samples from intractable target distributions. Particle degeneracy can hinder the use of SMC when the target distribution is highly constrained or multimodal. As a motivating application, we consider the problem of sampling protein structures from the Boltzmann distribution. This paper proposes a general SMC method that propagates multiple descendants for each particle, followed by resampling to maintain the desired number of particles. Simulation studies demonstrate the efficacy of the method for tackling the protein sampling problem. As a real data example, we use our method to estimate the number of atomic contacts for a key segment of the SARS-CoV-2 viral spike protein.
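The propagate-then-resample idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm or code. It is a toy built on assumptions: the "protein" is a self-avoiding walk on a 2D square lattice, the energy is minus the number of non-bonded contacts, descendants are drawn uniformly from the free neighbouring sites, and resampling is multinomial. All function and parameter names (`smc_multi_descendants`, `n_desc`, `temperature`, and so on) are hypothetical.

```python
import math
import random

MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def count_new_contacts(chain, site):
    """Count occupied lattice neighbours of `site`, excluding its bonded predecessor."""
    occupied = set(chain[:-1])  # chain[-1] is bonded to the new site, so skip it
    return sum((site[0] + dx, site[1] + dy) in occupied for dx, dy in MOVES)


def smc_multi_descendants(length, n_particles, n_desc, temperature, rng):
    """Grow chains site by site; each particle proposes n_desc descendants,
    then the pool is resampled back down to n_particles (toy assumption)."""
    # A particle is (chain of lattice sites, number of contacts so far).
    particles = [(((0, 0),), 0)] * n_particles
    for _ in range(length - 1):
        pool, weights = [], []
        for chain, contacts in particles:
            occupied = set(chain)
            x, y = chain[-1]
            free = [(x + dx, y + dy) for dx, dy in MOVES
                    if (x + dx, y + dy) not in occupied]
            if not free:
                continue  # dead end: this particle leaves no descendants
            for _ in range(n_desc):
                site = rng.choice(free)  # uniform proposal, q = 1 / len(free)
                new = count_new_contacts(chain, site)
                # Incremental weight exp(-dE / T) / q, with dE = -new.
                weights.append(len(free) * math.exp(new / temperature))
                pool.append((chain + (site,), contacts + new))
        if not pool:
            raise RuntimeError("all particles reached a dead end")
        # Multinomial resampling restores the desired number of particles.
        particles = rng.choices(pool, weights=weights, k=n_particles)
    return particles


def boltzmann_average_contacts(particles):
    """Particles are equally weighted after resampling, so a plain mean suffices."""
    return sum(contacts for _, contacts in particles) / len(particles)


if __name__ == "__main__":
    rng = random.Random(1)
    sample = smc_multi_descendants(length=12, n_particles=500, n_desc=3,
                                   temperature=1.0, rng=rng)
    print("estimated mean contacts:", boltzmann_average_contacts(sample))
```

In this sketch, proposing several descendants per particle enlarges the candidate pool before resampling, so a particle whose first proposed extension would be poor can still contribute a viable chain. For a chain of four sites the estimate can be checked against exact enumeration of the 36 possible walks.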
Taxonomy
Topics: SARS-CoV-2 detection and testing · Statistical Methods and Inference · Machine Learning in Materials Science
