TL;DR
mGPfusion is a Gaussian process-based method that combines limited experimental data with molecular simulation data to accurately predict how mutations affect protein stability, aiding protein design.
Contribution
It introduces a Bayesian data fusion approach and a protein-specific Gaussian process model that effectively integrates diverse data sources for stability prediction.
Findings
Outperforms existing methods on 15 proteins
Incorporating simulation data improves accuracy
Effective with limited experimental data
Abstract
Proteins are commonly used by the biochemical industry for numerous processes. Refining these proteins' properties through mutations also affects their stability. Accurate computational methods to predict how mutations affect protein stability are necessary to facilitate efficient protein design. However, the accuracy of predictive models is ultimately constrained by the limited availability of experimental data. We have developed mGPfusion, a novel Gaussian process (GP) method for predicting a protein's stability changes upon single and multiple mutations. This method complements the limited experimental data with large amounts of molecular simulation data. We introduce a Bayesian data fusion model that re-calibrates the experimental and in silico data sources and then learns a predictive GP model from the combined data. Our protein-specific model requires experimental data only regarding the…
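The fusion idea in the abstract (a few accurate experimental measurements plus many noisier simulated values, re-calibrated and then fed to one GP) can be illustrated with a toy sketch. This is not the mGPfusion implementation: the 1-D input, the `true_ddg` function, the RBF kernel, the least-squares recalibration (a simple stand-in for the paper's Bayesian re-calibration) and all noise levels are assumptions made only for illustration.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between two 1-D input arrays."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def gp_mean(x_train, y_train, noise_var, x_test):
    """GP posterior mean with a separate noise variance per training point."""
    K = rbf_kernel(x_train, x_train) + np.diag(noise_var)
    return rbf_kernel(x_test, x_train) @ np.linalg.solve(K, y_train)

def true_ddg(x):
    """Hypothetical ground-truth stability change (toy function)."""
    return np.sin(x)

rng = np.random.default_rng(0)

# Many simulated values: biased (scaled and shifted) and noisy.
x_sim = np.linspace(0.0, 5.0, 11)
y_sim_raw = 2.0 * true_ddg(x_sim) + 0.5 + rng.normal(0.0, 0.1, x_sim.size)

# Few experimental values, measured at three of the simulated inputs.
shared = np.array([1, 5, 9])  # indices of x = 0.5, 2.5, 4.5
x_exp = x_sim[shared]
y_exp = true_ddg(x_exp) + rng.normal(0.0, 0.05, x_exp.size)

# Recalibrate simulations onto the experimental scale with a linear fit
# (assumed stand-in for the Bayesian re-calibration named in the abstract).
a, b = np.polyfit(y_sim_raw[shared], y_exp, 1)
y_sim = a * y_sim_raw + b

# Fuse both sources; simulated points get a larger assumed noise variance.
x_all = np.concatenate([x_exp, x_sim])
y_all = np.concatenate([y_exp, y_sim])
noise_all = np.concatenate([np.full(x_exp.size, 0.05 ** 2),
                            np.full(x_sim.size, 0.1 ** 2)])

x_test = np.linspace(0.0, 5.0, 51)
mean_exp = gp_mean(x_exp, y_exp, np.full(x_exp.size, 0.05 ** 2), x_test)
mean_fused = gp_mean(x_all, y_all, noise_all, x_test)

rmse_exp = np.sqrt(np.mean((mean_exp - true_ddg(x_test)) ** 2))
rmse_fused = np.sqrt(np.mean((mean_fused - true_ddg(x_test)) ** 2))
print(f"experimental only RMSE: {rmse_exp:.3f}")
print(f"fused RMSE:             {rmse_fused:.3f}")
```

In this toy setting the per-point noise variances are what let the GP trust the experimental points more than the simulated ones, while the simulated points fill in regions with no experimental coverage.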
