TL;DR
adabmDCA is an adaptive implementation of Boltzmann machine learning for biological sequences. It models protein and RNA families, supports contact prediction and sequence generation, and offers flexible training options.
Contribution
It presents a versatile, adaptive implementation of Boltzmann machine learning applicable to both proteins and RNAs, with improved training efficiency and parameter pruning capabilities.
Findings
Models achieve contact map accuracy comparable to state-of-the-art methods.
Generated sequences are similar in quality to those from existing techniques.
The method supports both equilibrium and out-of-equilibrium training, chosen according to the complexity of the input data.
Abstract
Boltzmann machines are energy-based models that have been shown to provide an accurate statistical description of domains of evolutionarily related protein and RNA families. They are parametrized in terms of local biases, which account for residue conservation, and pairwise terms, which model epistatic coevolution between residues. From the model parameters, it is possible to extract an accurate prediction of the three-dimensional contact map of the target domain. More recently, the accuracy of these models has also been assessed in terms of their ability to predict mutational effects and to generate functional sequences in silico. Our adaptive implementation of Boltzmann machine learning, adabmDCA, can be applied to both protein and RNA families and supports several learning set-ups, depending on the complexity of the input data and on the user requirements. The code is fully…
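The parametrization described in the abstract (local biases plus pairwise couplings) can be illustrated with a minimal sketch. It assumes the standard Potts form E(s) = -Σ_i h_i(s_i) - Σ_{i<j} J_ij(s_i, s_j) and a common contact-scoring recipe (Frobenius norm of each coupling block in the zero-sum gauge, followed by the average-product correction). The function names `potts_energy` and `contact_scores`, the array layouts, and the sign convention are assumptions made for illustration; this is not adabmDCA's code or API.

```python
import numpy as np

def potts_energy(seq, h, J):
    """Energy of one integer-encoded sequence under a Potts model.

    Assumed (hypothetical) layout: h has shape (L, q), J has shape (L, L, q, q),
    with J[i, j, a, b] == J[j, i, b, a].
    """
    seq = np.asarray(seq)
    idx = np.arange(len(seq))
    # Local biases: sum_i h[i, s_i]
    field_term = h[idx, seq].sum()
    # Pairwise terms: J[i, j, s_i, s_j] for every (i, j), then keep i < j only
    pair = J[idx[:, None], idx[None, :], seq[:, None], seq[None, :]]
    coupling_term = np.triu(pair, k=1).sum()
    return -(field_term + coupling_term)

def contact_scores(J):
    """Frobenius-norm contact scores with average-product correction (APC).

    Simplification: all q states are included in the norm; many DCA pipelines
    exclude the gap state at this step.
    """
    L = J.shape[0]
    # Move each q x q block to the zero-sum gauge
    Jz = (J
          - J.mean(axis=2, keepdims=True)
          - J.mean(axis=3, keepdims=True)
          + J.mean(axis=(2, 3), keepdims=True))
    F = np.sqrt((Jz ** 2).sum(axis=(2, 3)))
    np.fill_diagonal(F, 0.0)
    # APC: subtract the product of per-site means divided by the global mean
    row_mean = F.sum(axis=1) / (L - 1)
    total_mean = F.sum() / (L * (L - 1))
    S = F - np.outer(row_mean, row_mean) / total_mean
    np.fill_diagonal(S, 0.0)
    return S
```

Under these assumptions, lower energy corresponds to higher model probability, and the largest entries of the score matrix are the candidate contacts.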
Taxonomy
Methods: Pruning
