Boltzmann Machine Learning with a Parallel, Persistent Markov chain Monte Carlo method for Estimating Evolutionary Fields and Couplings from a Protein Multiple Sequence Alignment

Sanzo Miyazawa

arXiv:2604.18022·q-bio.BM·April 21, 2026

TL;DR

This paper introduces a parallel, persistent Markov chain Monte Carlo method combined with stochastic gradient descent to efficiently estimate evolutionary fields and couplings in proteins from sequence alignments, improving contact prediction accuracy.

Contribution

The paper presents a novel, computationally efficient approach for estimating evolutionary fields and couplings in proteins, using a parallel, persistent MCMC method and tailored hyperparameter adjustment.

Findings

01 Applied to eight protein families with successful results.

02 Improved contact residue pair prediction accuracy.

03 Reduced computational time for Boltzmann machine learning.

Abstract

The inverse Potts problem, which estimates evolutionary single-site fields and pairwise couplings in homologous protein sequences from the single-site and pairwise amino acid frequencies observed in their multiple sequence alignment, remains a useful method in studies of protein structure and evolution. Since the reproducibility of the fields and couplings is most important, the Boltzmann machine method is employed here, although it is computationally intensive. To reduce the computational time required for the Boltzmann machine, a parallel, persistent Markov chain Monte Carlo method is employed to estimate the single-site and pairwise marginal distributions in each learning step. Stochastic gradient descent methods are also used to reduce the computational time for learning. Another problem is how to adjust the values of hyperparameters; there are two…
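To make the learning loop described in the abstract concrete, the following is a minimal sketch of Boltzmann machine learning for a Potts model with persistent Markov chains, written in Python with NumPy. It is not the author's implementation. Several choices are assumptions made for illustration: the function names (`marginals`, `gibbs_sweep`, `fit`), the use of Gibbs sampling as the MCMC move, the vectorized update of many chains standing in for parallel chains, a plain fixed-step gradient update in place of the paper's stochastic gradient descent variants, and the absence of regularization and hyperparameter tuning. The sign convention assumed here is P(s) ∝ exp(Σ_i h_i(s_i) + Σ_{i<j} J_ij(s_i, s_j)).

```python
import numpy as np

def marginals(seqs, q):
    """Single-site and pairwise frequencies of integer-coded sequences (N, L)."""
    x = np.eye(q)[seqs]                               # one-hot, shape (N, L, q)
    f1 = x.mean(axis=0)                               # shape (L, q)
    f2 = np.einsum("nia,njb->iajb", x, x) / len(seqs)  # shape (L, q, L, q)
    return f1, f2

def gibbs_sweep(chains, h, J, rng):
    """Resample every site once, for all chains at the same time."""
    K, L = chains.shape
    eye = np.eye(h.shape[1])
    chains = chains.copy()
    x = eye[chains]                                   # one-hot chain states (K, L, q)
    for i in range(L):
        # logits[k, a] = h[i, a] + sum_j J[i, a, j, s_kj]; J[i, :, i, :] is kept at 0
        logits = h[i] + np.einsum("ajb,kjb->ka", J[i], x)
        # Gumbel-max trick: draws one categorical sample per chain
        new = np.argmax(logits + rng.gumbel(size=logits.shape), axis=1)
        chains[:, i] = new
        x[:, i, :] = eye[new]
    return chains

def fit(seqs, q, n_chains=200, n_steps=300, sweeps=2, lr=0.1, seed=0):
    """Fit fields h and couplings J so that model marginals match the data."""
    rng = np.random.default_rng(seed)
    L = seqs.shape[1]
    f1, f2 = marginals(seqs, q)                       # target frequencies from the alignment
    h = np.zeros((L, q))
    J = np.zeros((L, q, L, q))
    off_diag = 1.0 - np.eye(L)[:, None, :, None]      # masks the i == j blocks of J
    # Persistent chains: initialized once and carried across learning steps
    chains = rng.integers(0, q, size=(n_chains, L))
    for _ in range(n_steps):
        for _ in range(sweeps):
            chains = gibbs_sweep(chains, h, J, rng)
        p1, p2 = marginals(chains, q)                 # Monte Carlo estimate of model marginals
        h += lr * (f1 - p1)                           # log-likelihood gradient for fields
        J += lr * (f2 - p2) * off_diag                # log-likelihood gradient for couplings
    return h, J, chains
```

Here `seqs` is assumed to be an integer array of shape (number of sequences, alignment length) with entries in `0..q-1`. Because the chains are not restarted at each learning step, only a few sweeps per step are needed to track the slowly changing model distribution, which is the source of the time savings that persistent sampling is meant to provide. Sequence weighting, gap handling, and regularization, all of which matter for real alignments, are omitted from this sketch.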
