A Boltzmann-machine-enhanced Transformer For DNA Sequence Classification
Zhixuan Cao, Yishu Xu, Xuang WU

TL;DR
This paper introduces a Transformer model enhanced with Boltzmann machines for DNA sequence classification, enabling explicit structure discovery and the modeling of higher-order dependencies.
Contribution
It integrates Boltzmann-style structured binary gating into Transformers, using variational inference and Gumbel-Softmax for differentiable discrete structure learning.
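To make the relaxation step concrete, here is a minimal NumPy sketch of a binary Gumbel-Softmax (relaxed Bernoulli) gate. It is an illustration under stated assumptions, not the authors' implementation: the function name `gumbel_sigmoid_gate`, the temperature value, and the toy query and key matrices are all hypothetical, and using scaled query-key similarity as the gate logit is one plausible reading of the summary.

```python
import numpy as np

def gumbel_sigmoid_gate(logits, temperature=0.5, rng=None):
    """Draw a relaxed Bernoulli (binary Gumbel-Softmax) sample per gate logit."""
    rng = np.random.default_rng() if rng is None else rng
    logits = np.asarray(logits, dtype=float)
    # Logistic noise equals the difference of two independent Gumbel(0, 1) draws.
    u = rng.uniform(1e-6, 1.0 - 1e-6, size=logits.shape)
    noise = np.log(u) - np.log1p(-u)
    # Lower temperatures push the soft gate toward {0, 1}.
    return 1.0 / (1.0 + np.exp(-(logits + noise) / temperature))

rng = np.random.default_rng(0)
q = rng.normal(size=(4, 8))  # hypothetical queries: 4 positions, dimension 8
k = rng.normal(size=(4, 8))  # hypothetical keys with the same shape
logits = q @ k.T / np.sqrt(q.shape[1])  # assumed gate logits: scaled similarity
soft_gates = gumbel_sigmoid_gate(logits, temperature=0.5, rng=rng)
hard_gates = (soft_gates > 0.5).astype(float)  # thresholded gates for inspection
```

In an autodiff framework the soft gates would carry gradients back to the logits, while the thresholded gates give a discrete query-key connection pattern that can be read off directly.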
Findings
The model captures latent site interactions and higher-order dependencies.
Joint training improves classification accuracy and interpretability.
The framework unifies Boltzmann machines, discrete optimization, and Transformers.
Abstract
DNA sequence classification requires not only high predictive accuracy but also the ability to uncover latent site interactions, combinatorial regulation, and epistasis-like higher-order dependencies. Although the standard Transformer provides strong global modeling capacity, its softmax attention is continuous, dense, and weakly constrained, making it better suited for information routing than explicit structure discovery. In this paper, we propose a Boltzmann-machine-enhanced Transformer for DNA sequence classification. Built on multi-head attention, the model introduces structured binary gating variables to represent latent query-key connections and constrains them with a Boltzmann-style energy function. Query-key similarity defines local bias terms, learnable pairwise interactions capture synergy and competition between edges, and latent hidden units model higher-order combinatorial…
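The three ingredients named in the abstract (local bias terms, pairwise edge interactions, and hidden units) can be sketched as a single energy function over binary gates. The abstract is truncated here and does not give the exact functional form, so the following is an assumed, RBM-like parameterization for illustration only. All names (`boltzmann_energy`, `bias`, `pairwise`, `hidden_weights`, `hidden_bias`) and the toy values are hypothetical.

```python
import numpy as np

def boltzmann_energy(z, h, bias, pairwise, hidden_weights, hidden_bias):
    """Energy of binary edge gates z and binary hidden units h (assumed form).

    E(z, h) = -( bias.z + 0.5 * z.J.z + h.(W z + c) )
    where J is symmetric with a zero diagonal.
    """
    z = np.asarray(z, dtype=float)
    h = np.asarray(h, dtype=float)
    local = bias @ z                         # per-edge bias, e.g. query-key similarity
    pair = 0.5 * z @ pairwise @ z            # synergy (J > 0) or competition (J < 0)
    hidden = h @ (hidden_weights @ z + hidden_bias)  # hidden units tie groups of edges
    return -(local + pair + hidden)

# Toy instance: 3 candidate query-key edges, 2 hidden units.
bias = np.array([1.0, -0.5, 0.2])
pairwise = np.array([[0.0, 0.3, 0.0],
                     [0.3, 0.0, -0.4],
                     [0.0, -0.4, 0.0]])
hidden_weights = np.array([[0.5, 0.5, 0.0],
                           [0.0, 0.2, 0.2]])
hidden_bias = np.array([0.0, -0.1])

energy_on = boltzmann_energy([1, 1, 0], [1, 0], bias, pairwise, hidden_weights, hidden_bias)
energy_off = boltzmann_energy([0, 0, 0], [0, 0], bias, pairwise, hidden_weights, hidden_bias)
```

Under this assumed form, lower energy corresponds to a more probable gate configuration, so edges with high similarity, mutually supportive neighbors, or a shared active hidden unit are more likely to be switched on together.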
