Learning Extrapolative Sequence Transformations from Markov Chains

Sophia Hager, Aleem Khan, Andrew Wang, and Nicholas Andrews

arXiv:2505.20251 · cs.LG · May 27, 2025

TL;DR

This paper introduces an autoregressive model trained on Markov chain data to extrapolate sequence properties efficiently, outperforming traditional MCMC in scalability and sample efficiency across biological and text applications.

Contribution

It presents a novel method to learn an autoregressive model from Markov chain data, enabling effective extrapolation of sequence properties with fewer steps and greater efficiency.

Findings

01 Model outperforms MCMC in extrapolation tasks

02 Achieves higher sample efficiency and scalability

03 Validated on protein design and text tasks

Abstract

Most successful applications of deep learning involve similar training and test conditions. However, tasks such as biological sequence design involve searching for sequences that improve desirable properties beyond previously known values, which requires novel hypotheses that extrapolate beyond training data. In these settings, extrapolation may be achieved by using random search methods such as Markov chain Monte Carlo (MCMC), which, given an initial state, sample local transformations to approximate a target density that rewards states with the desired properties. However, even with a well-designed proposal, MCMC may struggle to explore large structured state spaces efficiently. Rather than relying on stochastic search, it would be desirable to have a model that greedily optimizes the properties of interest, successfully extrapolating in as few steps as possible. We propose to…
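
To make the MCMC baseline described in the abstract concrete, the following is a minimal sketch of a generic Metropolis-Hastings sampler over strings. It is not the authors' method or code. The toy alphabet, the `score` function (which simply counts occurrences of "A"), the single-position resampling proposal, and the `temperature` parameter are all assumptions chosen for illustration. The target density is taken to be proportional to exp(score / temperature), so states with a higher property value are rewarded.

```python
import math
import random

ALPHABET = "ACDE"  # hypothetical toy alphabet, not taken from the paper


def score(seq: str) -> float:
    """Hypothetical property to improve: the number of 'A' characters."""
    return float(seq.count("A"))


def propose(seq: str, rng: random.Random) -> str:
    """Local transformation: resample one uniformly chosen position.

    The proposal is symmetric, so the Hastings correction cancels.
    """
    i = rng.randrange(len(seq))
    return seq[:i] + rng.choice(ALPHABET) + seq[i + 1:]


def metropolis_hastings(init: str, steps: int,
                        temperature: float = 0.5, seed: int = 0) -> list[str]:
    """Run a chain targeting a density proportional to exp(score / temperature)."""
    rng = random.Random(seed)
    state = init
    chain = [state]
    for _ in range(steps):
        candidate = propose(state, rng)
        # Log acceptance ratio for a symmetric proposal.
        log_ratio = (score(candidate) - score(state)) / temperature
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            state = candidate
        chain.append(state)
    return chain


if __name__ == "__main__":
    chain = metropolis_hastings("CDECDECD", steps=200)
    best = max(chain, key=score)
    print(best, score(best))
```

Each step changes at most one position, which illustrates why a purely local sampler can need many steps to move far from its initial state in a large sequence space.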

Code & Models

Repositories

sophia-hager/learning-mcmc-extrapolation (Official)

Videos

Learning Extrapolative Sequence Transformations from Markov Chains · SlidesLive

Taxonomy

Topics: Neural Networks and Applications

Methods: Random Search