Token-Level Guided Discrete Diffusion for Membrane Protein Design
Shrey Goel, Peregrine M. Schray, Yinuo Zhang, Sophia Vincoff, Huong T. Kratochvil, Pranam Chatterjee

TL;DR
This paper introduces MemDLM, a diffusion model for membrane protein design that enables controllable, high-quality sequence generation validated by experimental assays, advancing rational membrane protein engineering.
Contribution
The paper presents MemDLM, the first diffusion-based model for membrane proteins, with a novel guidance method for targeted residue control and experimental validation of designed sequences.
Findings
MemDLM-generated sequences achieve biological plausibility comparable to that of natural membrane proteins.
Per-Token Guidance improves control over membrane protein features.
Designed single-pass sequences successfully insert into bacterial membranes in TOXCAT assays.
Abstract
Reparameterized diffusion models (RDMs) have recently matched autoregressive methods in protein generation, motivating their use for challenging tasks such as designing membrane proteins, which possess interleaved soluble and transmembrane (TM) regions. We introduce the Membrane Diffusion Language Model (MemDLM), a fine-tuned RDM-based protein language model that enables controllable membrane protein sequence design. MemDLM-generated sequences recapitulate the TM residue density and structural features of natural membrane proteins, achieving comparable biological plausibility and outperforming state-of-the-art diffusion baselines in motif scaffolding tasks by producing lower perplexity, higher BLOSUM-62 scores, and improved pLDDT confidence. To enhance controllability, we develop Per-Token Guidance (PET), a novel classifier-guided sampling strategy that selectively solubilizes residues…
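To make the perplexity comparison in the abstract concrete, the sketch below computes perplexity from per-residue log-probabilities using the standard definition (the exponential of the negative mean log-probability). The function name `perplexity` and the example values are hypothetical and are not taken from the paper, which may use a different variant such as a masked pseudo-perplexity under a protein language model.

```python
import math


def perplexity(token_log_probs):
    """Return exp(-mean(log p)) for a list of natural-log token probabilities."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


# Hypothetical per-residue log-probabilities (illustrative, not paper data).
# A model that is equally unsure among the 20 amino acids at every position:
uniform = [math.log(1 / 20)] * 8
# A model that assigns probability 0.9 to the observed residue at every position:
confident = [math.log(0.9)] * 8

print(perplexity(uniform), perplexity(confident))
```

Lower values mean the scoring model finds the sequence more predictable. A sequence scored at chance over 20 amino acids has a perplexity of about 20, which gives a rough upper reference point when reading reported numbers.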
Peer Reviews
Decision: ICLR 2026 Conference Withdrawn Submission
1. PET’s integration of attention-weighted neighborhoods and saliency scores for token-level control is a creative combination of classifier guidance and structural biology insights, addressing the unique need to preserve TM domains.
2. Experiments are solid, with both computational benchmarks (covering generation, scaffolding, solubilization) and wet-lab validation (TOXCAT assays) that confirm biological functionality, not just structural plausibility.
3. MemDLM has the potential to fill a crit
- As shown in section B.6.1, the TOXCAT assay validates only 5 designs (3 GoodTM, 2 PoorTM) with short sequences (~20–30 residues), focusing on single-pass TM helices. No validation of larger, multi-pass membrane proteins (common in biology) or functional assays (e.g., ligand binding, signaling) is provided, leaving MemDLM’s utility for real-world therapeutic design unproven.
- MemDLM is a fine-tuned variant of EvoFlow (an existing RDM-based model), and its training objective is directly adopted
1. **The problem is well-motivated and addresses an important gap.** Membrane proteins constitute only approximately 1% of known protein structures despite their therapeutic importance, and the authors clearly articulate why this creates challenges for computational design methods. The motivation for developing generative models that can handle the unique characteristics of membrane proteins (interleaved transmembrane and soluble regions) is sound and represents a genuine need in the field. 2.
1. **The PET algorithm relies on unjustified assumptions about attention matrices.** The authors extract "the attention matrix" from the final transformer layer and claim it captures long-range residue information, yet provide no empirical validation of this assumption. With multi-head attention, the method for obtaining a single matrix (averaging across heads, selecting one head, or another approach) is not specified. More critically, the authors do not demonstrate that the attention matrix the
The work successfully adapts and applies a diffusion-based framework to the challenge of de novo membrane protein design, addressing a notable gap for recent generative models. Crucially, this computational contribution is substantiated by experimental validation, as the authors provide wet-lab results demonstrating that their generated single-pass proteins can successfully insert into bacterial membranes.
While the paper presents promising results and valuable experimental validation, the clarity and justification of its methodological contributions could be significantly improved. Several key details and experiments are missing, which makes it difficult to fully assess the novelty and effectiveness of the proposed techniques. The primary weaknesses are detailed below.
#### 1. Lack of methodological clarity and novelty
The paper's core contributions are not clearly distinguished from prior work
1. A novel self-planning algorithm is introduced for diffusion language modeling, enabling the generation of realistic membrane-like protein sequences.
2. Per-Token Guidance, a new classifier-guided sampling algorithm, is proposed to generate sequences with desired properties.
3. Wet-lab experiments were performed to validate the designed proteins’ properties.
1. The algorithmic novelty is relatively limited.
2. Writing issue: The term *Per-Token Guidance* appears multiple times with inconsistent capitalization or highlighting.
3. Baseline comparison: The baseline methods used for comparison are relatively limited.
Taxonomy
Topics: RNA and protein synthesis mechanisms · Machine Learning in Bioinformatics · Microbial Metabolic Engineering and Bioproduction
Methods: Inpainting · Diffusion
