RadDiff: Retrieval-Augmented Denoising Diffusion for Protein Inverse Folding
Jin Han, Tianfan Fu, Wu-Jun Li

TL;DR
RadDiff is a novel retrieval-augmented diffusion model for protein inverse folding that leverages up-to-date protein knowledge, outperforming existing methods in sequence recovery and foldability across multiple datasets.
Contribution
Introduces RadDiff, a retrieval-augmented diffusion approach that incorporates current protein knowledge into inverse folding, addressing limitations of prior methods.
Findings
RadDiff outperforms existing methods with up to 19% higher sequence recovery.
RadDiff generates highly foldable sequences.
RadDiff scales effectively with database size.
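Sequence recovery, the headline metric above, is conventionally the fraction of positions at which a designed sequence matches the native one. The sketch below illustrates that standard definition only; the function name `sequence_recovery` is hypothetical, and the paper's exact evaluation protocol (for example, how results are averaged over proteins) is not given in this summary.

```python
def sequence_recovery(designed: str, native: str) -> float:
    """Fraction of positions where the designed residue equals the native one.

    Assumes both sequences are already the same length (one residue per
    structure position); this is an illustrative helper, not the paper's code.
    """
    if len(designed) != len(native):
        raise ValueError("designed and native sequences must have equal length")
    if not native:
        raise ValueError("sequences must be non-empty")
    matches = sum(d == n for d, n in zip(designed, native))
    return matches / len(native)


if __name__ == "__main__":
    # Three of four positions agree.
    print(sequence_recovery("ACDE", "ACDF"))  # 0.75
```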
Abstract
Protein inverse folding, the design of an amino acid sequence based on a target protein structure, is a fundamental problem in computational protein engineering. Existing methods either generate sequences without leveraging external knowledge or rely on protein language models (PLMs). The former omits the knowledge stored in natural protein data, while the latter is parameter-inefficient and cannot flexibly adapt to ever-growing protein data. To overcome these drawbacks, we propose a novel method, called Retrieval-Augmented Denoising Diffusion (RadDiff), for protein inverse folding. In RadDiff, a novel retrieval-augmentation mechanism is designed to capture up-to-date protein knowledge. We further design a knowledge-aware diffusion model that integrates this protein…
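One common way to turn retrieved proteins into knowledge a model can condition on is a residue-wise frequency profile built from sequences aligned to the target structure. The reviews below mention alignment and profile generation steps, but the abstract does not specify them, so the following is only a minimal sketch under stated assumptions: the hits are already aligned to the target with `-` marking gaps, and the name `residue_profile` and the pseudocount smoothing are hypothetical choices, not the authors' implementation.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def residue_profile(aligned_hits, pseudocount=1.0):
    """Per-position amino acid distribution from aligned retrieved sequences.

    aligned_hits: equal-length strings, one per retrieved protein, already
    aligned to the target; any character outside AMINO_ACIDS (e.g. '-') is
    treated as a gap and ignored. Hypothetical helper for illustration.
    """
    if not aligned_hits:
        raise ValueError("need at least one aligned sequence")
    length = len(aligned_hits[0])
    if any(len(seq) != length for seq in aligned_hits):
        raise ValueError("aligned sequences must have equal length")

    profile = []
    for pos in range(length):
        counts = Counter(seq[pos] for seq in aligned_hits if seq[pos] in AMINO_ACIDS)
        # Additive smoothing keeps every residue at non-zero probability.
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        profile.append({aa: (counts[aa] + pseudocount) / total for aa in AMINO_ACIDS})
    return profile


if __name__ == "__main__":
    hits = ["ACD", "AC-", "AGD"]
    profile = residue_profile(hits)
    # Most likely residue at each target position.
    print([max(p, key=p.get) for p in profile])  # ['A', 'C', 'D']
```

Because the profile is recomputed from whatever the database returns, a sketch like this needs no retraining when new proteins are added; whether RadDiff achieves this in the same way is not stated here.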
Peer Reviews
Decision: ICLR 2026 Conference Withdrawn Submission
1. The motivation is well-grounded. Furthermore, the integration of dynamic structural information retrieval into the model's training paradigm represents a novel and compelling approach. 2. The proposed method outperforms baseline models across various metrics on benchmark datasets. Additionally, it has fewer parameters.
1. The claim that RadDiff adapts to new data without retraining lacks comprehensive assessment, as only a single database was used. Performance should be validated across diverse databases (e.g., PDB, AlphaFold DB) or different temporal versions. 2. Performance correlates with the number and quality (TM-score) of retrieved hits. An analysis of the trade-off between these two factors is preferred. 3. Given that both RadDiff and PRISM are retrieval-augmented methods, PRISM should be included as a
1. The retrieval to alignment to profile-conditioning pipeline is a sensible way to inject evolutionary/structural signal into sequence design. 2. The manuscript is clearly structured (explicit sections on retrieval augmentation, hierarchical search, residue-level alignment, profile generation, and evolutionary guiding), and includes ablations/runtime/reproducibility notes. 3. The Mask Sequence Designer idea is potentially interesting as a refinement stage, and (if properly isolated) could be a
1. Limited novelty relative to prior practice. The central idea, retrieving close structural neighbors, aligning them, and converting to residue-wise priors, is well-established in the protein modeling literature (retrieval + alignment + MSA/profile conditioning). The paper’s main novelty appears to lie in integrating these established components within a diffusion framework plus an MSD refinement module. However, retrieval and alignment themselves employ existing tools and known procedures rath
**S1**. While the core idea of leveraging structurally similar proteins through retrieval is not entirely novel in the protein design community, the authors have identified an important and promising research direction that addresses a fundamental limitation of current deep learning approaches. The motivation to incorporate evolutionary information stored in vast protein databases represents a scientifically sound approach. This direction aligns well with the growing recognition that external kn
**W1**. The manuscript lacks crucial guidance and systematic analysis regarding the selection of $k$, the number of retrieved proteins, which appears to be a key hyperparameter in the proposed framework. The authors do not provide ablation studies showing how performance varies with different values of $k$, nor do they discuss the trade-offs between retrieval coverage and noise introduction from less relevant structures. Furthermore, there is no discussion of whether $k$ should be adapted based
Taxonomy
Topics: Protein Structure and Dynamics · Machine Learning in Bioinformatics · RNA and protein synthesis mechanisms
