SaDiT: Efficient Protein Backbone Design via Latent Structural Tokenization and Diffusion Transformers
Shentong Mo, Lanqing Li

TL;DR
SaDiT is a protein backbone design framework that combines structural tokenization with diffusion transformers to generate backbones faster while maintaining structural quality, and it is reported to outperform existing models in both speed and viability.
Contribution
SaDiT introduces a discrete latent space for protein geometry and an IPA Token Cache, both aimed at accelerating diffusion-based backbone design.
Findings
SaDiT significantly outperforms RFDiffusion and Proteina in speed and structural quality.
SaDiT effectively captures complex topological features in protein structures.
The model demonstrates high designability in fold-class conditional generation tasks.
Abstract
Generative models for de novo protein backbone design have achieved remarkable success in creating novel protein structures. However, diffusion-based approaches remain computationally intensive and slower than desired for large-scale structural exploration. While recent efforts like Proteina have introduced flow-matching to improve sampling efficiency, the potential of tokenization for structural compression and acceleration remains largely unexplored in the protein domain. In this work, we present SaDiT, a novel framework that accelerates protein backbone generation by integrating SaProt Tokenization with a Diffusion Transformer (DiT) architecture. SaDiT leverages a discrete latent space to represent protein geometry, significantly reducing the complexity of the generation process while maintaining theoretical SE(3) equivariance. To further enhance efficiency, we introduce an IPA…
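To make the idea of a discrete latent space for geometry concrete, here is a minimal sketch of nearest-codebook quantization, the generic operation by which continuous per-residue descriptors can be mapped to discrete tokens. It is not the SaProt tokenizer or the authors' code. The function names `quantize` and `dequantize`, the four-entry codebook, and the 2-D toy descriptors are all hypothetical, chosen only for illustration.

```python
import math


def quantize(features, codebook):
    """Map each feature vector to the index of its nearest codebook entry."""
    tokens = []
    for vec in features:
        # Euclidean distance from this descriptor to every codebook entry.
        distances = [math.dist(vec, code) for code in codebook]
        tokens.append(distances.index(min(distances)))
    return tokens


def dequantize(tokens, codebook):
    """Recover the (lossy) continuous representative for each token."""
    return [codebook[t] for t in tokens]


# Hypothetical codebook of four 2-D entries (a real one would be learned).
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

# Hypothetical per-residue descriptors for a five-residue toy backbone.
features = [(0.1, -0.2), (0.9, 0.1), (0.2, 0.8), (1.2, 0.9), (0.4, 0.1)]

tokens = quantize(features, codebook)
print(tokens)  # [0, 1, 2, 3, 0]
```

A generative model operating on such a token sequence works over a small finite vocabulary instead of raw coordinates, which is the kind of compression the abstract refers to. How SaDiT builds its vocabulary and decodes tokens back to backbone coordinates is not specified in the text above.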
Taxonomy
Topics: Protein Structure and Dynamics · Scientific Computing and Data Management · Biochemical and Structural Characterization
