SiDGen: Structure-informed Diffusion for Generative modeling of Ligands for Proteins
Samyak Sanghvi, Nishant Ranjan, Tarak Karmakar

TL;DR
SiDGen introduces a structure-informed diffusion model for ligand generation that balances interaction detail and computational efficiency, enabling scalable and accurate structure-based drug design.
Contribution
SiDGen proposes a Topological Information Bottleneck (TIB) that uses learned soft assignment to compress residue-level protein representations, reducing computational cost while maintaining generative accuracy.
Findings
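Achieves state-of-the-art results on the CrossDocked2020 and DUD-E benchmarks.
Reduces memory and computational cost by running pairwise computations on a compressed coarse grid rather than at residue resolution.
Bridges the gap between the efficiency of sequence-only conditioning and the structural detail of pocket-aware conditioning.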
Abstract
Structure-based drug design (SBDD) faces a fundamental scaling-fidelity dilemma: rich pocket-aware conditioning captures interaction geometry but can be costly, often scaling quadratically or worse with protein length, while efficient sequence-only conditioning can miss key interaction structure. We propose SiDGen, a structure-informed discrete diffusion framework that resolves this trade-off through a Topological Information Bottleneck (TIB). SiDGen leverages a learned soft-assignment mechanism to compress residue-level protein representations into a compact bottleneck, enabling downstream pairwise computations on the coarse grid. This design reduces memory and computational cost without compromising generative accuracy. Our approach achieves state-of-the-art performance on the CrossDocked2020 and DUD-E benchmarks while significantly reducing pairwise-tensor…
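The soft-assignment compression described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `soft_pool` and `pairwise_entries`, the sizes `L` (residues), `K` (bottleneck slots), and `D` (feature width), and the mass-normalized averaging are all assumptions made for illustration, and random logits stand in for what would be a learned assignment layer.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over one list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def soft_pool(residue_feats, assign_logits):
    """Compress L residue vectors (L x D) into K slot vectors (K x D).

    Each residue distributes its features over K slots according to a
    softmax over its assignment logits (L x K). Each slot is the
    assignment-weighted mean of the residue features (an assumed choice).
    """
    assign = [softmax(row) for row in assign_logits]
    n_res = len(residue_feats)
    n_slots = len(assign_logits[0])
    n_dims = len(residue_feats[0])
    pooled = []
    for k in range(n_slots):
        mass = sum(assign[i][k] for i in range(n_res))
        pooled.append([
            sum(assign[i][k] * residue_feats[i][d] for i in range(n_res))
            / (mass + 1e-8)
            for d in range(n_dims)
        ])
    return pooled, assign


def pairwise_entries(n):
    """Number of entries in an n x n pairwise tensor (per channel)."""
    return n * n


random.seed(0)
L, K, D = 300, 16, 8  # hypothetical sizes, not taken from the paper
feats = [[random.gauss(0.0, 1.0) for _ in range(D)] for _ in range(L)]
# Placeholder logits; in a trained model these would come from a learned layer.
logits = [[random.gauss(0.0, 1.0) for _ in range(K)] for _ in range(L)]

pooled, assign = soft_pool(feats, logits)
print(len(pooled), len(pooled[0]))                    # 16 8
print(pairwise_entries(L), "vs", pairwise_entries(K))  # 90000 vs 256
```

With these assumed sizes, a pairwise tensor over residues has 90,000 entries per channel, while the same tensor over the 16 bottleneck slots has 256, which is the kind of saving the abstract attributes to computing on the coarse grid.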
Taxonomy
Topics: Protein Structure and Dynamics · Machine Learning in Materials Science · Computational Drug Discovery Methods
