MolChord: Structure-Sequence Alignment for Protein-Guided Drug Design
Wei Zhang, Zekun Guo, Yingce Xia, Peiran Jin, Shufang Xie, Tao Qin, Xiang-Yang Li

TL;DR
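MolChord is a structure-sequence alignment method for protein-guided drug design. It aligns protein and molecule structures with their textual descriptions and sequence representations (FASTA, SMILES), pairing the NatureLM autoregressive generator with a diffusion-based structure encoder, and uses a property-aware preference dataset to steer generated molecules toward desired properties.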
Contribution
The paper presents MolChord, a new framework combining text-based and structural representations with property-aware optimization for improved structure-based drug design.
Findings
Achieves state-of-the-art performance on CrossDocked2020
Effectively aligns protein and molecule structures with textual descriptions
Guides molecules toward desired pharmacological properties
Abstract
Structure-based drug design (SBDD), which maps target proteins to candidate molecular ligands, is a fundamental task in drug discovery. Effectively aligning protein structural representations with molecular representations, and ensuring alignment between generated drugs and their pharmacological properties, remains a critical challenge. To address these challenges, we propose MolChord, which integrates two key techniques: (1) to align protein and molecule structures with their textual descriptions and sequential representations (e.g., FASTA for proteins and SMILES for molecules), we leverage NatureLM, an autoregressive model unifying text, small molecules, and proteins, as the molecule generator, alongside a diffusion-based structure encoder; and (2) to guide molecules toward desired properties, we curate a property-aware dataset by integrating preference data and refine the alignment…
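As a rough illustration of the sequential representations mentioned in the abstract, the sketch below parses a FASTA record and pairs the resulting protein sequence with a ligand SMILES string. It is not MolChord's pipeline: the helper parse_fasta, the record name toy_target, and the toy amino-acid sequence are hypothetical and chosen only for illustration; the SMILES string is aspirin.

```python
def parse_fasta(text):
    """Parse FASTA text into a dict mapping each record header to its sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A line starting with '>' opens a new record.
            header = line[1:]
            records[header] = ""
        elif header is not None:
            # Sequence lines may wrap; concatenate them.
            records[header] += line
    return records


# Hypothetical toy inputs, not taken from the paper.
fasta_text = ">toy_target\nMKTAYIAKQR\nQISFVKSHFS\n"
ligand_smiles = "CC(=O)Oc1ccccc1C(=O)O"  # aspirin

proteins = parse_fasta(fasta_text)

# One protein-ligand pair expressed purely as text sequences.
pair = {"protein": proteins["toy_target"], "ligand": ligand_smiles}
```

Both inputs are plain strings, which is what lets a single autoregressive language model consume proteins and small molecules alongside ordinary text.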
Taxonomy
Topics: Computational Drug Discovery Methods · Protein Structure and Dynamics · Biomedical Text Mining and Ontologies
