Advancing biomolecular understanding and design following human instructions
Xiang Zhuang, Keyan Ding, Tianwen Lyu, Yinuo Jiang, Xiaotong Li, Zhuoyi Xiang, Zeyuan Wang, Ming Qin, Kehua Feng, Jike Wang, Qiang Zhang, Huajun Chen

TL;DR
InstructBioMol is a large language model that bridges natural language and biomolecular data, enabling human-guided design of drug molecules and enzymes with improved binding affinity and substrate prediction scores.
Contribution
The paper introduces InstructBioMol, a novel large language model that aligns natural language with biomolecular data for human-instructed biomolecule design.
Findings
Generated drug molecules with 10% better binding affinity.
Designed enzymes with a substrate prediction score of 70.4.
Demonstrated effective understanding and execution of human instructions.
Abstract
Understanding and designing biomolecules, such as proteins and small molecules, is central to advancing drug discovery, synthetic biology and enzyme engineering. Recent breakthroughs in artificial intelligence have revolutionized biomolecular research, achieving remarkable accuracy in biomolecular prediction and design. However, a critical gap remains between artificial intelligence's computational capabilities and researchers' intuitive goals, particularly in using natural language to bridge complex tasks with human intentions. Large language models have shown potential to interpret human intentions, yet their application to biomolecular research remains nascent due to challenges including specialized knowledge requirements, multimodal data integration, and semantic alignment between natural language and biomolecules. To address these limitations, we present InstructBioMol, a large…
Taxonomy
Topics: Genetics, Bioinformatics, and Biomedical Research · Innovative Teaching Methods · Biomedical and Engineering Education
Methods: Dilated Convolution · Hierarchical Feature Fusion · ALIGN · Pointwise Convolution · Efficient Spatial Pyramid
