ESM All-Atom: Multi-scale Protein Language Model for Unified Molecular Modeling
Kangjie Zheng (equal contribution), Siyu Long (equal contribution), Tianyu Lu, Junwei Yang, Xinyu Dai, Ming Zhang, Zaiqing Nie, Wei-Ying Ma, Hao Zhou

TL;DR
ESM All-Atom (ESM-AA) is a protein language model that unifies atom-scale and residue-scale molecular modeling, so that one model can serve both protein engineering and tasks involving proteins together with small molecules.
Contribution
The paper introduces ESM-AA, a multi-scale protein language model that operates at both atom and residue levels, overcoming previous limitations of residue-only models.
Findings
ESM-AA outperforms previous methods in protein-molecule tasks.
It captures relationships among residues and atoms effectively.
It retains protein understanding while gaining molecular knowledge.
Abstract
Protein language models have demonstrated significant potential in the field of protein engineering. However, current protein language models primarily operate at the residue scale, which limits their ability to provide information at the atom level. This limitation prevents us from fully exploiting the capabilities of protein language models for applications involving both proteins and small molecules. In this paper, we propose ESM-AA (ESM All-Atom), a novel approach that enables atom-scale and residue-scale unified molecular modeling. ESM-AA achieves this by pre-training on multi-scale code-switch protein sequences and utilizing a multi-scale position encoding to capture relationships among residues and atoms. Experimental results indicate that ESM-AA surpasses previous methods in protein-molecule tasks, demonstrating the full utilization of protein language models. Further…
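As a rough illustration of what a "multi-scale code-switch" sequence could look like, the sketch below expands a random subset of residues into their atoms and records, for every token, the index of the residue it belongs to. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `code_switch`, the `unzip_ratio` parameter, the `RESIDUE_ATOMS` table, the `residue:atom` token format, and the choice to keep the residue token next to its atoms are all hypothetical.

```python
import random

# Illustrative heavy-atom names for three residues (assumption: a real
# pipeline would read atoms and coordinates from a structure file).
RESIDUE_ATOMS = {
    "G": ["N", "CA", "C", "O"],
    "A": ["N", "CA", "C", "O", "CB"],
    "S": ["N", "CA", "C", "O", "CB", "OG"],
}


def code_switch(sequence, unzip_ratio, rng):
    """Build a mixed residue/atom token sequence (hypothetical helper).

    Returns (tokens, residue_index), where residue_index[k] is the position
    in `sequence` of the residue that token k belongs to.
    """
    n_unzip = round(len(sequence) * unzip_ratio)
    # Pick which residue positions get expanded into atoms.
    unzipped = set(rng.sample(range(len(sequence)), n_unzip))

    tokens, residue_index = [], []
    for i, res in enumerate(sequence):
        # Residue-scale token (kept here even when the residue is expanded;
        # this is an assumption of the sketch).
        tokens.append(res)
        residue_index.append(i)
        if i in unzipped:
            # Atom-scale tokens share the residue-scale position of their parent.
            for atom in RESIDUE_ATOMS[res]:
                tokens.append(f"{res}:{atom}")
                residue_index.append(i)
    return tokens, residue_index


tokens, residue_index = code_switch("GAS", unzip_ratio=1.0, rng=random.Random(0))
```

The shared `residue_index` is one simple way to give atoms and residues a common residue-scale position; an atom-scale position signal would additionally need structural information such as coordinates, which this sketch leaves out.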
Taxonomy
Topics: Machine Learning in Bioinformatics · Genomics and Phylogenetic Studies · Genetics, Bioinformatics, and Biomedical Research
