Boltz is a Strong Baseline for Atom-level Representation Learning
Hyosoon Jang, Hyunjin Seo, Yunhui Jang, Seonghyun Park, Sungsoo Ahn

TL;DR
This paper evaluates Boltz, a protein-focused model operating at atom-level granularity, for small-molecule tasks, demonstrating its competitive performance and establishing it as a strong baseline for atom-level molecular representation learning.
Contribution
The study evaluates Boltz's atom-level representations on small-molecule tasks, including ADMET property prediction and molecular generation and optimization, and positions Boltz as a strong baseline for atom-level representation learning.
Findings
Boltz performs competitively on ADMET property prediction.
Boltz is effective for molecular generation and optimization.
Protein-centric models can capture transferable chemical physics.
Abstract
Foundation models in molecular learning have advanced along two parallel tracks: protein models, which typically utilize evolutionary information to learn amino acid-level representations for folding, and small-molecule models, which focus on learning atom-level representations for property prediction tasks such as ADMET. Notably, cutting-edge protein-centric models such as Boltz now operate at atom-level granularity for protein-ligand co-folding, yet their atom-level expressiveness for small-molecule tasks remains unexplored. A key open question is whether these protein co-folding models capture transferable chemical physics or rely on protein evolutionary signals, which would limit their utility for small-molecule tasks. In this work, we investigate the quality of Boltz atom-level representations across diverse small-molecule benchmarks. Our results show that Boltz is competitive with…
Taxonomy
Topics: Machine Learning in Materials Science · Protein Structure and Dynamics · Computational Drug Discovery Methods
