Molecule3D: A Benchmark for Predicting 3D Geometries from Molecular Graphs
Zhao Xu, Youzhi Luo, Xuan Zhang, Xinyi Xu, Yaochen Xie, Meng Liu, Kaleb Dickerson, Cheng Deng, Maho Nakata, Shuiwang Ji

TL;DR
Molecule3D introduces a benchmark dataset and tools for predicting 3D molecular geometries from graphs, enabling more efficient property prediction without costly quantum calculations.
Contribution
The paper presents a large-scale dataset and software tools for machine learning-based 3D geometry prediction from molecular graphs, reducing computational costs.
Findings
Predictive methods achieve accuracy comparable to RDKit with less computation
Benchmark dataset of 4 million molecules enables extensive evaluation
Baseline models demonstrate effective geometry prediction
Abstract
Graph neural networks are emerging as promising methods for modeling molecular graphs, in which nodes and edges correspond to atoms and chemical bonds, respectively. Recent studies show that when 3D molecular geometries, such as bond lengths and angles, are available, molecular property prediction tasks can be made more accurate. However, computing 3D molecular geometries requires quantum calculations that are computationally prohibitive. For example, accurately calculating the 3D geometry of a small molecule requires hours of computing time using density functional theory (DFT). Here, we propose to predict the ground-state 3D geometries from molecular graphs using machine learning methods. To make this feasible, we develop a benchmark, known as Molecule3D, that includes a dataset with precise ground-state geometries of approximately 4 million molecules derived from DFT. We also…
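To make the terms in the abstract concrete, the following is a minimal sketch of how a molecular graph, a 3D geometry, and the derived bond lengths and angles relate to one another. It uses only the Python standard library and does not use the Molecule3D tools. The water-like coordinates, the "predicted" geometry, and the helper names (`bond_length`, `bond_angle`, `pairwise_distance_mae`) are all assumptions made for illustration. The pairwise-distance error is one plausible way to compare a predicted geometry against a reference, not necessarily the metric the benchmark itself reports.

```python
import math
from itertools import combinations

def bond_length(coords, i, j):
    """Euclidean distance between atoms i and j (same unit as coords)."""
    return math.dist(coords[i], coords[j])

def bond_angle(coords, i, center, j):
    """Angle i-center-j in degrees."""
    v1 = [a - c for a, c in zip(coords[i], coords[center])]
    v2 = [a - c for a, c in zip(coords[j], coords[center])]
    dot = sum(p * q for p, q in zip(v1, v2))
    cos_theta = dot / (math.hypot(*v1) * math.hypot(*v2))
    # Clamp to [-1, 1] to guard against floating-point drift.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_theta))))

def pairwise_distance_mae(coords_a, coords_b):
    """Mean absolute error over all interatomic distances.

    Comparing distances rather than raw coordinates makes the score
    independent of how each geometry is rotated or translated.
    """
    pairs = list(combinations(range(len(coords_a)), 2))
    total = sum(
        abs(math.dist(coords_a[i], coords_a[j]) - math.dist(coords_b[i], coords_b[j]))
        for i, j in pairs
    )
    return total / len(pairs)

# Molecular graph: nodes are atoms, edges are bonds (illustrative water-like molecule).
atoms = ["O", "H", "H"]
bonds = [(0, 1), (0, 2)]

# Hypothetical reference geometry in angstroms: O-H = 0.96, H-O-H = 104.5 degrees.
theta = math.radians(104.5)
reference = [
    (0.0, 0.0, 0.0),
    (0.96, 0.0, 0.0),
    (0.96 * math.cos(theta), 0.96 * math.sin(theta), 0.0),
]

# Hypothetical model output: one O-H bond is predicted slightly too long.
predicted = [
    (0.0, 0.0, 0.0),
    (1.00, 0.0, 0.0),
    (0.96 * math.cos(theta), 0.96 * math.sin(theta), 0.0),
]

for i, j in bonds:
    print(f"{atoms[i]}-{atoms[j]} length: {bond_length(reference, i, j):.3f}")
print(f"H-O-H angle: {bond_angle(reference, 1, 0, 2):.2f}")
print(f"distance MAE: {pairwise_distance_mae(reference, predicted):.4f}")
```

The graph (`atoms`, `bonds`) carries no spatial information on its own. The coordinates are what a geometry predictor would have to supply, and the bond lengths and angles follow from them.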
Taxonomy
Topics: Machine Learning in Materials Science · Computational Drug Discovery Methods · Protein Structure and Dynamics
