QMe14S, A Comprehensive and Efficient Spectral Dataset for Small Organic Molecules
Mingzhi Yuan, Zihan Zou, and Wei Hu

TL;DR
The paper introduces QMe14S, a spectral dataset of 186,102 small organic molecules, built to train machine learning models for molecular simulation and spectral prediction.
Contribution
It provides a dataset spanning 14 elements and 47 functional groups, with properties and spectra computed at the B3LYP/TZVP level, and demonstrates its effectiveness for training spectral prediction models.
Findings
Models trained on QMe14S outperform those trained on QM9S in spectral simulations.
QMe14S covers 186,102 molecules with extensive properties and spectra.
The dataset is intended to support machine learning protocols for molecular structure-property analysis.
Abstract
Developing machine learning protocols for molecular simulations requires comprehensive and efficient datasets. Here we introduce the QMe14S dataset, comprising 186,102 small organic molecules featuring 14 elements (H, B, C, N, O, F, Al, Si, P, S, Cl, As, Se, Br) and 47 functional groups. Using density functional theory at the B3LYP/TZVP level, we optimized the geometries and calculated properties including energy, atomic charge, atomic force, dipole moment, quadrupole moment, polarizability, octupole moment, first hyperpolarizability, and Hessian. At the same level, we obtained the harmonic IR, Raman and NMR spectra. Furthermore, we conducted ab initio molecular dynamics simulations to generate dynamic configurations and extract nonequilibrium properties, including energy, forces, and Hessians. By leveraging our E(3)-equivariant message-passing neural network (DetaNet), we demonstrated…
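A quick way to see whether a molecule falls inside the elemental scope described in the abstract is to compare its elements against the 14 listed there. The following is a minimal sketch, not part of the paper or of any QMe14S tooling: the names `QME14S_ELEMENTS`, `parse_formula`, and `uncovered_elements` are hypothetical, and the parser assumes a flat formula with no parentheses, charges, or isotope labels. Passing this check only means the elements are covered; it does not imply the molecule itself appears in the dataset.

```python
import re

# The 14 elements listed in the QMe14S abstract.
QME14S_ELEMENTS = frozenset(
    ["H", "B", "C", "N", "O", "F", "Al", "Si", "P", "S", "Cl", "As", "Se", "Br"]
)

# One element symbol (capital letter, optional lowercase letter) plus an optional count.
_TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def parse_formula(formula):
    """Parse a flat formula such as 'C2H6O' into {'C': 2, 'H': 6, 'O': 1}.

    Hypothetical helper: parentheses, charges, and isotopes are not handled.
    """
    counts = {}
    pos = 0
    while pos < len(formula):
        match = _TOKEN.match(formula, pos)
        if match is None:
            raise ValueError(f"cannot parse formula at position {pos}: {formula!r}")
        symbol, digits = match.groups()
        # A missing count means a single atom of that element.
        counts[symbol] = counts.get(symbol, 0) + (int(digits) if digits else 1)
        pos = match.end()
    return counts


def uncovered_elements(formula):
    """Return the sorted element symbols in `formula` outside the 14-element set."""
    return sorted(set(parse_formula(formula)) - QME14S_ELEMENTS)
```

For example, `uncovered_elements("CH3Br")` returns an empty list because C, H, and Br are all in the set, while `uncovered_elements("NaCl")` reports `Na` as out of scope.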
Taxonomy
Topics: Machine Learning in Materials Science · Inorganic Chemistry and Materials · History and advancements in chemistry
