CHAOS -- A Consistent Large-scale Database for Sigma-Profiles and Other Molecular Descriptors
Dominik Gond, Justus Arweiler, Thomas Specht, Hans Hasse, Fabian Jirasek

TL;DR
CHAOS is a large, internally consistent database of sigma-profiles and further quantum-chemical descriptors for 53,091 molecules, intended for solvent selection, thermodynamic modeling, and data-driven molecular design.
Contribution
Introduces a large, publicly available database of molecular descriptors generated with a single standardized quantum-chemical workflow, addressing the limited size and inconsistent quality of existing sigma-profile libraries.
Findings
Provides sigma-profiles for 53,091 molecules, over ten times more than previous databases.
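Includes further quantum-chemical observables: gas-phase geometries, single-point C-PCM data, infrared spectra, ideal-gas heat capacities and entropies, and NMR shielding tensors.
Keeps the data internally consistent by generating all entries with one standardized workflow at the ωB97X-D/def2-TZVP level of theory.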
Abstract
Sigma-profiles obtained from quantum-chemical calculations are key molecular descriptors for solvent selection, thermodynamic modeling, and data-driven molecular design. However, existing sigma-profile libraries are limited in size and inconsistent in quality, which restricts their utility. In this work, we introduce CHAOS (Computed High-Accuracy Observables and Sigma Profiles), a large-scale and internally consistent database providing sigma-profiles for 53,091 molecules, along with additional quantum-chemical observables including gas-phase geometries, single-point conductor-like polarizable continuum (C-PCM) data, infrared spectra, ideal-gas heat capacities and entropies, and atomic orbital nuclear magnetic resonance (NMR) shielding tensors. All data were generated using a standardized quantum-chemical workflow based on the ωB97X-D/def2-TZVP level of theory. The CHAOS database covers…
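
For readers new to the descriptor: a sigma-profile is commonly built as an area-weighted histogram of the screening charge densities on a molecule's continuum-solvation surface. The sketch below illustrates that idea only. The function name `sigma_profile`, the 51-point grid from -0.025 to 0.025 e/Å², and the nearest-bin assignment are assumptions chosen for illustration; they are not taken from the CHAOS workflow, which may use a different grid, averaging step, or binning rule.

```python
def sigma_profile(sigmas, areas, sigma_min=-0.025, sigma_max=0.025, n_bins=51):
    """Area-weighted histogram of surface screening charge densities.

    sigmas: screening charge density of each surface segment (e/Å^2)
    areas:  area of each surface segment (Å^2)
    Returns (centers, profile); profile[k] is the surface area assigned to bin k.
    Grid limits and bin count are illustrative assumptions, not CHAOS settings.
    """
    if len(sigmas) != len(areas):
        raise ValueError("sigmas and areas must have the same length")

    step = (sigma_max - sigma_min) / (n_bins - 1)
    centers = [sigma_min + k * step for k in range(n_bins)]
    profile = [0.0] * n_bins

    for sigma, area in zip(sigmas, areas):
        # Assign the whole segment to the nearest grid point (simplest choice);
        # values outside the grid are clamped to the first or last bin.
        k = round((sigma - sigma_min) / step)
        k = min(max(k, 0), n_bins - 1)
        profile[k] += area

    return centers, profile


# Hypothetical three-segment surface, for illustration only.
centers, profile = sigma_profile([0.0, 0.0102, -0.03], [1.0, 2.0, 0.5])
```

Because every segment's area lands in exactly one bin, the profile sums to the total surface area, which is a quick sanity check when comparing profiles from different sources.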
