TL;DR
This paper presents an iterative high-throughput workflow for geometry optimizations that preserve the intended bonding connectivity of molecules in chemical space explorations, improving the reliability of quantum chemical datasets such as QM9.
Contribution
The authors develop a connectivity-preserving geometry optimization workflow, based on DFT and post-DFT methods, that repairs geometries of questionable stability in large quantum chemistry datasets and flags the molecules that remain unstable.
Findings
Successfully troubleshot 2,988 of the 3,054 QM9 molecules with questionable geometric stability.
Identified 66 unstable molecules, including strained and nitrogen-nitroso compounds.
Inspected molecules with long CC bonds in the curated dataset and identified candidates with ultralong bonds.
Abstract
A key challenge in automated chemical compound space explorations is ensuring veracity in minimum energy geometries, so that intended bonding connectivities are preserved. We discuss an iterative high-throughput workflow for connectivity-preserving geometry optimizations exploiting the nearness between quantum mechanical models. The methodology is benchmarked on the QM9 dataset comprising DFT-level properties of 133,885 small molecules, of which 3,054 have questionable geometric stability. We successfully troubleshoot 2,988 molecules and ensure a bijective mapping between desired Lewis formulae and final geometries. Our workflow, based on DFT and post-DFT methods, identifies 66 molecules as unstable; 52 contain , the rest are strained due to pyramidal sp C. In the curated dataset, we inspect molecules with long CC bonds and identify ultralong contestants (~Å)…
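The connectivity test at the heart of such a workflow can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes bonds are perceived from interatomic distances using covalent radii and a hypothetical tolerance factor `scale`, and it represents each level of theory as a hypothetical callable in `optimizers` that returns optimized coordinates. The function names are illustrative only.

```python
from itertools import combinations
from math import dist

# Covalent radii in angstrom (Cordero et al. 2008) for the elements in QM9.
COVALENT_RADII = {"H": 0.31, "C": 0.76, "N": 0.71, "O": 0.66, "F": 0.57}


def bonds_from_geometry(symbols, coords, scale=1.3):
    """Perceive bonds as atom pairs closer than scale * (r_i + r_j).

    `scale` is an assumed tolerance, not a value taken from the paper.
    """
    bonds = set()
    for i, j in combinations(range(len(symbols)), 2):
        cutoff = scale * (COVALENT_RADII[symbols[i]] + COVALENT_RADII[symbols[j]])
        if dist(coords[i], coords[j]) < cutoff:
            bonds.add((i, j))  # i < j is guaranteed by combinations()
    return bonds


def preserves_connectivity(expected_bonds, symbols, coords, scale=1.3):
    """True if the geometry has exactly the bonds of the intended Lewis formula."""
    expected = {tuple(sorted(bond)) for bond in expected_bonds}
    return bonds_from_geometry(symbols, coords, scale) == expected


def optimize_with_fallback(expected_bonds, symbols, coords, optimizers):
    """Try each (hypothetical) optimizer in order.

    Returns the first optimized geometry that keeps the intended bonds,
    or None if every level of theory changes the connectivity.
    """
    for optimize in optimizers:
        candidate = optimize(symbols, coords)
        if preserves_connectivity(expected_bonds, symbols, candidate):
            return candidate
    return None
```

In this sketch a molecule for which `optimize_with_fallback` returns `None` would be the analogue of one flagged as unstable. A production workflow would more likely compare full molecular graphs, including bond orders, than bare distance-based bond sets.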
