# PubChemQC PM6: A dataset of 221 million molecules with optimized molecular geometries and electronic properties

**Authors:** Maho Nakata, Tomomi Shimazaki, Masatomo Hashimoto, Toshiyuki Maeda

arXiv: 1904.06046 · 2020-11-10

## TL;DR

This paper introduces the PubChemQC PM6 dataset: PM6-optimized geometries and electronic properties for 221 million molecules, a total that counts neutral, cationic, anionic, and spin-flipped states of compounds cataloged in PubChem.

## Contribution

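The paper presents what the authors report as the largest dataset of optimized molecular geometries and electronic properties calculated with the PM6 method. It covers 92.9% of the 91.2 million PubChem Compounds entries retrieved on Aug. 29, 2016, and adds cationic, anionic, and spin-flipped states for a subset of those molecules.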

## Key findings

- Largest PM6-calculated dataset of its kind, per the authors, with a grand total of 221 million molecules
- Includes cationic, anionic, and spin-flipped states for 56.2%, 49.7%, and 41.3% of the molecules, respectively, in addition to neutral states
- Freely available under CC BY 4.0 license

## Abstract

We report on the largest dataset of optimized molecular geometries and electronic properties calculated by the PM6 method for 92.9% of the 91.2 million molecules cataloged in PubChem Compounds retrieved on Aug. 29, 2016. In addition to neutral states, we also calculated those for cationic, anionic, and spin flipped electronic states of 56.2%, 49.7%, and 41.3% of the molecules, respectively. Thus, the grand total calculated is 221 million molecules. The dataset is available at http://pubchemqc.riken.jp/pm6_dataset.html under the Creative Commons Attribution 4.0 International license.
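The coverage figures in the abstract can be turned into approximate per-state counts. The sketch below is only a back-of-the-envelope check, not the authors' accounting: it assumes every percentage is taken relative to the full 91.2 million PubChem entries, which the abstract does not state explicitly for the cationic, anionic, and spin-flipped states. The names `TOTAL_PUBCHEM_MILLIONS`, `COVERAGE_PERCENT`, and `estimate_counts` are hypothetical and introduced here for illustration.

```python
# Figures quoted in the abstract (counts in millions of molecules).
TOTAL_PUBCHEM_MILLIONS = 91.2

# Percentage of molecules calculated in each electronic state.
# ASSUMPTION: all four percentages share the 91.2 million base.
COVERAGE_PERCENT = {
    "neutral": 92.9,
    "cation": 56.2,
    "anion": 49.7,
    "spin_flipped": 41.3,
}


def estimate_counts(total_millions, coverage_percent):
    """Return the estimated number of calculations (in millions) per state."""
    return {
        state: total_millions * percent / 100.0
        for state, percent in coverage_percent.items()
    }


counts = estimate_counts(TOTAL_PUBCHEM_MILLIONS, COVERAGE_PERCENT)
grand_total = sum(counts.values())

for state, millions in counts.items():
    print(f"{state:>12}: {millions:6.1f} million")
print(f"{'total':>12}: {grand_total:6.1f} million")
```

Under this assumption the estimate comes to roughly 219 million, close to but slightly below the reported grand total of 221 million, so the percentages may be defined against a somewhat different base or the total may include entries not captured by these four rounded figures.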

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1904.06046/full.md

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/1904.06046/full.md

## References

59 references — full list in the complete paper: https://tomesphere.com/paper/1904.06046/full.md

---
Source: https://tomesphere.com/paper/1904.06046