Intrinsic Dimensionality of Molecular Properties
Ali Banjafar, Guido Falk von Rudorff

TL;DR
This paper demonstrates that accepting minimal error allows for a significant reduction in the intrinsic dimensionality of molecular properties, enhancing data efficiency and transferability of machine learning models in chemistry.
Contribution
The paper introduces a method to quantify an upper bound on the intrinsic dimensionality of molecular properties, taking all continuous variables into account, and shows that this dimensionality is stable across molecules.
Findings
Intrinsic dimensionality is stable across different molecules.
Accepting small errors drastically reduces the effective dimensionality.
The feature space of molecular representations can be further compressed.
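To make the second finding concrete, here is a minimal sketch of how an error tolerance can shrink an effective dimension. It is not the paper's method: it substitutes plain PCA (via singular values) on synthetic data, and the function name `effective_dimension`, the tolerance `rel_tol`, and the data shapes are all hypothetical choices made for illustration.

```python
import numpy as np

def effective_dimension(X, rel_tol):
    """Smallest number of principal components that keeps the relative
    Frobenius reconstruction error of the centered data at or below rel_tol.
    Illustrative PCA stand-in, not the estimator used in the paper."""
    Xc = X - X.mean(axis=0)
    s2 = np.linalg.svd(Xc, compute_uv=False) ** 2
    # tail[k] = squared error left after keeping the first k components;
    # the appended 0.0 is the exact error when all components are kept.
    tail = np.concatenate((np.cumsum(s2[::-1])[::-1], [0.0]))
    rel_err = np.sqrt(tail / tail[0])
    # argmax on a boolean array returns the first index that is True
    return int(np.argmax(rel_err <= rel_tol))

# Synthetic data (assumption): 200 samples in 30 dimensions that lie close
# to a 3-dimensional linear subspace, plus weak isotropic noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 30))
X = latent @ mixing + 1e-3 * rng.normal(size=(200, 30))

dim_exact = effective_dimension(X, rel_tol=0.0)    # no error accepted
dim_relaxed = effective_dimension(X, rel_tol=0.01) # 1% relative error accepted
print(dim_exact, dim_relaxed)
```

With no error accepted, the noise forces every one of the 30 ambient directions to be kept, while a 1% tolerance recovers only the 3 latent directions. Real molecular properties depend nonlinearly on their variables, so a linear tool like PCA would at best give a loose upper bound there.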
Abstract
Chemical space, which encompasses all stable compounds, is unfathomably large, and its dimension scales linearly with the number of atoms considered. The success of machine learning methods suggests that many physical quantities exhibit substantial redundancy in that space, lowering their effective dimensionality. A low dimensionality is favorable for machine learning applications, as it reduces the required number of data points. It is unknown, however, how far the dimensionality of physical properties can be reduced, how this depends on the exact physical property considered, and how accepting a model error can help reduce the dimensionality further. We show that accepting a modest, nearly negligible error leads to a drastic reduction in independent degrees of freedom. This applies to several properties such as the total energy and frontier orbital energies for a wide range of neutral…
