Resolving transition metal chemical space: feature selection for machine learning and structure-property relationships
Jon Paul Janet, Heather J. Kulik

TL;DR
This paper introduces revised autocorrelation functions (RACs) for machine learning in transition metal chemistry, demonstrating improved accuracy in predicting properties like spin-state splitting, bond lengths, and redox potentials with smaller feature sets.
Contribution
The study develops RACs tailored for inorganic chemistry and systematically compares feature selection methods, achieving high accuracy with reduced feature sets for various properties.
Findings
On an organic molecule set, standard ACs outperform other presently available topological descriptors for ML model training.
ML models with selected features achieve 1 kcal/mol accuracy in spin-state splitting.
Feature importance varies with property, highlighting local electronic and steric effects.
Abstract
Machine learning (ML) of quantum mechanical properties shows promise for accelerating chemical discovery. For transition metal chemistry where accurate calculations are computationally costly and available training data sets are small, the molecular representation becomes a critical ingredient in ML model predictive accuracy. We introduce a series of revised autocorrelation functions (RACs) that encode relationships between the heuristic atomic properties (e.g., size, connectivity, and electronegativity) on a molecular graph. We alter the starting point, scope, and nature of the quantities evaluated in standard ACs to make these RACs amenable to inorganic chemistry. On an organic molecule set, we first demonstrate superior standard AC performance to other presently-available topological descriptors for ML model training, with mean unsigned errors (MUEs) for atomization energies on…
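To make the idea of an autocorrelation on a molecular graph concrete, here is a minimal sketch of a standard (unrevised) AC. It assumes the common Moreau-Broto form, in which the products of an atomic property are summed over all ordered atom pairs separated by exactly d bonds. The function names, the adjacency-dictionary input, and the water example are illustrative assumptions, not the authors' code, and the RAC modifications to starting point and scope described in the abstract are not implemented here.

```python
from collections import deque


def graph_distances(adjacency, start):
    """Breadth-first search: number of bonds from `start` to each reachable atom."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        atom = queue.popleft()
        for neighbor in adjacency[atom]:
            if neighbor not in dist:
                dist[neighbor] = dist[atom] + 1
                queue.append(neighbor)
    return dist


def autocorrelation(adjacency, prop, depth):
    """Sum prop[i] * prop[j] over ordered atom pairs exactly `depth` bonds apart.

    Assumption: ordered pairs are counted, so depth 0 gives the sum of squares
    and every i != j pair contributes twice.
    """
    total = 0.0
    for i in adjacency:
        for j, d in graph_distances(adjacency, i).items():
            if d == depth:
                total += prop[i] * prop[j]
    return total


# Hypothetical toy input: water, with atom 0 = O and atoms 1, 2 = H.
adjacency = {0: [1, 2], 1: [0], 2: [0]}

# Connectivity (number of bonded neighbors) as the heuristic atomic property.
connectivity = {atom: len(neighbors) for atom, neighbors in adjacency.items()}

# One feature per depth; the vector over depths 0..2 describes the molecule.
features = [autocorrelation(adjacency, connectivity, d) for d in range(3)]
```

Other heuristic properties named in the abstract, such as size or electronegativity, can be supplied by passing a different `prop` dictionary; concatenating the resulting vectors gives a fixed-length feature set whose size depends only on the number of properties and the maximum depth.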
