Operator-induced structural variable selection for identifying materials genes
Shengbin Ye, Thomas P. Senftle, Meng Li

TL;DR
This paper introduces a variable selection method that exploits operator-induced structure to identify physically meaningful materials descriptors, reporting better accuracy and speed than existing techniques, with an application in catalysis research.
Contribution
The paper proposes a variable selection method based on operator-induced structure that enables fast, accurate dimension reduction in high-dimensional materials informatics problems.
Findings
The method outperforms existing techniques in both speed and accuracy.
Robust performance in high-dimensional settings.
Identifies physical descriptors explaining catalytic binding energy.
Abstract
In the emerging field of materials informatics, a fundamental task is to identify physicochemically meaningful descriptors, or materials genes, which are engineered from primary features and a set of elementary algebraic operators through compositions. Standard practice directly analyzes the high-dimensional candidate predictor space in a linear model; statistical analyses are then substantially hampered by the daunting challenge posed by the astronomically large number of correlated predictors with limited sample size. We formulate this problem as variable selection with operator-induced structure (OIS) and propose a new method to achieve unconventional dimension reduction by utilizing the geometry embedded in OIS. Although the model remains linear, we iterate nonparametric variable selection for effective dimension reduction. This enables variable selection based on ab initio primary…
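To make the scale of the candidate space concrete, the following is a minimal sketch of how descriptors can be engineered from primary features by composing elementary operators, as the abstract describes. It is not the authors' implementation: the operator sets `UNARY` and `BINARY`, the helper `compose_once`, and the toy feature values are all hypothetical and chosen only for illustration.

```python
import math
from itertools import combinations

# Hypothetical operator sets; the abstract only says "elementary algebraic operators".
UNARY = {
    "sq": lambda v: v * v,
    "exp": math.exp,
}
BINARY = {
    "+": lambda a, b: a + b,
    "*": lambda a, b: a * b,
}


def compose_once(features):
    """Apply every operator once to the given features.

    `features` maps a descriptor name to its list of sample values.
    Returns a new dict holding the inputs plus all one-step compositions.
    """
    out = dict(features)
    # Unary operators act on each feature column separately.
    for op_name, op in UNARY.items():
        for name, col in features.items():
            out[f"{op_name}({name})"] = [op(v) for v in col]
    # Binary operators act on every unordered pair of distinct features.
    for op_name, op in BINARY.items():
        for (n1, c1), (n2, c2) in combinations(features.items(), 2):
            out[f"({n1}{op_name}{n2})"] = [op(a, b) for a, b in zip(c1, c2)]
    return out


# Toy primary features (made-up values, two samples each).
primary = {"x1": [1.0, 2.0], "x2": [3.0, 4.0], "x3": [0.5, 0.25]}
level1 = compose_once(primary)  # one round of composition
level2 = compose_once(level1)   # a second round on top of the first
print(len(primary), len(level1), len(level2))
```

With three primary features and four operators, a single round already yields 15 candidates, and each further round multiplies the count while the sample size stays fixed. Many of the generated columns are also strongly correlated with their parents, which is the setting the abstract describes as hampering a direct linear-model analysis.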
Taxonomy
Topics: Machine Learning in Materials Science · Computational Drug Discovery Methods · Molecular Biology Techniques and Applications
