Composition-Weighted Symbolic Regression for General-Purpose Property Prediction
Yang Huang, Jingrun Chen

TL;DR
This paper presents an interpretable symbolic regression framework that predicts materials properties directly from chemical composition. It jointly learns analytical functional forms and elemental weightings, and it enforces physical constraints.
Contribution
It introduces a composition-weighted symbolic regression method that unifies regression and classification, enforces physical constraints through max/min operators, and searches efficiently with a hybrid Monte Carlo tree search and genetic programming algorithm.
Findings
Achieves accuracy competitive with state-of-the-art black-box models on MatBench tasks while yielding explicit analytical expressions.
Produces chemically meaningful elemental weights and smooth property trends.
Enforces physical constraints such as non-negative band gaps and bounded classification probabilities.
Abstract
We introduce a composition-weighted symbolic regression framework for interpretable prediction of materials properties directly from chemical composition. The method jointly learns analytical functional forms and task-dependent elemental weightings without predefined descriptors. By incorporating max/min operators, it naturally enforces constraints such as non-negative band gaps and bounded classification probabilities, unifying regression and classification tasks. Efficient search is achieved through a hybrid Monte Carlo tree search–genetic programming algorithm with gradient-based refinement and parallel computation. Benchmarks on MatBench tasks show competitive accuracy relative to state-of-the-art black-box models while yielding explicit analytical expressions. Applied to III–V semiconductor alloys, the model produces smooth composition-dependent trends and learned elemental…
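The way max/min operators can enforce output constraints on a composition-weighted expression may be easier to see in a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the elemental weights in `ELEMENT_WEIGHTS`, the linear placeholder formulas, and the function names are all hypothetical. In the described method, both the weights and the analytical form would be learned rather than fixed by hand.

```python
from typing import Dict

# Hypothetical elemental weights, chosen only for illustration.
# In the described framework these would be learned per task.
ELEMENT_WEIGHTS: Dict[str, float] = {"Ga": 1.0, "In": 0.25, "As": 0.5, "Sb": 0.125}


def weighted_feature(composition: Dict[str, float], weights: Dict[str, float]) -> float:
    """Composition-weighted average: sum of x_i * w_i over normalized fractions x_i."""
    total = sum(composition.values())
    return sum((amount / total) * weights[element] for element, amount in composition.items())


def predict_band_gap(composition: Dict[str, float]) -> float:
    """Regression output, kept non-negative with an outer max operator."""
    z = weighted_feature(composition, ELEMENT_WEIGHTS)
    raw = 4.0 * z - 2.0  # placeholder analytical form (assumption)
    return max(0.0, raw)


def predict_probability(composition: Dict[str, float]) -> float:
    """Classification output, kept inside [0, 1] with nested min/max operators."""
    z = weighted_feature(composition, ELEMENT_WEIGHTS)
    raw = 2.0 * z - 0.5  # placeholder analytical form (assumption)
    return min(1.0, max(0.0, raw))


if __name__ == "__main__":
    for name, comp in [("GaAs", {"Ga": 1, "As": 1}), ("InSb", {"In": 1, "Sb": 1})]:
        print(name, predict_band_gap(comp), predict_probability(comp))
```

Because the clamp is part of the expression itself, the same functional template serves both a regression target (non-negative) and a classification target (bounded in [0, 1]), which is one way to read the abstract's claim that the two task types are unified.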
