Establishing Deep InfoMax as an effective self-supervised learning methodology in materials informatics
Michael Moran, Vladimir V. Gusev, Michael W. Gaultois, Dmytro Antypov, Matthew J. Rosseinsky

TL;DR
This paper demonstrates that Deep InfoMax, a self-supervised learning method, effectively enhances property prediction in materials informatics by leveraging unlabeled crystal data to improve models trained on small datasets.
Contribution
The study introduces Deep InfoMax pretraining for materials informatics, improving property prediction when labeled data are limited. Pretraining needs no property labels and does not require the model to reconstruct crystal structures.
Findings
Deep InfoMax improves property prediction accuracy on small datasets.
Pretraining with Deep InfoMax enhances transfer learning for band gap and formation energy.
The approach effectively leverages unlabeled crystal data for supervised tasks.
Abstract
The scarcity of property labels remains a key challenge in materials informatics, whereas materials data without property labels are abundant in comparison. By pretraining supervised property prediction models on self-supervised tasks that depend only on the "intrinsic information" available in any Crystallographic Information File (CIF), there is potential to leverage the large amount of crystal data without property labels to improve property prediction results on small datasets. We apply Deep InfoMax as a self-supervised machine learning framework for materials informatics that explicitly maximises the mutual information between a point set (or graph) representation of a crystal and a vector representation suitable for downstream learning. This allows the pretraining of supervised models on large materials datasets without the need for property labels and without requiring the model…
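As a rough illustration of what "maximising mutual information" between a point set and a vector can look like in practice, the sketch below computes a Jensen-Shannon-style lower bound of the kind commonly used with Deep InfoMax. The truncated abstract above does not state which estimator or critic the authors use, so the dot-product critic, the function names (`softplus`, `dot`, `jsd_mi_lower_bound`), and the toy feature vectors are all assumptions made for illustration only.

```python
import math


def softplus(z):
    # Numerically stable log(1 + exp(z)).
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def dot(u, v):
    # Plain dot product, used here as an assumed (hypothetical) critic.
    return sum(a * b for a, b in zip(u, v))


def jsd_mi_lower_bound(local_feats, global_vecs):
    """Jensen-Shannon-style mutual information estimate.

    local_feats[i]  : list of per-atom feature vectors for crystal i
    global_vecs[i]  : pooled vector representation of crystal i
    Positive pairs take atoms and the vector from the same crystal;
    negative pairs take atoms from a different crystal. Needs >= 2 crystals.
    """
    n = len(global_vecs)
    pos, neg = [], []
    for i in range(n):
        for j in range(n):
            for h in local_feats[j]:
                score = dot(h, global_vecs[i])
                if i == j:
                    pos.append(-softplus(-score))
                else:
                    neg.append(softplus(score))
    return sum(pos) / len(pos) - sum(neg) / len(neg)


# Toy data: two "crystals" whose atom features align with their own vector.
local_feats = [[[2.0, 0.0], [3.0, 0.0]], [[0.0, 2.0], [0.0, 3.0]]]
matched = jsd_mi_lower_bound(local_feats, [[1.0, 0.0], [0.0, 1.0]])
swapped = jsd_mi_lower_bound(local_feats, [[0.0, 1.0], [1.0, 0.0]])
```

In this toy setup the estimate is higher when each vector is paired with its own crystal (`matched`) than when the vectors are swapped (`swapped`). During pretraining, an encoder would be trained to increase this quantity, so no property labels are involved.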
Taxonomy
Topics: Manufacturing Process and Optimization · Machine Learning in Materials Science
Methods: Sparse Evolutionary Training
