Is Self-Supervised Pretraining Good for Extrapolation in Molecular Property Prediction?
Shun Takashige, Masatoshi Hanai, Toyotaro Suzumura, Limin Wang, and Kenjiro Taura

TL;DR
This paper investigates whether self-supervised pretraining improves the ability of machine learning models to extrapolate to unobserved material property values. It finds that pretraining helps models learn relative property tendencies, even though they still cannot predict exact values accurately.
Contribution
The study provides an experimental framework demonstrating that self-supervised pretraining enhances relative extrapolation in material property prediction tasks.
Findings
Self-supervised pretraining improves models' ability to learn relative property tendencies.
Models still struggle with accurate absolute property extrapolation.
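Pretraining improves generalization beyond observed data in this relative sense: tendencies among unobserved property values are captured better than their exact magnitudes.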
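The distinction between relative and absolute extrapolation in the findings above can be illustrated with a small sketch. This is not the paper's evaluation protocol: the value-sorted split, the choice of Spearman rank correlation for "relative tendency" and mean absolute error for "absolute accuracy", and all function names and toy numbers are assumptions made for illustration only.

```python
from statistics import mean


def extrapolation_split(values, train_fraction=0.8):
    """Hypothetical split: train on the lowest-valued samples and
    test on the highest-valued ones, so the test range is unobserved."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    cut = int(len(order) * train_fraction)
    return order[:cut], order[cut:]


def ranks(xs):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    result = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(a, b):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mean_a, mean_b = mean(a), mean(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def spearman(a, b):
    """Rank correlation: measures whether the ordering is right."""
    return pearson(ranks(a), ranks(b))


def mae(y_true, y_pred):
    """Mean absolute error: measures whether the values are right."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


# Toy, made-up numbers: predictions keep the correct ordering of the
# held-out samples but are compressed toward the training range.
y_true = [10.0, 12.0, 15.0, 19.0, 24.0]
y_pred = [6.0, 6.5, 7.0, 7.8, 8.5]

relative_score = spearman(y_true, y_pred)
absolute_error = mae(y_true, y_pred)
```

In this toy case the rank correlation is perfect while the mean absolute error is large, which is the pattern the findings describe: the ordering of unseen values is recovered even when their magnitudes are not.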
Abstract
The prediction of material properties plays a crucial role in the development and discovery of materials in diverse applications, such as batteries, semiconductors, catalysts, and pharmaceuticals. Recently, there has been a growing interest in employing data-driven approaches by using machine learning technologies, in combination with conventional theoretical calculations. In material science, the prediction of unobserved values, commonly referred to as extrapolation, is particularly critical for property prediction as it enables researchers to gain insight into materials beyond the limits of available data. However, even with the recent advancements in powerful machine learning models, accurate extrapolation is still widely recognized as a significantly challenging problem. On the other hand, self-supervised pretraining is a machine learning technique where a model is first trained on…
Taxonomy
Topics: Machine Learning in Materials Science · X-ray Diffraction in Crystallography · Computational Drug Discovery Methods
