MELT: Materials-aware Continued Pre-training for Language Model Adaptation to Materials Science
Junho Kim, Yeachan Kim, Jun-Hyung Park, Yerim Oh, Suho Kim, SangKeun Lee

TL;DR
MELT is a materials-aware continued pre-training method that enhances language models for materials science by integrating domain knowledge and curriculum learning, leading to improved representation and performance.
Contribution
This paper introduces MELT, a novel adaptation strategy that combines knowledge graph construction and curriculum-based training for better materials science language modeling.
Findings
MELT outperforms existing pre-training methods on multiple benchmarks.
It effectively captures materials entities and concepts.
It demonstrates broad applicability across materials science tasks.
Abstract
We introduce a novel continued pre-training method, MELT (MatEriaLs-aware continued pre-Training), specifically designed to efficiently adapt pre-trained language models (PLMs) for materials science. Unlike previous adaptation strategies that focus solely on constructing a domain-specific corpus, MELT comprehensively considers both the corpus and the training strategy, given that the materials science corpus has distinct characteristics from other domains. To this end, we first construct a comprehensive materials knowledge base from the scientific corpus by building semantic graphs. Leveraging this extracted knowledge, we integrate a curriculum into the adaptation process that begins with familiar and generalized concepts and progressively moves toward more specialized terms. We conduct extensive experiments across diverse benchmarks to verify the effectiveness and generality of MELT. A…
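The general-to-specialized curriculum described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not say how MELT scores concepts, so the sketch assumes that an entity's degree in a sentence-level co-occurrence graph is a rough proxy for how general it is. The function names `build_cooccurrence_graph` and `curriculum_stages`, and the example entities, are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence_graph(sentences):
    """Link every pair of entities that appear in the same sentence."""
    graph = defaultdict(set)
    for entities in sentences:
        unique = set(entities)
        for entity in unique:
            graph[entity]  # register the entity even if it has no neighbours
        for a, b in combinations(sorted(unique), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def curriculum_stages(graph, num_stages):
    """Rank entities from most to least connected, then split into stages.

    Assumption: a higher degree means a more general concept. Ties are
    broken alphabetically so the ordering is deterministic.
    """
    ranked = sorted(graph, key=lambda e: (-len(graph[e]), e))
    stage_size = max(1, -(-len(ranked) // num_stages))  # ceiling division
    return [ranked[i:i + stage_size] for i in range(0, len(ranked), stage_size)]


# Toy input: each inner list holds the entities found in one sentence.
sentences = [
    ["oxide", "perovskite"],
    ["oxide", "cathode"],
    ["perovskite", "CsPbBr3"],
    ["cathode", "LiCoO2"],
]
graph = build_cooccurrence_graph(sentences)
stages = curriculum_stages(graph, num_stages=2)
print(stages)
```

In this toy input the broad terms (oxide, perovskite, cathode) each have two neighbours and land in the first stage, while the specific compounds (CsPbBr3, LiCoO2) have one neighbour each and land in the second. A training loop would then expose the model to stage-one concepts before stage-two concepts.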
Taxonomy
Topics: Machine Learning in Materials Science
Methods: Balanced Selection · Focus
