SCALAR: Quantifying Structural Hallucination, Consistency, and Reasoning Gaps in Materials Foundation Models
Can Polat, Erchin Serpedin, Mustafa Kurban, Hasan Kurban

TL;DR
SCALAR is a benchmark that evaluates how well materials foundation models generalize across length scales, reason about structures, and maintain consistency, revealing model-dependent behaviors and limitations of current models.
Contribution
This work presents SCALAR, a benchmark for assessing geometric scale generalization, structural reasoning, and hallucination in materials foundation models, and shows that behavior on these axes varies from model to model.
Findings
Models show large, model-dependent shifts under explicit reasoning.
Explicit reasoning often reduces hallucination and error.
Scale generalization cannot be inferred from accuracy alone.
Abstract
Large language models are increasingly applied to materials science reasoning, yet their behavior under physically structured distribution shifts remains poorly understood. We introduce SCALAR (Structural Consistency And Logic Across Regimes), a benchmark for evaluating geometric scale generalization and its connection to structural hallucination, consistency, and reasoning in materials foundation models. Given canonical crystal representations, models must reason about derived nanoparticle structures obtained through supercell expansion and geometric truncation across length scales spanning a few atoms to over 18,000 atoms, totaling 100,000 structures from DFT-validated unit cells. SCALAR defines three tasks. (i) CIF to property prediction. (ii) A Chain-of-Thought variant with explicit physics-grounded reasoning. (iii) Inverse retrieval identifying crystals from candidates…
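To make the structure-generation step in the abstract concrete, the sketch below tiles a unit cell into a supercell and then cuts a nanoparticle out of it. It is an illustration under stated assumptions, not the authors' pipeline: the abstract says only "supercell expansion and geometric truncation", so the spherical cutoff, the function names `build_supercell` and `truncate_sphere`, and the simple cubic cell with a 4.0 Å lattice constant are all hypothetical choices made for this example.

```python
import numpy as np


def build_supercell(frac_coords, lattice, reps):
    """Tile a unit cell `reps` times along each lattice vector.

    frac_coords: (n_atoms, 3) fractional coordinates in the unit cell.
    lattice:     (3, 3) matrix whose rows are the lattice vectors a, b, c.
    Returns Cartesian coordinates of shape (n_atoms * reps**3, 3).
    """
    frac_coords = np.asarray(frac_coords, dtype=float)
    lattice = np.asarray(lattice, dtype=float)
    # Integer translations (i, j, k) for every repeated cell.
    offsets = np.array(
        [(i, j, k) for i in range(reps) for j in range(reps) for k in range(reps)],
        dtype=float,
    )
    # Add each translation to each atom, still in fractional units.
    tiled = (frac_coords[None, :, :] + offsets[:, None, :]).reshape(-1, 3)
    # Convert fractional coordinates to Cartesian coordinates.
    return tiled @ lattice


def truncate_sphere(cart_coords, radius):
    """Keep atoms within `radius` of the geometric center (assumed cutoff shape)."""
    center = cart_coords.mean(axis=0)
    dist = np.linalg.norm(cart_coords - center, axis=1)
    # Small tolerance so atoms lying exactly on the surface are kept.
    return cart_coords[dist <= radius + 1e-9]


# Hypothetical simple cubic cell: one atom at the origin, a = 4.0 Angstrom.
a = 4.0
lattice = a * np.eye(3)
unit_cell = [[0.0, 0.0, 0.0]]

supercell = build_supercell(unit_cell, lattice, reps=5)
small = truncate_sphere(supercell, radius=1.0 * a)
larger = truncate_sphere(supercell, radius=1.5 * a)
print(len(supercell), len(small), len(larger))
```

Raising `reps` and `radius` together yields particles of increasing atom count from the same unit cell, which is the kind of scale sweep the benchmark describes. A real pipeline would also need to carry element labels through both steps and read the unit cell from a CIF file.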
Taxonomy
Topics: Machine Learning in Materials Science · Computational Drug Discovery Methods · Topological and Geometric Data Analysis
