Miller-Index-Based Latent Crystallographic Fracture Plane Reasoning with Vision-Language Models
Qinwu Xu, Yifan Jiang

TL;DR
This paper investigates whether multimodal large language models can use Miller indices as structured latent representations to reason about fracture geometry in various materials.
Contribution
It introduces a framework for MLLMs to perform physics-aware latent inference and applicability assessment based on crystallographic plane indices.
Findings
MLLMs can reliably infer latent fracture planes in idealized settings.
MLLMs can reject the latent representation when the physics does not support it.
The approach works across synthetic, geometric, and real-world fracture images.
Abstract
We study whether multimodal large language models (MLLMs) can leverage crystallographic plane indices (Miller indices) as a structured latent representation for reasoning about fracture geometry. We formulate Miller indices as a latent variable governing idealized planar fracture and evaluate two complementary capabilities: (i) latent inference, where the model maps visual observations to plane hypotheses under physically valid conditions, and (ii) latent applicability assessment, where the model determines whether such a representation is meaningful for a given fracture image. Through extensive experiments spanning synthetic data, controlled 2D-3D geometric pairs, and real-world fracture images across multiple material classes (including ceramics, glass, metals, and concrete), we show that MLLMs can reliably perform latent inference in idealized settings and,…
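For readers unfamiliar with the latent variable the abstract refers to, the following is a minimal sketch of how Miller indices encode a plane. It is not the authors' pipeline. It shows only the standard textbook construction (reciprocals of the axis intercepts, reduced to the smallest integers) and the angle between two planes under the assumption of a cubic lattice, where the normal to (hkl) is the direction [hkl]. The function names `miller_from_intercepts` and `interplanar_angle_cubic` are hypothetical and chosen for illustration.

```python
from fractions import Fraction
from functools import reduce
from math import acos, degrees, gcd, sqrt


def miller_from_intercepts(intercepts):
    """Return Miller indices (h, k, l) from axis intercepts.

    Intercepts are given in units of the lattice parameters; None means the
    plane is parallel to that axis (intercept at infinity). An intercept of 0
    (plane through the origin) is not handled and raises ZeroDivisionError.
    """
    # Take reciprocals; a parallel axis contributes 0.
    recips = [Fraction(0) if x is None else 1 / Fraction(x) for x in intercepts]
    # Clear denominators with their least common multiple.
    lcm_den = reduce(lambda a, b: a * b // gcd(a, b),
                     (r.denominator for r in recips), 1)
    ints = [int(r * lcm_den) for r in recips]
    # Reduce to the smallest integer triple.
    g = reduce(gcd, (abs(i) for i in ints))
    if g == 0:
        raise ValueError("at least one finite intercept is required")
    return tuple(i // g for i in ints)


def interplanar_angle_cubic(p, q):
    """Angle in degrees between planes p and q, assuming a cubic lattice."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = sqrt(sum(a * a for a in p)) * sqrt(sum(b * b for b in q))
    # Clamp to guard against floating-point drift outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, dot / norm))
    return degrees(acos(cos_theta))


if __name__ == "__main__":
    print(miller_from_intercepts((2, 3, None)))
    print(interplanar_angle_cubic((1, 0, 0), (1, 1, 0)))
```

For non-cubic lattices the plane normal is no longer simply [hkl], so the angle computation would need the metric tensor of the lattice; that case is left out of this sketch.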
