TL;DR
This paper presents an automated pipeline, built on a foundation model, for estimating grain size from microscopy images. It integrates the ASTM E112 standard and reports high few-shot accuracy.
Contribution
It introduces a pipeline that adapts Cellpose-SAM to microstructure images and reports higher accuracy and robustness in automated grain size estimation than U-Net, MatSAM, and Qwen2.5-VL-7B.
Findings
Achieves a mean absolute percentage error (MAPE) as low as 1.50% with only two training samples.
Outperforms classical U-Net, MatSAM, and Qwen2.5-VL-7B in benchmarks.
Maintains topological separation and robustness across varying grain counts.
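For readers unfamiliar with the metric, the MAPE figure above can be read through the standard definition: the mean of |true - predicted| / |true|, expressed as a percentage. The sketch below is a minimal illustration of that definition only. The function name `mape` and the sample values are hypothetical, and the summary does not say which quantity (for example grain count or grain size number) the paper's MAPE is computed over.

```python
def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent.

    y_true and y_pred are equal-length sequences of numbers.
    Reference values must be non-zero, since each error is
    normalized by its reference value.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(t == 0 for t in y_true):
        raise ValueError("reference values must be non-zero")
    total = sum(abs((t - p) / t) for t, p in zip(y_true, y_pred))
    return 100.0 * total / len(y_true)


# Hypothetical grain counts for two images (illustrative values only).
reference_counts = [100, 200]
predicted_counts = [98, 203]
error_percent = mape(reference_counts, predicted_counts)  # about 1.75
```

Because each error is divided by its reference value, images with few grains weigh as much as images with many, which is why the metric is sensitive to miscounts on sparse fields.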
Abstract
Extracting standardized metallurgical metrics from microscopy images remains challenging due to complex grain morphology and the data demands of supervised segmentation. To bridge foundational computer vision with practical metallurgical evaluation, we propose an automated pipeline for dense instance segmentation and grain size estimation that adapts Cellpose-SAM to microstructures and integrates its topology-aware gradient tracking with an ASTM E112 Jeffries planimetric module. We systematically benchmark this pipeline against a classical convolutional network (U-Net), an adaptive-prompting vision foundation model (MatSAM) and a contemporary vision-language model (Qwen2.5-VL-7B). Our evaluations reveal that while the out-of-the-box vision-language model struggles with the localized spatial reasoning required for dense microscopic counting and MatSAM suffers from over-segmentation…
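The Jeffries planimetric procedure named in the abstract can be sketched as follows, assuming the standard ASTM E112 relations: grains fully inside the test region count as one, grains cut by its boundary count as one half, the Jeffries multiplier is f = M^2 / A, and the grain size number is G = 3.321928 log10(N_A) - 2.954 with N_A in grains per mm^2 at 1X. This is not the paper's implementation, whose details are not given here. The function name, argument names, and example counts are hypothetical.

```python
import math


def jeffries_grain_size(n_inside, n_intercepted, test_area_mm2, magnification=1.0):
    """Estimate grains per mm^2 (N_A) and ASTM grain size number (G).

    n_inside:       grains lying entirely within the test region
    n_intercepted:  grains cut by the test region boundary
    test_area_mm2:  test region area in mm^2, measured at the given magnification
    magnification:  linear magnification M of the image (1.0 if the area
                    is already expressed in true specimen units)
    """
    if test_area_mm2 <= 0 or magnification <= 0:
        raise ValueError("area and magnification must be positive")
    # Boundary grains contribute one half each.
    n_equivalent = n_inside + 0.5 * n_intercepted
    if n_equivalent <= 0:
        raise ValueError("at least one grain is required")
    # Jeffries multiplier converts the count to grains per mm^2 at 1X.
    f = magnification ** 2 / test_area_mm2
    n_a = f * n_equivalent
    g = 3.321928 * math.log10(n_a) - 2.954
    return n_a, g


# Hypothetical example: 40 interior grains and 20 boundary grains in a
# 5000 mm^2 test region viewed at 100X.
n_a, g = jeffries_grain_size(40, 20, 5000.0, magnification=100.0)
# n_a is 100 grains per mm^2; g is about 3.69
```

In a segmentation-based pipeline, the two counts would presumably come from checking which predicted grain instances touch the border of the test region, so merged or over-split instances feed directly into the error in G.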
