A Mixture of Experts Foundation Model for Scanning Electron Microscopy Image Analysis
Sk Miraj Ahmed, Yuewei Lin, Chuntian Cao, Shinjae Yoo, Xinpei Wu, Won-Il Lee, Nikhil Tiwale, Dan N. Le, Thi Thu Huong Chu, Jiyoung Kim, Kevin G. Yager, Chang-Yong Nam

TL;DR
This paper introduces the first foundation model for SEM images, trained on diverse micrographs, enabling improved generalization and task adaptation in automated microscopy and materials science applications.
Contribution
The paper presents a self-supervised, transformer-based foundation model for SEM images that generalizes across imaging conditions and tasks, including defocus-to-focus image translation.
Findings
The model outperforms state-of-the-art methods in defocus-to-focus translation
Pretrained on diverse micrographs for broad applicability
Enables transfer learning for various SEM tasks
Abstract
Scanning Electron Microscopy (SEM) is indispensable in modern materials science, enabling high-resolution imaging across a wide range of structural, chemical, and functional investigations. However, SEM imaging remains constrained by task-specific models and labor-intensive acquisition processes that limit its scalability across diverse applications. Here, we introduce the first foundation model for SEM images, pretrained on a large corpus of multi-instrument, multi-condition scientific micrographs, enabling generalization across diverse material systems and imaging conditions. Leveraging a self-supervised transformer architecture, our model learns rich and transferable representations that can be fine-tuned or adapted to a wide range of downstream tasks. As a compelling demonstration, we focus on defocus-to-focus image translation, an essential yet underexplored challenge in automated…
