WaferSAGE: Large Language Model-Powered Wafer Defect Analysis via Synthetic Data Generation and Rubric-Guided Reinforcement Learning
Ke Xu, Zhongyuan Lian

TL;DR
WaferSAGE combines synthetic data generation, structured evaluation rubrics, and reinforcement learning to train small vision-language models for wafer defect visual question answering; its 4B-parameter model reaches an LLM-Judge score of 6.493.
Contribution
The paper presents a new synthesis pipeline and reinforcement learning approach enabling small models to outperform larger proprietary models in wafer defect visual question answering.
Findings
Achieved a 6.493 LLM-Judge score with a 4B-parameter model.
Synthetic data generation improves defect analysis accuracy.
Small domain-specific models can surpass large proprietary models.
Abstract
We present WaferSAGE, a framework for wafer defect visual question answering using small vision-language models. To address data scarcity in semiconductor manufacturing, we propose a three-stage synthesis pipeline incorporating structured rubric generation for precise evaluation. Starting from limited labeled wafer maps, we employ clustering-based cleaning to filter label noise, then use vision-language models to generate comprehensive defect descriptions, which are converted into structured evaluation rubrics. These rubrics guide the synthesis of VQA pairs, ensuring coverage across defect type identification, spatial distribution, morphology, and root cause analysis. Our dual assessment framework aligns rule-based metrics with LLM-Judge scores via Bayesian optimization, enabling reliable automated evaluation. Through curriculum-based reinforcement learning with Group Sequence…
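The abstract does not say which clustering algorithm or features the cleaning stage uses. As a rough illustration of the general idea only, the sketch below assumes plain k-means over hypothetical 2-D wafer-map embeddings and keeps a sample only when its label matches the majority label of its cluster. The function names, the `min_purity` threshold, the farthest-first initialization, and the class names "center" and "edge" are all assumptions made for this example, not details from the paper.

```python
from collections import Counter
import math


def kmeans(points, k, iters=20):
    """Plain k-means with deterministic farthest-first initialization.

    Returns a cluster index for each point. (Illustrative assumption:
    the paper does not state which clustering method it uses.)
    """
    centroids = [points[0]]
    while len(centroids) < k:
        # Next seed is the point farthest from all centroids chosen so far.
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    assign = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest centroid.
        assign = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Recompute centroids; keep the old one if a cluster is empty.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return assign


def clean_labels(points, labels, k, min_purity=0.6):
    """Return sorted indices whose label agrees with their cluster's majority.

    Clusters whose majority label covers less than `min_purity` of the
    members are dropped entirely (hypothetical threshold).
    """
    assign = kmeans(points, k)
    keep = []
    for c in range(k):
        idx = [i for i, a in enumerate(assign) if a == c]
        if not idx:
            continue
        majority, count = Counter(labels[i] for i in idx).most_common(1)[0]
        if count / len(idx) < min_purity:
            continue  # cluster too mixed to trust any label in it
        keep.extend(i for i in idx if labels[i] == majority)
    return sorted(keep)


# Hypothetical embeddings: two tight groups, with sample 3 mislabeled.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
labels = ["center", "center", "center", "edge", "edge", "edge", "edge", "edge"]
print(clean_labels(points, labels, k=2))  # [0, 1, 2, 4, 5, 6, 7]
```

In this toy run, sample 3 sits in the "center" cluster but carries the label "edge", so it is filtered out. A real pipeline would presumably cluster learned image features rather than hand-written coordinates.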
