Bayesian Nonparametric Classification for Incomplete Data With a High Missing Rate: an Application to Semiconductor Manufacturing Data
Sewon Park, Kyeongwon Lee, Da-Eun Jeong, Heung-Kook Ko, and Jaeyong Lee

TL;DR
This paper introduces a Dirichlet process-naive Bayes model tailored for classifying semiconductor manufacturing data with high missing rates and complex distributions, improving prediction accuracy over existing methods.
Contribution
The paper presents a novel Bayesian nonparametric classification method capable of handling high missing rates and non-normal data distributions in manufacturing datasets.
Findings
DPNB outperforms MICE and MissForest in missing value prediction.
Effective handling of highly non-normal data distributions.
Robust performance with increasing missing data percentages.
Abstract
Predicting yield is an important problem in semiconductor manufacturing. Detecting defective products early in the manufacturing process can save substantial production costs. Data generated from the semiconductor manufacturing process are highly non-normal and have complicated missing patterns and high missing rates, all of which complicate yield prediction. We propose the Dirichlet process-naive Bayes model (DPNB), a classification method based on mixtures of Dirichlet processes and the naive Bayes model. Because the DPNB is based on mixtures of Dirichlet processes and learns the joint distribution of all variables involved, it can handle highly non-normal data and can make predictions for a test dataset with any missing pattern. The DPNB also performs well for high missing rates since it uses all…
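The abstract's claim that a joint mixture model can predict under any missing pattern can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a finite mixture with already-fitted parameters standing in for the Dirichlet process posterior, and Gaussian per-feature kernels. The function names and the arrays `weights`, `class_probs`, `means`, and `stds` are hypothetical. Because features are independent within a component, a missing feature is simply left out of the product, which marginalizes it exactly.

```python
import numpy as np

def log_component_density(x, means, stds):
    """Log density of the observed entries of x under each component.

    x: (d,) array, np.nan marks a missing feature.
    means, stds: (K, d) per-component Gaussian parameters (assumed kernels).
    Returns a (K,) array; missing features contribute nothing.
    """
    observed = ~np.isnan(x)
    z = (x[observed] - means[:, observed]) / stds[:, observed]
    log_pdf = -0.5 * z**2 - np.log(stds[:, observed]) - 0.5 * np.log(2 * np.pi)
    return log_pdf.sum(axis=1)

def predict_proba(x, weights, class_probs, means, stds):
    """P(y | observed part of x) for a mixture of naive Bayes components.

    weights: (K,) mixture weights.
    class_probs: (K, C) with class_probs[k, c] = P(y = c | component k).
    """
    log_joint = np.log(weights) + log_component_density(x, means, stds)
    resp = np.exp(log_joint - log_joint.max())  # stable normalization
    resp /= resp.sum()                          # P(component | observed x)
    return resp @ class_probs

# Hypothetical two-component, two-feature, two-class parameters.
weights = np.array([0.5, 0.5])
class_probs = np.array([[0.9, 0.1], [0.2, 0.8]])
means = np.array([[0.0, 0.0], [5.0, 5.0]])
stds = np.ones((2, 2))

p_partial = predict_proba(np.array([0.0, np.nan]), weights, class_probs, means, stds)
p_all_missing = predict_proba(np.array([np.nan, np.nan]), weights, class_probs, means, stds)
```

When every feature is missing, the prediction falls back to the mixture's marginal class probabilities, so no imputation step is needed for any missing pattern.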
Taxonomy
Topics: Industrial Vision Systems and Defect Detection
