Effect of pseudo datasets for the classification-based engineering design
Xianping Du, Kai Zhang, Onur Bilgen, Laurent Burlion, and Hongyi Xu

TL;DR
This study evaluates how pseudo datasets generated by surrogate models affect classification accuracy in engineering design. Support vector machine, random forest, and neural network classifiers outperform Naive Bayes on pseudo data, and the classifiers differ in how sensitive they are to surrogate uncertainty.
Contribution
The paper investigates the mutual effects of large pseudo datasets and surrogate modeling uncertainty on classification performance in engineering design.
Findings
Large pseudo datasets improve classification accuracy, but the size of the gain depends on the problem and the algorithm.
Support vector machines, random forests, and neural networks outperform Naive Bayes when trained on pseudo data.
Random forest shows high robustness under surrogate uncertainty, while neural networks are more sensitive.
Abstract
Machine learning classification techniques have been used widely to recognize the feasible design domain and discover hidden patterns in engineering design. An accurate classification model needs a large dataset; however, generating a large dataset is costly for complex simulation-based problems. After training by a small dataset, surrogate models can generate a large pseudo dataset efficiently. Errors, however, may be introduced by surrogate modeling. This paper investigates the mutual effect of a large pseudo dataset and surrogate modeling uncertainty. Four widely used methods, i.e., Naive Bayes classifier, support vector machine, random forest regression, and artificial neural network for classification, are studied on four benchmark problems. Kriging is used as the basic surrogate model method. The results show that a large pseudo dataset improves the classification accuracy, which…
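The workflow the abstract describes (fit a surrogate on a small dataset, use it to label a large pseudo dataset, then train a classifier on the pseudo data) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `simulate` function, the unit-disk feasibility constraint, the 5x5 sampling grid, the fixed kernel length scale, and the sample sizes are all hypothetical, and the Kriging-style interpolator uses a constant mean with no hyperparameter fitting. Gaussian Naive Bayes stands in for the four classifiers studied.

```python
import math
import random


def simulate(x):
    """Hypothetical stand-in for an expensive simulation (assumption)."""
    return x[0] ** 2 + x[1] ** 2 - 1.0  # design is feasible when g(x) <= 0


def kernel(a, b, length=1.0):
    """Gaussian correlation between two points; length scale is fixed."""
    d2 = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-d2 / (2.0 * length ** 2))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def fit_kriging(X, y, nugget=1e-10):
    """Kriging-style interpolator: constant mean plus kernel-weighted residuals."""
    mean = sum(y) / len(y)
    K = [[kernel(a, b) + (nugget if i == j else 0.0)
          for j, b in enumerate(X)] for i, a in enumerate(X)]
    weights = solve(K, [yi - mean for yi in y])
    return lambda x: mean + sum(w * kernel(x, xi) for w, xi in zip(weights, X))


def fit_naive_bayes(X, labels):
    """Gaussian Naive Bayes: log prior, per-feature mean and variance per class."""
    model = {}
    for c in set(labels):
        rows = [x for x, lab in zip(X, labels) if lab == c]
        means = [sum(col) / len(rows) for col in zip(*rows)]
        varis = [sum((v - m) ** 2 for v in col) / len(rows) + 1e-9
                 for col, m in zip(zip(*rows), means)]
        model[c] = (math.log(len(rows) / len(X)), means, varis)
    return model


def predict(model, x):
    """Return the class with the highest Gaussian log posterior."""
    def score(c):
        prior, means, varis = model[c]
        return prior - sum(0.5 * math.log(2 * math.pi * v) + (xi - m) ** 2 / (2 * v)
                           for xi, m, v in zip(x, means, varis))
    return max(model, key=score)


rng = random.Random(0)

# Step 1: a small dataset from the (hypothetically expensive) simulation.
grid = [-2.0, -1.0, 0.0, 1.0, 2.0]
X_small = [(a, b) for a in grid for b in grid]
y_small = [simulate(x) for x in X_small]

# Step 2: fit the surrogate and use it to label a large pseudo dataset.
surrogate = fit_kriging(X_small, y_small)
X_pseudo = [(rng.uniform(-2, 2), rng.uniform(-2, 2)) for _ in range(2000)]
pseudo_labels = [surrogate(x) <= 0.0 for x in X_pseudo]

# Step 3: train the classifier on pseudo data only.
classifier = fit_naive_bayes(X_pseudo, pseudo_labels)

# Step 4: score against true labels, which include any surrogate error.
X_test = [(rng.uniform(-2, 2), rng.uniform(-2, 2)) for _ in range(400)]
accuracy = sum(predict(classifier, x) == (simulate(x) <= 0.0)
               for x in X_test) / len(X_test)
print(f"accuracy on true labels: {accuracy:.3f}")
```

Because the test labels come from `simulate` while the training labels come from `surrogate`, the reported accuracy reflects both the classifier's own limitations and any labeling errors introduced by the surrogate, which is the interaction the paper sets out to study.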
Taxonomy
Topics: Advanced Multi-Objective Optimization Algorithms · Optimal Experimental Design Methods · Probabilistic and Robust Engineering Design
