Distributional encoding for Gaussian process regression with qualitative inputs
Sébastien Da Veiga (ENSAI, CREST, RT-UQ)

TL;DR
This paper introduces a distributional encoding method for Gaussian process regression with categorical inputs, improving predictive accuracy and computational efficiency, especially in Bayesian optimization tasks involving mixed variable types.
Contribution
The paper proposes a novel distributional encoding approach for categorical variables in Gaussian processes, utilizing characteristic kernels based on maximum mean discrepancy and Wasserstein distance.
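To make the Wasserstein ingredient concrete, here is a minimal sketch of a kernel between two one-dimensional samples built from the 2-Wasserstein distance. It is an illustration under stated assumptions, not the paper's implementation: the function names `wasserstein2_1d` and `wasserstein_kernel`, the scale parameter `gamma`, and the quantile grid size `n_quantiles` are all hypothetical. It relies on the standard fact that, in one dimension, the 2-Wasserstein distance equals the L2 distance between quantile functions.

```python
import numpy as np

def wasserstein2_1d(a, b, n_quantiles=100):
    """Approximate 2-Wasserstein distance between two 1-D samples.

    In 1-D, W2 is the L2 distance between quantile functions; here both
    empirical quantile functions are evaluated on a common grid (assumption).
    """
    q = (np.arange(n_quantiles) + 0.5) / n_quantiles  # midpoints of (0, 1)
    qa = np.quantile(a, q)
    qb = np.quantile(b, q)
    return float(np.sqrt(np.mean((qa - qb) ** 2)))

def wasserstein_kernel(a, b, gamma=1.0):
    """Squared-exponential kernel on the W2 distance (gamma is hypothetical)."""
    return float(np.exp(-gamma * wasserstein2_1d(a, b) ** 2))

# Illustrative data: a sample and a copy shifted by 3.
rng = np.random.default_rng(0)
x = rng.normal(size=50)
w_same = wasserstein2_1d(x, x)          # identical samples
w_shift = wasserstein2_1d(x, x + 3.0)   # pure location shift
k_shift = wasserstein_kernel(x, x + 3.0)
```

A pure shift moves every quantile by the same amount, so the distance between `x` and `x + 3.0` is the shift itself, which makes a convenient sanity check.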
Findings
Achieves state-of-the-art predictive performance on synthetic datasets
Enhances Gaussian process regression with categorical inputs
Complementary to recent Bayesian optimization methods
Abstract
Gaussian Process (GP) regression is a popular and sample-efficient approach for many engineering applications where observations are expensive to acquire, and is also a central ingredient of Bayesian optimization (BO), a widely used method for the optimization of black-box functions. However, when all or some input variables are categorical, building a predictive and computationally efficient GP remains challenging. Starting from the naive target encoding idea, where the original categorical values are replaced with the mean of the target variable for that category, we propose a generalization based on distributional encoding (DE) which makes use of all samples of the target variable for a category. To handle this type of encoding inside the GP, we build upon recent results on characteristic kernels for probability distributions, based on the maximum mean discrepancy and the…
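The difference between target encoding and distributional encoding described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the helper names (`target_encode`, `distributional_encode`, `mmd2`, `mmd_kernel`), the Gaussian base kernel, and the parameters `gamma` and `lengthscale` are assumptions, and the data are made up. The squared MMD is computed with the standard biased (V-statistic) estimator.

```python
import numpy as np

def target_encode(categories, y):
    """Replace each category level by the mean of its target values."""
    return {str(c): float(np.mean(y[categories == c])) for c in np.unique(categories)}

def distributional_encode(categories, y):
    """Keep all target samples of each level (an empirical distribution)."""
    return {str(c): y[categories == c] for c in np.unique(categories)}

def mmd2(a, b, lengthscale=1.0):
    """Biased estimate of squared MMD between 1-D samples (Gaussian base kernel)."""
    def gram(u, v):
        d = u[:, None] - v[None, :]
        return np.exp(-0.5 * (d / lengthscale) ** 2)
    return gram(a, a).mean() + gram(b, b).mean() - 2.0 * gram(a, b).mean()

def mmd_kernel(enc, levels, gamma=1.0, lengthscale=1.0):
    """Kernel matrix k(c, c') = exp(-gamma * MMD^2(P_c, P_c')) over levels."""
    n = len(levels)
    K = np.empty((n, n))
    for i, ci in enumerate(levels):
        for j, cj in enumerate(levels):
            K[i, j] = np.exp(-gamma * mmd2(enc[ci], enc[cj], lengthscale))
    return K

# Made-up data: levels "a" and "b" share a mean but differ in spread.
categories = np.array(["a", "a", "a", "b", "b", "b", "c", "c", "c"])
y = np.array([1.0, 2.0, 3.0, 1.5, 2.0, 2.5, 10.0, 11.0, 12.0])

te = target_encode(categories, y)          # one number per level
de = distributional_encode(categories, y)  # all samples per level
levels = ["a", "b", "c"]
K = mmd_kernel(de, levels)                 # similarity between levels
```

In this toy example, target encoding maps "a" and "b" to the same value because their means coincide, whereas the distributional encoding still separates them: the kernel entry between "a" and "b" is below 1, and it is larger than the entry between "a" and "c", whose samples lie far apart.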
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Advanced Multi-Objective Optimization Algorithms
