Data-driven probability concentration and sampling on manifold
Christian Soize (MSME), Roger Ghanem

TL;DR
This paper proposes a data-driven methodology that combines kernel density estimation, MCMC sampling, diffusion maps, and reduced-order modeling to generate realizations of a random vector that are statistically consistent with a data set and whose probability distribution is concentrated on an unknown manifold. The paper also analyzes convergence and reports robustness to noise and data complexity.
Contribution
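The paper presents an integrated approach to probabilistic modeling on unknown manifolds: kernel density estimation and MCMC sampling provide the probabilistic model and its realizations, while diffusion maps characterize the geometry of the data set. The intended use is uncertainty quantification and stochastic modeling.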
Findings
Method is robust to noise and data complexity.
Effective in capturing data geometry and structure.
Validated through three complex numerical applications.
Abstract
A new methodology is proposed for generating realizations of a random vector with values in a finite-dimensional Euclidean space that are statistically consistent with a data set of observations of this vector. The probability distribution of this random vector, while not known a priori, is presumed to be concentrated on an unknown subset of the Euclidean space. A random matrix is introduced whose columns are independent copies of the random vector and for which the number of columns is the number of data points in the data set. The approach is based on the use of (i) the multidimensional kernel-density estimation method for estimating the probability distribution of the random matrix, (ii) an MCMC method for generating realizations for the random matrix, (iii) the diffusion-maps approach for discovering and characterizing the geometry and the structure of the data set, and (iv) a…
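Step (iii) of the abstract, the diffusion-maps analysis of the data set, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `diffusion_maps_basis`, the Gaussian kernel scaling `exp(-d^2 / (4 * epsilon))`, and the parameters `epsilon` and `n_vectors` are assumptions made for illustration. The sketch builds a row-normalized transition matrix from a Gaussian kernel on the data points and returns its leading eigenvalues and right eigenvectors.

```python
import numpy as np

def diffusion_maps_basis(X, epsilon, n_vectors):
    """Leading eigenpairs of a diffusion-maps transition matrix.

    X         : (N, n) array, one data point per row (assumed layout).
    epsilon   : kernel smoothing parameter (hypothetical name and scaling).
    n_vectors : number of leading eigenpairs to return.
    """
    # Pairwise squared Euclidean distances between data points.
    sq = np.sum(X**2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)

    # Gaussian kernel matrix and its row sums.
    K = np.exp(-d2 / (4.0 * epsilon))
    b = K.sum(axis=1)

    # The transition matrix P = diag(b)^-1 K is not symmetric, so we
    # diagonalize its symmetric conjugate S = diag(b)^-1/2 K diag(b)^-1/2,
    # which has the same eigenvalues.
    S = K / np.sqrt(np.outer(b, b))
    w, v = np.linalg.eigh(S)

    # Sort eigenvalues in decreasing order and keep the leading ones.
    order = np.argsort(w)[::-1][:n_vectors]
    lam = w[order]

    # Map eigenvectors of S back to right eigenvectors of P.
    psi = v[:, order] / np.sqrt(b)[:, None]
    return lam, psi
```

For a connected data set the largest eigenvalue is 1 and its eigenvector is constant; the subsequent eigenvectors carry the geometric information about the data. How these vectors enter the remaining steps is described in the paper itself.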
