Learning to Trust AI and Data-driven models in Data Assimilation through a Multifidelity Ensemble Gaussian Mixture Filter Framework
Andrey A. Popov

TL;DR
This paper introduces a multifidelity ensemble Gaussian mixture filter that adaptively combines theory-driven and data-driven models to improve trustworthiness and accuracy in high-dimensional data assimilation tasks.
Contribution
The paper proposes a framework that treats the bandwidth scaling factors of the kernel density estimates as a measure of relative trust between the theory-driven and data-driven models, and computes those factors adaptively to make data assimilation more reliable.
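To make the role of a bandwidth scaling factor concrete, here is a minimal sketch of a Gaussian-kernel density estimate built from an ensemble. It is not the paper's algorithm: the function names, the `trust_scale` parameter, and the use of Silverman's rule of thumb as the baseline bandwidth are assumptions chosen for illustration, and the adaptive computation of the factors is not shown. Under this reading, a larger scale widens every kernel, which spreads probability mass away from the individual samples and so expresses less trust in them.

```python
import numpy as np


def silverman_factor(n_samples, dim):
    """Squared bandwidth from Silverman's rule of thumb (assumed baseline)."""
    return (4.0 / (n_samples * (dim + 2.0))) ** (2.0 / (dim + 4.0))


def kernel_covariance(ensemble, trust_scale=1.0):
    """Covariance shared by all kernels; ensemble has shape (n_samples, dim).

    trust_scale is a hypothetical knob: values above 1 widen the kernels.
    """
    n_samples, dim = ensemble.shape
    sample_cov = np.atleast_2d(np.cov(ensemble, rowvar=False))
    return trust_scale * silverman_factor(n_samples, dim) * sample_cov


def mixture_density(x, ensemble, kernel_cov):
    """Evaluate the equally weighted Gaussian mixture at a single point x."""
    n_samples, dim = ensemble.shape
    diff = ensemble - x  # offset of x from each kernel mean
    solved = np.linalg.solve(kernel_cov, diff.T).T
    maha = np.sum(diff * solved, axis=1)  # squared Mahalanobis distances
    _, logdet = np.linalg.slogdet(kernel_cov)
    log_norm = -0.5 * (dim * np.log(2.0 * np.pi) + logdet)
    return float(np.mean(np.exp(log_norm - 0.5 * maha)))
```

Evaluating `mixture_density` at a point far from every sample with `trust_scale=4.0` gives a larger value than with `trust_scale=1.0`, which is the sense in which the scale controls how tightly the estimate clings to the ensemble.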
Findings
Validated on a static banana problem and the Lorenz '96 equations.
Demonstrated high-dimensional convergence in undersampled regimes.
Showed adaptive trust improves data assimilation accuracy.
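For readers unfamiliar with the Lorenz '96 test problem named above, the sketch below gives its standard right-hand side, dx_i/dt = (x_{i+1} - x_{i-2}) x_{i-1} - x_i + F with cyclic indices, together with one classical Runge-Kutta step. The forcing value F = 8 and the function names are assumptions for illustration; the summary does not state the configuration used in the paper.

```python
import numpy as np


def lorenz96_rhs(x, forcing=8.0):
    """Lorenz '96 tendency with periodic (cyclic) indexing.

    np.roll(x, -1)[i] is x[i+1], np.roll(x, 1)[i] is x[i-1],
    and np.roll(x, 2)[i] is x[i-2].
    """
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + forcing


def rk4_step(x, dt, forcing=8.0):
    """Advance the state by one fourth-order Runge-Kutta step of size dt."""
    k1 = lorenz96_rhs(x, forcing)
    k2 = lorenz96_rhs(x + 0.5 * dt * k1, forcing)
    k3 = lorenz96_rhs(x + 0.5 * dt * k2, forcing)
    k4 = lorenz96_rhs(x + dt * k3, forcing)
    return x + (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
```

A uniform state with every component equal to F is an equilibrium of this system, which makes a convenient sanity check before perturbing the state to generate chaotic trajectories.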
Abstract
AI and data-driven models have great potential in data assimilation because they produce fast and accurate forecasts. Their tendency to produce spurious, inaccurate, nonphysical results (hallucination), however, raises serious questions about their long-term use, so they can be categorized as untrustworthy methods. Theory-driven methods, on the other hand, are slow but stay physically realistic because of their mathematical underpinning, so they can be categorized as trustworthy methods. We argue that by using these methods in tandem, it is possible to build a relative measure of trust between the theory-driven and data-driven methods that results in a combined trustworthy methodology. We argue, and then show, that the bandwidth scaling factors in the kernel density estimates can be used to represent our trust in the theory-driven and data-driven models. We provide…
