Bayesian Surrogate Training on Multiple Data Sources: A Hybrid Modeling Strategy

Philipp Reiser; Paul-Christian Bürkner; Anneli Guthke

arXiv:2412.11875·stat.ML·May 13, 2026

TL;DR

This paper introduces two probabilistic hybrid modeling strategies to enhance surrogate models by integrating simulation and real-world data, improving accuracy and diagnosing model issues.

Contribution

The paper proposes two novel methods for combining simulation and measurement data in surrogate training, including a new weighting strategy independent of surrogate type.

Findings

01. Hybrid approaches improve predictive accuracy and coverage.

02. The methods help diagnose simulation model problems.

03. Synthetic and real-world case studies validate the approaches.

Abstract

Surrogate models are often used as computationally efficient approximations to complex simulation models, enabling tasks such as solving inverse problems, sensitivity analysis, and probabilistic forward predictions, which would otherwise be computationally infeasible. During training, surrogate parameters are fitted such that the surrogate reproduces the simulation model's outputs as closely as possible. However, the simulation model itself is merely a simplification of the real-world system, often missing relevant processes or suffering from misspecifications, e.g., in inputs or boundary conditions. Hints about these might be captured in real-world measurement data, and yet we typically ignore those hints during surrogate building. In this paper, we propose two novel probabilistic approaches to integrate simulation data and real-world measurement data during surrogate training. The…
