Robust Amortized Bayesian Inference with Self-Consistency Losses on Unlabeled Data

Aayush Mishra; Daniel Habermann; Marvin Schmitt; Stefan T. Radev; Paul-Christian Bürkner

arXiv:2501.13483·stat.ML·March 4, 2026

Open Access · 3 Reviews

TL;DR

This paper introduces a semi-supervised Bayesian inference method that uses unlabeled data and self-consistency losses to improve robustness and accuracy, especially on out-of-distribution observations.

Contribution

It proposes a novel semi-supervised approach leveraging self-consistency losses for robust amortized Bayesian inference on unlabeled and real-world data.

Findings

01. Significantly improves robustness of Bayesian inference on out-of-distribution data

02. Maintains high accuracy on real-world high-dimensional time-series and image data

03. Outperforms traditional methods in safety-critical applications

Abstract

Amortized Bayesian inference (ABI) with neural networks can solve probabilistic inverse problems orders of magnitude faster than classical methods. However, ABI is not yet sufficiently robust for widespread and safe application. When performing inference on observations outside the scope of the simulated training data, posterior approximations are likely to become highly biased, which cannot be corrected by additional simulations due to the bad pre-asymptotic behavior of current neural posterior estimators. In this paper, we propose a semi-supervised approach that enables training not only on labeled simulated data generated from the model, but also on unlabeled data originating from any source, including real data. To achieve this, we leverage Bayesian self-consistency properties that can be transformed into strictly proper losses that do not require knowledge of ground-truth…
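To make the self-consistency idea in the abstract concrete, here is a minimal sketch. It relies on Bayes' rule, which implies that log p(θ) + log p(y | θ) − log p(θ | y) equals log p(y) for every θ, so the variance of that quantity across parameter draws is zero when the exact posterior is plugged in, and no ground-truth parameter for y is needed. The toy conjugate Gaussian model, the function names, and the use of the empirical variance as the loss are illustrative assumptions, not taken from the paper; the paper's exact loss may differ.

```python
import numpy as np

def log_normal_pdf(x, mean, var):
    """Log density of a univariate normal N(mean, var) evaluated at x."""
    return -0.5 * (np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var)

def self_consistency_loss(y, post_mean, post_var, n_samples=64, seed=0):
    """Variance of log-marginal estimates under a Gaussian posterior guess.

    Assumed toy model (hypothetical, for illustration only):
        theta ~ N(0, 1),   y | theta ~ N(theta, 1)
    The candidate posterior q(theta | y) is N(post_mean, post_var).
    """
    rng = np.random.default_rng(seed)
    # Draw parameters from the candidate posterior; no true theta is used.
    theta = rng.normal(post_mean, np.sqrt(post_var), size=n_samples)
    log_prior = log_normal_pdf(theta, 0.0, 1.0)
    log_lik = log_normal_pdf(y, theta, 1.0)
    log_q = log_normal_pdf(theta, post_mean, post_var)
    # Each entry estimates log p(y); a self-consistent q makes them all equal.
    log_marginal = log_prior + log_lik - log_q
    return np.var(log_marginal)

y_obs = 3.0  # an "unlabeled" observation: its generating theta is unknown

# The exact posterior for this toy model is N(y / 2, 1 / 2).
exact = self_consistency_loss(y_obs, post_mean=y_obs / 2.0, post_var=0.5)

# A biased posterior guess that ignores the observation.
biased = self_consistency_loss(y_obs, post_mean=0.0, post_var=0.5)

print(f"exact posterior loss:  {exact:.3e}")   # zero up to floating-point error
print(f"biased posterior loss: {biased:.3e}")  # clearly positive
```

In this sketch the loss only evaluates the prior, the likelihood, and the candidate posterior at the observation, which is why such a term can in principle be computed on real data as well as on simulations.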

Peer Reviews

Decision·ICLR 2026 Poster

Reviewer 01 · Rating 8 · Confidence 3

Strengths

The method's independence from true parameter values is its most striking feature. The method is theoretically justified. The numerical evidence, from both simulations and real-data applications, is strong and convincing.

Weaknesses

The conditions of Propositions 2 and 3 are not clear to me (see question below). Figure 4(b) is somewhat misleading. Overall, these are minor defects.

Reviewer 02 · Rating 2 · Confidence 4

Strengths

- The writing and proposed methodology are clear and easy to understand.
- The proposed regularizer is intuitive: since the true posterior satisfies the self-consistency property, it is natural to require the variational posterior to satisfy it as well (approximately).
- The problem setting is a significant one, as distribution shift in simulation-based inference (SBI) problems continues to be a robust area of research.
- The authors take care to formalize their result more carefully with resp

Weaknesses

- The method is combinatorial: the NPE (score-based) objective for SBI is well-established, and Schmitt et al. (2024) introduced the self-consistency loss.
- I don’t find the degree of novelty to be adequate enough to differentiate the work from Schmitt et al. In the case where the simulation model is correct, the proposed method is exactly identical to Schmitt et al. Thus, the authors’ main contribution, to me, seems to be applying the method to the setting of a misspecified simulator. While th

Reviewer 03 · Rating 6 · Confidence 3

Strengths

Improving the robustness of SBI to data distribution shifts is an important problem.

Weaknesses

- The abstract uses "bad pre-asymptotic behavior" without definition, making the core problem inaccessible.
- When would practitioners have unlabeled real data but not be able to generate more simulations?
- The paper claims "high-dimensional" capability but experiments max out at 100 parameters with significant performance degradation (Figure 2a shows MMD increasing substantially). The MNIST example (784D) is modest by modern standards. Please scale back claims.
- Head-to-head comparisons on

Taxonomy

Topics: Bayesian Methods and Mixture Models · Statistical Methods and Inference · Bayesian Modeling and Causal Inference