# Improving the Accuracy of Amortized Model Comparison with Self-Consistency

**Authors:** Šimon Kucharský, Aayush Mishra, Daniel Habermann, Stefan T. Radev, Paul-Christian Bürkner

arXiv: 2508.20614 · 2026-05-13

## TL;DR

This paper improves the accuracy of amortized Bayesian model comparison (BMC) under model misspecification by adding a self-consistency loss on unlabeled real data to simulation-based training, and evaluates the approach on one artificial and two real-world case studies.

## Contribution

The paper supplements simulation-based training of four amortized BMC methods with a self-consistency (SC) loss, which improves estimates particularly when the candidate models are misspecified or the data distribution shifts.

## Key findings

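- Self-consistency training on unlabeled real data improves BMC estimates under distribution shifts.
- Classifier-based BMC methods work acceptably well in the closed-world case even without self-consistency training, but they also benefit the least from it.
- In the open-world case (all models misspecified), self-consistency training strongly improves BMC estimators when analytic likelihoods are available or when surrogate likelihoods are locally accurate near the true parameter posterior.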

## Abstract

Amortized Bayesian model comparison (BMC) enables fast probabilistic ranking of models via simulation-based training of neural surrogates. However, the accuracy of neural surrogates deteriorates when simulation models are misspecified; the very case where model comparison is most needed. We evaluate four different amortized BMC methods. We supplement traditional simulation-based training of these methods with a *self-consistency* (SC) loss on unlabeled real data to improve BMC estimates under distribution shifts. Using one artificial and two real-world case studies, we compare amortized BMC estimators with and without SC against analytic or bridge sampling benchmarks. In the *closed-world* case (data is generated by one of the candidate models), BMC estimators using classifiers work acceptably well even without SC training. However, these methods also benefit the least from SC training. In the *open-world* scenario (all models misspecified), SC training strongly improves BMC estimators when having access to analytic likelihoods, or when surrogate likelihoods are locally accurate near the true parameter posterior, even for severely misspecified models. We conclude with practical recommendations for amortized BMC and suggestions for future research.
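To make the idea of a self-consistency loss concrete, here is a minimal sketch. It assumes the variance-based form that is common in the self-consistency literature; the abstract does not spell out the loss, so the paper's exact formulation may differ. Bayes' rule implies that log p(y) = log p(θ) + log p(y | θ) − log p(θ | y) takes the same value for every θ, so the variance of this quantity across parameter draws measures how inconsistent an approximate posterior is on a given data set, without needing labels. The toy conjugate Gaussian model and the function names (`log_marginal_estimates`, `self_consistency_loss`, `posterior_model_probs`) are hypothetical choices for illustration.

```python
import numpy as np


def log_normal(x, mean, std):
    """Elementwise log density of a normal distribution N(mean, std^2)."""
    return -0.5 * np.log(2 * np.pi) - np.log(std) - 0.5 * ((x - mean) / std) ** 2


def log_marginal_estimates(y, theta, q_mean, q_std):
    """One log p(y) estimate per parameter draw, from Bayes' rule rearranged.

    Toy model (an assumption for illustration): theta ~ N(0, 1) and
    y_i | theta ~ N(theta, 1). The approximate posterior is N(q_mean, q_std^2).
    """
    log_prior = log_normal(theta, 0.0, 1.0)
    log_lik = log_normal(y[None, :], theta[:, None], 1.0).sum(axis=1)
    log_post = log_normal(theta, q_mean, q_std)
    return log_prior + log_lik - log_post


def self_consistency_loss(y, theta, q_mean, q_std):
    """Variance of the log marginal likelihood estimates across draws."""
    return np.var(log_marginal_estimates(y, theta, q_mean, q_std))


def posterior_model_probs(log_marginals, log_prior_probs=None):
    """Posterior model probabilities from log marginal likelihoods (softmax)."""
    z = np.asarray(log_marginals, dtype=float)
    if log_prior_probs is not None:
        z = z + np.asarray(log_prior_probs, dtype=float)
    w = np.exp(z - z.max())  # subtract the maximum for numerical stability
    return w / w.sum()


rng = np.random.default_rng(0)
# "Real" data with a mean far from the prior, standing in for a distribution shift.
y = rng.normal(3.0, 1.0, size=20)
n = y.size

# Exact conjugate posterior for the toy model.
exact_mean = y.sum() / (n + 1)
exact_std = np.sqrt(1.0 / (n + 1))

theta = rng.normal(exact_mean, exact_std, size=256)
loss_exact = self_consistency_loss(y, theta, exact_mean, exact_std)
loss_biased = self_consistency_loss(y, theta, exact_mean + 0.5, exact_std)
probs = posterior_model_probs([-31.2, -33.0])  # made-up log marginals, equal priors
```

With the exact posterior the loss is zero up to floating-point error, while the shifted posterior yields a clearly positive loss. In SC training, a term of this kind would be added to the simulation-based objective and minimized over the network parameters that produce the approximate posterior. The last helper shows how log marginal likelihoods for several candidate models translate into the probabilistic ranking that BMC reports.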

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2508.20614/full.md

## Figures

14 figures with captions in the complete paper: https://tomesphere.com/paper/2508.20614/full.md

## References

41 references — full list in the complete paper: https://tomesphere.com/paper/2508.20614/full.md

---
Source: https://tomesphere.com/paper/2508.20614