MCMC to address model misspecification in Deep Learning classification of Radio Galaxies
Devina Mohan, Anna Scaife

TL;DR
This paper uses MCMC sampling to study model misspecification in Bayesian neural networks applied to radio galaxy classification, highlighting the limitations of Gaussian variational approximations and examining the cold posterior effect.
Contribution
Using MCMC sampling, it shows that Gaussian approximations to the posterior are inadequate and investigates the causes of the cold posterior effect in Bayesian neural networks for radio galaxy classification.
Findings
The Gaussian variational family is a poor approximation to the true posterior
MCMC sampling exposes the causes of the cold posterior effect
Addressing model misspecification improves uncertainty estimates in classification
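
The cold posterior effect named in the findings refers to sampling from a tempered posterior proportional to p(θ | D)^(1/T) with temperature T < 1, rather than from the Bayes posterior at T = 1. The following is a minimal sketch of that idea on a toy problem, not the paper's model or sampler: it assumes a one-parameter Gaussian-mean model and a plain random-walk Metropolis sampler, and the names `log_posterior` and `metropolis` are hypothetical, introduced only for illustration.

```python
import numpy as np

def log_posterior(theta, data, prior_std=10.0, noise_std=1.0):
    """Unnormalised log posterior for the mean of a Gaussian (toy model)."""
    log_prior = -0.5 * (theta / prior_std) ** 2
    log_lik = -0.5 * np.sum((data - theta) ** 2) / noise_std ** 2
    return log_prior + log_lik

def metropolis(log_prob, n_steps, step_size, temperature, rng, init=0.0):
    """Random-walk Metropolis targeting exp(log_prob(theta) / temperature)."""
    theta = init
    current = log_prob(theta) / temperature
    samples = np.empty(n_steps)
    for i in range(n_steps):
        proposal = theta + step_size * rng.normal()
        proposed = log_prob(proposal) / temperature
        # Accept with probability min(1, exp(proposed - current)).
        if np.log(rng.uniform()) < proposed - current:
            theta, current = proposal, proposed
        samples[i] = theta
    return samples

rng = np.random.default_rng(0)
data = rng.normal(loc=1.0, scale=1.0, size=20)  # synthetic data, assumption

def target(theta):
    return log_posterior(theta, data)

# Discard the first 5000 steps of each chain as burn-in.
warm = metropolis(target, 20000, 0.5, 1.0, rng)[5000:]  # Bayes posterior, T = 1
cold = metropolis(target, 20000, 0.5, 0.1, rng)[5000:]  # cold posterior, T = 0.1

print("posterior std at T=1.0:", warm.std())
print("posterior std at T=0.1:", cold.std())
```

For this Gaussian toy model, tempering with T < 1 shrinks the posterior standard deviation by a factor of sqrt(T), so the cold chain is visibly narrower than the T = 1 chain. In a neural network classifier the posterior is not Gaussian and the effect of T on predictive performance has to be measured empirically.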
Abstract
The radio astronomy community is adopting deep learning techniques to deal with the huge data volumes expected from the next generation of radio observatories. Bayesian neural networks (BNNs) provide a principled way to model uncertainty in the predictions made by deep learning models and will play an important role in extracting well-calibrated uncertainty estimates from the outputs of these models. However, most commonly used approximate Bayesian inference techniques, such as variational inference and MCMC-based algorithms, experience a "cold posterior effect" (CPE), according to which the posterior must be down-weighted in order to get good predictive performance. The CPE has been linked to several factors, such as data augmentation or dataset curation leading to a misspecified likelihood, and prior misspecification. In this work we use MCMC sampling to show that a Gaussian parametric…
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Bayesian Methods and Mixture Models · Target Tracking and Data Fusion in Sensor Networks
