Bayesian Sampling Bias Correction: Training with the Right Loss Function

L. Le Folgoc; V. Baltatzis; A. Alansary; S. Desai; A. Devaraj; S. Ellis; O. E. Martinez Manzanera; F. Kanavati; A. Nair; J. Schnabel; B. Glocker

arXiv:2006.13798 · cs.LG · June 25, 2020

TL;DR

This paper introduces a Bayesian risk minimization framework to derive loss functions that correct sampling bias in training data, improving model performance in real-world medical imaging applications.

Contribution

It presents a novel family of bias-corrected loss functions derived from Bayesian principles, applicable to arbitrary likelihood models and seamlessly integrable with deep learning.

Findings

01. Bias correction improves model accuracy in medical imaging.

02. The method connects bias correction to information gain.

03. Case studies demonstrate practical effectiveness.

Abstract

We derive a family of loss functions to train models in the presence of sampling bias. Examples are when the prevalence of a pathology differs from its sampling rate in the training dataset, or when a machine learning practitioner rebalances their training dataset. Sampling bias causes large discrepancies between model performance in the lab and in more realistic settings. It is omnipresent in medical imaging applications, yet is often overlooked at training time or addressed on an ad hoc basis. Our approach is based on Bayesian risk minimization. For arbitrary likelihood models we derive the associated bias-corrected loss for training, exhibiting a direct connection to information gain. The approach integrates seamlessly into the current paradigm of (deep) learning using stochastic backpropagation and naturally with Bayesian models. We illustrate the methodology on case studies of lung…
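To make the problem in the abstract concrete, the sketch below shows how much a predicted probability can shift when the training sampling rate of a pathology differs from its real prevalence. It uses the classic post-hoc prior-shift rescaling of predicted probabilities, which is not the bias-corrected training loss derived in the paper. It assumes that sampling depends on the label only; the function name and the example numbers are hypothetical.

```python
def correct_for_prevalence(p_train, train_rate, true_prevalence):
    """Rescale P(y=1 | x) from the training sampling rate to the true prevalence.

    p_train: probability predicted by a model fitted on the biased sample.
    train_rate: fraction of positive cases in the training set.
    true_prevalence: fraction of positive cases in the target population.
    Assumption: sampling bias depends on the label only (prior shift).
    """
    # Reweight each class by the ratio of target prior to training prior.
    pos = p_train * true_prevalence / train_rate
    neg = (1.0 - p_train) * (1.0 - true_prevalence) / (1.0 - train_rate)
    # Renormalize so the two class probabilities sum to one.
    return pos / (pos + neg)


# Hypothetical numbers: a model trained on a 50/50 rebalanced dataset
# outputs 0.8, but the pathology has 5% prevalence in the population.
adjusted = correct_for_prevalence(0.8, train_rate=0.5, true_prevalence=0.05)
print(round(adjusted, 4))  # prints 0.1739
```

Under these assumed numbers, a confident "lab" prediction of 0.8 corresponds to roughly 0.17 at realistic prevalence, which illustrates the lab-versus-deployment discrepancy the abstract describes. The paper addresses this at training time through the loss function rather than by rescaling outputs afterwards.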

Taxonomy

Topics: Gaussian Processes and Bayesian Inference · Advanced Statistical Process Monitoring · Machine Learning and Data Classification