A Systematic Comparison of Bayesian Deep Learning Robustness in Diabetic Retinopathy Tasks
Angelos Filos, Sebastian Farquhar, Aidan N. Gomez, Tim G. J. Rudner, Zachary Kenton, Lewis Smith, Milad Alizadeh, Arnoud de Kroon, Yarin Gal

TL;DR
This paper introduces a new benchmark for evaluating Bayesian deep learning methods in medical imaging, specifically diabetic retinopathy diagnosis, emphasizing real-world robustness and uncertainty estimation over traditional benchmarks.
Contribution
The authors propose a diverse, real-world inspired benchmark for BDL evaluation in medical imaging, addressing limitations of previous benchmarks like UCI.
Findings
Some current BDL techniques that do well on traditional benchmarks such as UCI overfit their uncertainty estimates to those datasets.
Simpler baselines can outperform complex BDL methods on the new benchmark.
The benchmark includes tasks like out-of-distribution detection and robustness assessment.
Abstract
Evaluation of Bayesian deep learning (BDL) methods is challenging. We often seek to evaluate the methods' robustness and scalability, assessing whether new tools give 'better' uncertainty estimates than old ones. These evaluations are paramount for practitioners when choosing BDL tools on top of which they build their applications. Current popular evaluations of BDL methods, such as the UCI experiments, are lacking: methods that excel with these experiments often fail when used in applications such as medical or automotive, suggesting a pertinent need for new benchmarks in the field. We propose a new BDL benchmark with a diverse set of tasks, inspired by a real-world medical imaging application on diabetic retinopathy diagnosis. Visual inputs (512×512 RGB images of retinas) are considered, where model uncertainty is used for medical pre-screening, i.e. to refer patients to an…
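The pre-screening idea in the abstract, where uncertain cases are referred onward and the model's predictions are kept only for the rest, can be illustrated with a small sketch. This is not the benchmark's code: the function names, the use of binary predictive entropy as the uncertainty score, the 0.5 decision threshold, and the toy probabilities and labels are all assumptions made for illustration.

```python
import math


def binary_entropy(p):
    """Entropy (in nats) of a Bernoulli prediction with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def accuracy_after_referral(probs, labels, retain_fraction):
    """Accuracy on the retain_fraction of cases the model is most certain about.

    probs:  predicted probability of disease per case (hypothetical model output)
    labels: ground-truth 0/1 labels
    The remaining, most uncertain cases are treated as referred and are not scored.
    """
    # Rank cases from lowest to highest uncertainty (assumed score: entropy).
    order = sorted(range(len(probs)), key=lambda i: binary_entropy(probs[i]))
    n_keep = max(1, int(round(retain_fraction * len(probs))))
    kept = order[:n_keep]
    # Assumed decision rule: predict disease when p >= 0.5.
    correct = sum(int((probs[i] >= 0.5) == bool(labels[i])) for i in kept)
    return correct / n_keep


# Toy data (made up): the two wrong predictions are also the least confident.
probs = [0.05, 0.95, 0.55, 0.45, 0.90, 0.10]
labels = [0, 1, 0, 1, 1, 0]

for fraction in (1.0, 0.5):
    print(fraction, accuracy_after_referral(probs, labels, fraction))
```

On this toy data the errors sit among the high-entropy cases, so accuracy on the retained cases rises as more cases are referred. A method whose uncertainty is poorly calibrated would not show that improvement, which is the kind of behaviour an uncertainty-based referral evaluation is meant to expose.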
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Retinal Imaging and Analysis · Machine Learning and Data Classification
