$\beta$-Cores: Robust Large-Scale Bayesian Data Summarization in the Presence of Outliers
Dionysis Manousakas, Cecilia Mascolo

TL;DR
This paper introduces a scalable, robust Bayesian inference method using the $\beta$-divergence and Riemannian coresets, effectively handling outliers in large datasets for diverse models.
Contribution
It proposes a novel variational inference approach that combines $\beta$-divergence-based robustness with scalable Riemannian coreset techniques for large-scale Bayesian data summarization.
Findings
Outperforms existing methods in robustness to outliers
Effective on diverse datasets and models
Scalable to large datasets
Abstract
Modern machine learning applications should be able to address the intrinsic challenges arising in inference on massive real-world datasets, including scalability and robustness to outliers. Despite the multiple benefits of Bayesian methods (such as uncertainty-aware predictions, incorporation of expert knowledge, and hierarchical modeling), the quality of classic Bayesian inference depends critically on whether observations conform with the assumed data generating model, which is impossible to guarantee in practice. In this work, we propose a variational inference method that, in a principled way, can simultaneously scale to large datasets and robustify the inferred posterior with respect to the existence of outliers in the observed data. Reformulating Bayes' theorem via the $\beta$-divergence, we posit a robustified pseudo-Bayesian posterior as the target of inference. Moreover,…
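To illustrate why a $\beta$-divergence target is robust, the following is a minimal sketch, not the authors' implementation. It assumes one common form of the $\beta$-likelihood from the density power divergence literature, $f_\beta(x;\theta) = \frac{1}{\beta}\pi(x\mid\theta)^{\beta} - \frac{1}{1+\beta}\int \pi(\chi\mid\theta)^{1+\beta}\,d\chi$, a univariate Gaussian model with known unit variance, and a simple grid search over the mean. The function names, the synthetic data, and the choice $\beta = 0.5$ are all hypothetical, and the sketch does not cover the coreset construction.

```python
import numpy as np

def gaussian_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) evaluated at x (broadcasts over arrays)."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def beta_loglik_terms(x, mu, sigma, beta):
    """Per-observation beta-likelihood terms for a Gaussian model.

    Assumed form: (1/beta) * pdf(x)^beta - (1/(1+beta)) * integral(pdf^(1+beta)).
    For a Gaussian the integral is (2*pi*sigma^2)^(-beta/2) / sqrt(1+beta).
    """
    density_term = gaussian_pdf(x, mu, sigma) ** beta / beta
    integral_term = (2.0 * np.pi * sigma**2) ** (-beta / 2.0) / (1.0 + beta) ** 1.5
    return density_term - integral_term

# Synthetic data (hypothetical): 200 inliers around 0 plus 20 outliers at 10.
rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(0.0, 1.0, size=200), np.full(20, 10.0)])

beta = 0.5
mu_grid = np.linspace(-2.0, 12.0, 1401)

# Sum of beta-likelihood terms for every candidate mean (rows: means, cols: data).
objective = beta_loglik_terms(data[None, :], mu_grid[:, None], 1.0, beta).sum(axis=1)

robust_mu = mu_grid[np.argmax(objective)]  # maximizer of the beta-likelihood
mle_mu = data.mean()                       # ordinary maximum-likelihood estimate

print(f"beta-likelihood estimate of the mean: {robust_mu:.2f}")
print(f"standard likelihood estimate of the mean: {mle_mu:.2f}")
```

In this sketch the contribution of each observation is bounded: as a point moves far from the bulk of the data, its density term tends to zero instead of diverging the way a log-likelihood term does, so the outliers at 10 have almost no pull on the estimate, whereas the sample mean is dragged toward them.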
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Machine Learning and Data Classification · Adversarial Robustness in Machine Learning
Methods: Coresets
