$\beta$-Cores: Robust Large-Scale Bayesian Data Summarization in the Presence of Outliers

Dionysis Manousakas; Cecilia Mascolo

arXiv:2008.13600·cs.LG·November 10, 2020

TL;DR

This paper introduces a scalable, robust Bayesian inference method using the $\beta$-divergence and Riemannian coresets, effectively handling outliers in large datasets for diverse models.

Contribution

It proposes a novel variational inference approach that combines $\beta$-divergence-based robustness with scalable Riemannian coreset techniques for large-scale Bayesian data summarization.

Findings

01. Outperforms existing methods in robustness to outliers

02. Effective on diverse datasets and models

03. Scalable to large datasets

Abstract

Modern machine learning applications should be able to address the intrinsic challenges arising in inference on massive real-world datasets, including scalability and robustness to outliers. Despite the multiple benefits of Bayesian methods (such as uncertainty-aware predictions, incorporation of expert knowledge, and hierarchical modeling), the quality of classic Bayesian inference depends critically on whether observations conform with the assumed data generating model, which is impossible to guarantee in practice. In this work, we propose a variational inference method that, in a principled way, can simultaneously scale to large datasets and robustify the inferred posterior with respect to the existence of outliers in the observed data. Reformulating Bayes' theorem via the $\beta$-divergence, we posit a robustified pseudo-Bayesian posterior as the target of inference. Moreover,…
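To make the idea of a $\beta$-divergence pseudo-posterior concrete, the following is a minimal sketch, not the authors' implementation and not the coreset construction. It assumes the density-power form of the $\beta$-likelihood that is common in the robust Bayesian literature, where each observation contributes $\frac{1}{\beta}\pi(x_n \mid \theta)^{\beta} - \frac{1}{1+\beta}\int \pi(y \mid \theta)^{1+\beta}\,dy$ in place of $\log \pi(x_n \mid \theta)$. The Gaussian model with known variance, the flat prior on a grid, the toy data, and all function names are assumptions chosen for illustration.

```python
import numpy as np

def gaussian_logpdf(x, mu, sigma=1.0):
    """Log-density of N(x | mu, sigma^2)."""
    return -0.5 * ((x - mu) / sigma) ** 2 - 0.5 * np.log(2 * np.pi * sigma**2)

def beta_loglik(x, mu, beta, sigma=1.0):
    """Per-observation beta-likelihood term (assumed density-power form).

    For a Gaussian, the integral of the density raised to the power
    (1 + beta) is (2*pi*sigma^2)^(-beta/2) / sqrt(1 + beta).
    """
    power_term = np.exp(beta * gaussian_logpdf(x, mu, sigma)) / beta
    integral_term = (2 * np.pi * sigma**2) ** (-beta / 2) / (1 + beta) ** 1.5
    return power_term - integral_term

# Hypothetical toy data: 50 inliers around 0 plus 5 gross outliers at 20.
data = np.concatenate([np.linspace(-1.0, 1.0, 50), np.full(5, 20.0)])

# Flat prior over a grid of candidate means, so the unnormalised
# log-posterior equals the summed (pseudo-)log-likelihood.
mu_grid = np.linspace(-5.0, 25.0, 3001)
x, mu = data[None, :], mu_grid[:, None]  # broadcast to (grid, observations)

classic_logpost = gaussian_logpdf(x, mu).sum(axis=1)
robust_logpost = beta_loglik(x, mu, beta=0.5).sum(axis=1)

classic_mode = mu_grid[np.argmax(classic_logpost)]
robust_mode = mu_grid[np.argmax(robust_logpost)]

print("classical posterior mode:", classic_mode)
print("beta-posterior mode:     ", robust_mode)
```

In this toy setting the classical mode is dragged toward the outliers (it tracks the sample mean), while the $\beta$-posterior mode stays with the inlier cluster, because each observation's contribution is bounded above by $\frac{1}{\beta}\max_x \pi(x \mid \theta)^{\beta}$. The choice $\beta = 0.5$ is arbitrary here.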

Code & Models

Repositories

dionman/beta-cores (official)

Taxonomy

Topics: Gaussian Processes and Bayesian Inference · Machine Learning and Data Classification · Adversarial Robustness in Machine Learning

Methods: Coresets