Variational Bayesian Pseudo-Coreset

Hyungi Lee; Seungyoo Lee; Juho Lee

arXiv:2502.21143·cs.LG·March 3, 2025

3 Reviews

TL;DR

This paper introduces Variational Bayesian Pseudo-Coreset (VBPC), a novel method that employs variational inference to create small, efficient datasets for Bayesian neural networks, reducing computational costs and improving performance.

Contribution

The paper proposes VBPC, a new approach that enhances Bayesian pseudo-coresets with variational inference, addressing memory inefficiency and sub-optimal results of prior methods.

Findings

01

VBPC reduces memory usage during training.

02

VBPC improves predictive performance on benchmark datasets.

03

VBPC decreases computational costs compared to previous methods.

Abstract

The success of deep learning requires large datasets and extensive training, which can create significant computational challenges. To address these challenges, pseudo-coresets, small learnable datasets that mimic the entire data, have been proposed. Bayesian Neural Networks, which offer predictive uncertainty and probabilistic interpretation for deep neural networks, also face issues with large-scale datasets due to their high-dimensional parameter space. Prior works on Bayesian Pseudo-Coresets (BPC) attempt to reduce the computational load for computing weight posterior distribution by a small number of pseudo-coresets but suffer from memory inefficiency during BPC training and sub-optimal results. To overcome these limitations, we propose Variational Bayesian Pseudo-Coreset (VBPC), a novel approach that utilizes variational inference to efficiently approximate the posterior…

Peer Reviews

Decision·ICLR 2025 Poster

Reviewer 01 · Rating 8 · Confidence 2

Strengths

The paper effectively utilizes variational inference to derive a closed-form posterior distribution for the weights of the last layer, thereby addressing some of the performance limitations observed in prior BPC approaches. VBPC’s capability to approximate the predictive distribution in a single forward pass enhances both computational and memory efficiency, positioning it as a potentially valuable method for large-scale applications.
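The closed-form last-layer posterior and single-pass predictive distribution mentioned above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a Gaussian likelihood on real-valued (or one-hot) targets and an isotropic Gaussian prior, and the names `last_layer_posterior`, `predictive`, `prior_precision`, and `noise_precision` are hypothetical. The feature matrix stands in for the output of a fixed feature extractor evaluated on the pseudo-coreset.

```python
import numpy as np


def last_layer_posterior(features, targets, prior_precision=1.0, noise_precision=1.0):
    """Closed-form Gaussian posterior over last-layer weights.

    Assumed model (illustrative only): targets = features @ W + noise,
    with W ~ N(0, I / prior_precision) and noise ~ N(0, I / noise_precision).

    features: (n, d) array of penultimate-layer features.
    targets:  (n, k) array of real-valued or one-hot targets.
    Returns the posterior mean (d, k) and the covariance (d, d) shared
    by every output column.
    """
    d = features.shape[1]
    precision = prior_precision * np.eye(d) + noise_precision * features.T @ features
    cov = np.linalg.inv(precision)
    mean = noise_precision * cov @ features.T @ targets
    return mean, cov


def predictive(features_new, mean, cov, noise_precision=1.0):
    """Predictive mean and variance from one pass over the new features.

    No weight sampling is needed: the Gaussian posterior gives the
    predictive moments analytically.
    """
    pred_mean = features_new @ mean
    # Quadratic form phi^T cov phi for each row, plus observation noise.
    epistemic = np.einsum("nd,de,ne->n", features_new, cov, features_new)
    pred_var = 1.0 / noise_precision + epistemic
    return pred_mean, pred_var


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    coreset_features = rng.normal(size=(20, 8))   # stand-in for learned pseudo-coreset features
    coreset_targets = rng.normal(size=(20, 3))    # stand-in for learned pseudo-labels
    mean, cov = last_layer_posterior(coreset_features, coreset_targets)
    test_features = rng.normal(size=(5, 8))
    pred_mean, pred_var = predictive(test_features, mean, cov)
    print(pred_mean.shape, pred_var.shape)
```

Because the posterior is available in closed form, prediction needs only one matrix product per input rather than an average over many sampled networks, which is the efficiency argument the reviewers refer to.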

Weaknesses

Experimental validation on practical applications is limited.

Reviewer 02 · Rating 6 · Confidence 3

Strengths

- This paper leverages the variational formulation to obtain the closed-form posterior distribution of the last-layer weights, which resolves the suboptimal performance seen in previous approaches.
- The method approximates the predictive distribution with a single forward pass instead of multiple forward passes, making the approach computationally and memory-efficient.

Weaknesses

- The experiments are not sufficient to illustrate how the algorithm functions.

Reviewer 03 · Rating 8 · Confidence 3

Strengths

- The paper is particularly well-written with a clearly defined problem scope and a solid solution methodology that follows a well-justified sequence of development steps.
- The proposed bilevel variational inference formulation is neat and sensible.
- The computational complexity analysis is indeed helpful to see the merit of the devised solution.
- The reported results are strong on the chosen group of data sets.

Weaknesses

- The paper motivates the coreset extraction problem with use cases such as processing big data and addressing continual learning setups. However, the presented results are on data sets that, in the present technological landscape, can be considered toy problems. I do sympathize with the idea of prototyping. But given the strong applied component of the present work, I am still unconvinced that the observed scores generalize to a case where coreset extraction is an actual necessity.
