Stable Coresets via Posterior Sampling: Aligning Induced and Full Loss Landscapes

Wei-Kai Chang; Rajiv Khanna

arXiv:2511.17399 · cs.LG · November 24, 2025

TL;DR

This paper introduces a coreset selection framework based on posterior sampling. It speeds up training and improves generalization in deep learning models by addressing loss curvature mismatches between the loss landscape induced by the coreset and that of the full dataset.

Contribution

It establishes a connection between posterior sampling and loss landscapes, proposing a smoothed loss function for more stable and effective coreset selection.
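
One common way to read "smoothed loss" is as the expected loss under random perturbations of the parameters, estimated by sampling. The sketch below illustrates that reading only; the Gaussian perturbation, the names `smoothed_loss` and `quadratic_loss`, and the parameters `sigma` and `num_samples` are assumptions for illustration and are not taken from the paper.

```python
import random


def smoothed_loss(loss_fn, theta, sigma=0.1, num_samples=64, seed=0):
    """Monte Carlo estimate of E[loss_fn(theta + eps)], eps ~ N(0, sigma^2 I).

    Assumption: an isotropic Gaussian around theta stands in for the
    posterior; the paper's actual sampling distribution may differ.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        # Draw one perturbed parameter vector and evaluate the loss there.
        perturbed = [t + rng.gauss(0.0, sigma) for t in theta]
        total += loss_fn(perturbed)
    return total / num_samples


def quadratic_loss(theta):
    """Toy loss (hypothetical): sum of squares, minimized at the origin."""
    return sum(t * t for t in theta)


if __name__ == "__main__":
    theta = [1.0, -2.0]
    print("exact loss   :", quadratic_loss(theta))
    print("smoothed loss:", smoothed_loss(quadratic_loss, theta, sigma=0.5,
                                          num_samples=20000))
```

For this toy quadratic, the smoothed value exceeds the exact loss by roughly `sigma**2` per dimension, which shows how averaging over sampled parameters folds curvature information into the objective.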

Findings

1. Achieves faster training compared to existing methods
2. Enhances model generalization across multiple datasets
3. Provides a novel convergence analysis for sampling-based coreset selection

Abstract

As deep learning models continue to scale, the growing computational demands have amplified the need for effective coreset selection techniques. Coreset selection aims to accelerate training by identifying small, representative subsets of data that approximate the performance of the full dataset. Among various approaches, gradient-based methods stand out due to their strong theoretical underpinnings and practical benefits, particularly under limited data budgets. However, these methods face two challenges: naive stochastic gradient descent (SGD) is a surprisingly strong baseline, and representativeness breaks down over time due to loss curvature mismatches. In this work, we propose a novel framework that addresses these limitations. First, we establish a connection between posterior sampling and loss landscapes, enabling robust coreset selection even in high data…
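
As background for the gradient-based approaches the abstract mentions, the sketch below shows a generic greedy gradient-matching selection: it picks examples whose mean gradient stays close to the full-data mean gradient. This is a textbook-style illustration, not the paper's method; the function names and the toy per-example gradients are hypothetical.

```python
def mean_vector(vectors):
    """Coordinate-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def squared_distance(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def greedy_gradient_matching(per_example_grads, budget):
    """Greedily select `budget` indices whose mean gradient approximates
    the mean gradient over all examples."""
    if budget > len(per_example_grads):
        raise ValueError("budget exceeds number of examples")
    target = mean_vector(per_example_grads)
    selected = []
    running_sum = [0.0] * len(target)
    remaining = set(range(len(per_example_grads)))
    for _ in range(budget):
        best_idx, best_err = None, float("inf")
        k = len(selected) + 1
        for i in sorted(remaining):  # sorted for deterministic tie-breaking
            grad = per_example_grads[i]
            candidate_mean = [(s + g) / k for s, g in zip(running_sum, grad)]
            err = squared_distance(candidate_mean, target)
            if err < best_err:
                best_idx, best_err = i, err
        selected.append(best_idx)
        remaining.remove(best_idx)
        chosen = per_example_grads[best_idx]
        running_sum = [s + g for s, g in zip(running_sum, chosen)]
    return selected


if __name__ == "__main__":
    # Toy per-example gradients (hypothetical values).
    grads = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0], [0.1, 0.0]]
    print(greedy_gradient_matching(grads, budget=3))
```

Because the gradients are computed at a single parameter value, a subset chosen this way can stop being representative as training moves the parameters, which is the kind of breakdown the abstract attributes to curvature mismatches.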

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Generative Adversarial Networks and Image Synthesis · Domain Adaptation and Few-Shot Learning