Score: 2

Stable Coresets via Posterior Sampling: Aligning Induced and Full Loss Landscapes

Published: November 21, 2025 | arXiv ID: 2511.17399v1

By: Wei-Kai Chang, Rajiv Khanna

Potential Business Impact:

Trains machine learning models faster and more accurately using less data.

Business Areas:
A/B Testing, Data and Analytics

As deep learning models continue to scale, growing computational demands have amplified the need for effective coreset selection techniques. Coreset selection aims to accelerate training by identifying small, representative subsets of data that approximate the performance of the full dataset. Among the various approaches, gradient-based methods stand out for their strong theoretical underpinnings and practical benefits, particularly under limited data budgets. However, these methods face challenges: naive stochastic gradient descent (SGD) is a surprisingly strong baseline, and representativeness breaks down over time due to loss-curvature mismatches. In this work, we propose a novel framework that addresses these limitations. First, we establish a connection between posterior sampling and loss landscapes, enabling robust coreset selection even under heavy data corruption. Second, we introduce a smoothed loss function based on posterior sampling over the model weights, enhancing stability and generalization while maintaining computational efficiency. We also present a novel convergence analysis for our sampling-based coreset selection method. Finally, through extensive experiments, we demonstrate that our approach achieves faster training and better generalization than the current state of the art across diverse datasets.
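To make the weight-space smoothing idea concrete, below is a minimal sketch (not the authors' implementation) of how a per-example loss can be smoothed by averaging over sampled weight perturbations and then used to score candidates for a coreset. It assumes isotropic Gaussian noise as a stand-in for posterior sampling; the function name, `sigma`, and `num_samples` are illustrative choices, not parameters from the paper.

```python
# Hypothetical sketch: smooth per-example losses by sampling weight perturbations
# (a Gaussian surrogate for posterior sampling), then rank examples for a coreset.
import torch

def smoothed_per_example_losses(model, loss_fn, x, y, num_samples=4, sigma=0.01):
    """Average per-example losses over `num_samples` perturbed copies of the weights."""
    params = [p for p in model.parameters() if p.requires_grad]
    total = torch.zeros(x.shape[0], device=x.device)
    for _ in range(num_samples):
        noises = [sigma * torch.randn_like(p) for p in params]
        with torch.no_grad():
            for p, n in zip(params, noises):
                p.add_(n)                      # perturb weights in place
            losses = loss_fn(model(x), y)      # per-example losses (reduction='none')
            for p, n in zip(params, noises):
                p.sub_(n)                      # restore the original weights
        total += losses
    return total / num_samples

# Usage sketch: keep the top-k scored examples as the coreset.
# loss_fn = torch.nn.CrossEntropyLoss(reduction='none')
# scores = smoothed_per_example_losses(model, loss_fn, x_batch, y_batch)
# coreset_idx = torch.topk(scores, k=budget).indices
```

The design choice here is that averaging losses over weight perturbations yields a smoother selection signal than a single point estimate, which is the intuition behind the paper's posterior-sampling-based smoothed loss; the actual method, scoring rule, and convergence analysis are in the full paper.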

Country of Origin
🇺🇸 United States

Repos / Data Links

Page Count
38 pages

Category
Computer Science:
Machine Learning (CS)