Score: 1

Beyond Random: Automatic Inner-loop Optimization in Dataset Distillation

Published: October 6, 2025 | arXiv ID: 2510.04838v1

By: Muquan Li, Hang Gou, Dongyang Zhang, and more

Potential Business Impact:

Shrinks training datasets so AI models can be trained faster and with less memory.

Business Areas:
A/B Testing Data and Analytics

The growing demand for efficient deep learning has positioned dataset distillation as a pivotal technique for compressing training datasets while preserving model performance. However, existing inner-loop optimization methods for dataset distillation typically rely on random truncation strategies, which lack flexibility and often yield suboptimal results. In this work, we observe that neural networks exhibit distinct learning dynamics across different training stages (early, middle, and late), making random truncation ineffective. To address this limitation, we propose Automatic Truncated Backpropagation Through Time (AT-BPTT), a novel framework that dynamically adapts both truncation positions and window sizes according to intrinsic gradient behavior. AT-BPTT introduces three key components: (1) a probabilistic mechanism for stage-aware timestep selection, (2) an adaptive window-sizing strategy based on gradient variation, and (3) a low-rank Hessian approximation to reduce computational overhead. Extensive experiments on CIFAR-10, CIFAR-100, Tiny-ImageNet, and ImageNet-1K show that AT-BPTT achieves state-of-the-art performance, improving accuracy by an average of 6.16% over baseline methods. Moreover, our approach accelerates inner-loop optimization by 3.9x while reducing memory cost by 63%.
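To make the core idea concrete, below is a minimal Python/PyTorch sketch of an inner loop whose truncation position and window size are chosen from observed gradient behavior instead of at random. It uses a toy linear model and hypothetical helper names (e.g. `choose_truncation_window`); it illustrates the general adaptive truncated-BPTT technique under simplifying assumptions, not the authors' AT-BPTT implementation, which additionally includes stage-aware probabilistic timestep selection and a low-rank Hessian approximation.

```python
# Hypothetical sketch: adaptive truncated BPTT for a dataset-distillation inner loop.
# Gradients from the outer (meta) loss reach the synthetic data x_syn only through
# inner steps inside the chosen truncation window.
import torch
import torch.nn.functional as F

def choose_truncation_window(grad_norms, total_steps, max_window=10):
    """Pick a truncation start step and window size from gradient dynamics.

    Illustrative heuristic (an assumption, not the paper's rule): start the window
    at the step with the sharpest change in gradient norm, and widen it when the
    gradient norms vary more strongly.
    """
    if len(grad_norms) < 2:
        return max(0, total_steps - max_window), max_window
    diffs = torch.tensor(grad_norms[1:]) - torch.tensor(grad_norms[:-1])
    peak = int(torch.argmax(diffs.abs())) + 1                 # step with sharpest change
    window = min(max_window, 2 + int(4 * diffs.abs().max()))  # grow with variation
    start = max(0, min(peak, total_steps - window))
    return start, window

def inner_loop(x_syn, y_syn, x_real, y_real, steps=20, lr=0.1):
    """Unroll SGD on the synthetic set; backprop to x_syn only inside the window."""
    d_in, d_out = x_syn.shape[1], y_syn.shape[1]

    # Probe pass (no graph) that only records per-step gradient norms.
    grad_norms = []
    with torch.no_grad():
        w = torch.zeros(d_in, d_out)
        for _ in range(steps):
            g = x_syn.T @ (x_syn @ w - y_syn) / len(x_syn)
            grad_norms.append(g.norm().item())
            w = w - lr * g
    start, window = choose_truncation_window(grad_norms, steps)

    # Unrolled pass: only steps inside [start, start + window) connect to x_syn.
    w = torch.zeros(d_in, d_out)
    for t in range(steps):
        xs = x_syn if start <= t < start + window else x_syn.detach()
        g = xs.T @ (xs @ w - y_syn) / len(xs)
        w = w - lr * g                                        # unrolled SGD step
    meta_loss = F.mse_loss(x_real @ w, y_real)                # outer objective on real data
    return meta_loss, (start, window)

# Tiny usage example with random tensors standing in for synthetic and real data.
torch.manual_seed(0)
x_syn = torch.randn(16, 8, requires_grad=True)
y_syn = torch.randn(16, 3)
x_real, y_real = torch.randn(64, 8), torch.randn(64, 3)
loss, (start, window) = inner_loop(x_syn, y_syn, x_real, y_real)
loss.backward()                                               # gradients flow into x_syn
print(f"window=[{start},{start + window}), meta-loss={loss.item():.3f}")
```

In this sketch the first pass costs no memory for the graph, and the second pass only stores activations for the chosen window, which is the source of the memory and speed savings that adaptive truncation targets.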

Country of Origin
🇨🇳 China

Page Count
28 pages

Category
Computer Science:
Computer Vision and Pattern Recognition