PARIS: Pruning Algorithm via the Representer theorem for Imbalanced Scenarios

Published: December 7, 2025 | arXiv ID: 2512.06950v1

By: Enrico Camporeale

Potential Business Impact:

Cleans up messy data to predict rare events better.

Business Areas:

Predictive Analytics, Artificial Intelligence, Data and Analytics, Software

The challenge of imbalanced regression arises when standard Empirical Risk Minimization (ERM) biases models toward high-frequency regions of the data distribution, causing severe degradation on rare but high-impact "tail" events. Existing strategies such as loss re-weighting or synthetic over-sampling often introduce noise, distort the underlying distribution, or add substantial algorithmic complexity. We introduce PARIS (Pruning Algorithm via the Representer theorem for Imbalanced Scenarios), a principled framework that mitigates imbalance by optimizing the training set itself. PARIS leverages the representer theorem for neural networks to compute a closed-form representer deletion residual, which quantifies the exact change in validation loss caused by removing a single training point without retraining. Combined with an efficient Cholesky rank-one downdating scheme, PARIS performs fast, iterative pruning that eliminates uninformative or performance-degrading samples. On a real-world space weather example, PARIS reduces the training set by up to 75% while preserving or improving overall RMSE, outperforming re-weighting, synthetic over-sampling, and boosting baselines. Our results demonstrate that representer-guided dataset pruning is a powerful, interpretable, and computationally efficient approach to rare-event regression.
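The abstract does not state the deletion formula itself, so the following is only a minimal sketch of the general idea under a simplifying assumption: a ridge regression fitted on fixed features (for example, a frozen last-layer embedding), where deleting one training point has a closed-form effect through the Sherman-Morrison identity. The names `fit_ridge`, `deletion_effects`, and `chol_downdate` are hypothetical and are not taken from the paper; the downdate is the textbook rank-one Cholesky downdate, not necessarily the variant PARIS uses.

```python
import numpy as np

def fit_ridge(Phi, y, lam):
    """Ridge weights w = (Phi^T Phi + lam*I)^{-1} Phi^T y, plus the matrix A."""
    A = Phi.T @ Phi + lam * np.eye(Phi.shape[1])
    return np.linalg.solve(A, Phi.T @ y), A

def deletion_effects(Phi, y, Phi_val, y_val, lam):
    """Change in validation MSE from deleting each training point, without retraining.

    Assumption (not from the paper): the model is ridge regression on fixed features.
    """
    w, A = fit_ridge(Phi, y, lam)
    A_inv_PhiT = np.linalg.solve(A, Phi.T)        # column i is A^{-1} phi_i
    h = np.einsum("ij,ji->i", Phi, A_inv_PhiT)    # leverage h_i = phi_i^T A^{-1} phi_i
    e = y - Phi @ w                               # training residuals
    # Sherman-Morrison: w_{-i} = w - A^{-1} phi_i * e_i / (1 - h_i)
    W_del = w[:, None] - A_inv_PhiT * (e / (1.0 - h))
    base = np.mean((y_val - Phi_val @ w) ** 2)
    losses = np.mean((y_val[:, None] - Phi_val @ W_del) ** 2, axis=0)
    return losses - base                          # negative: deletion helps validation

def chol_downdate(L, x):
    """Given lower-triangular L with A = L L^T, return L' with L' L'^T = A - x x^T.

    Assumes A - x x^T is still positive definite.
    """
    L = np.array(L, dtype=float)
    x = np.array(x, dtype=float)
    n = x.size
    for k in range(n):
        r = np.sqrt(L[k, k] ** 2 - x[k] ** 2)
        c, s = r / L[k, k], x[k] / L[k, k]
        L[k, k] = r
        if k + 1 < n:
            L[k + 1:, k] = (L[k + 1:, k] - s * x[k + 1:]) / c
            x[k + 1:] = c * x[k + 1:] - s * L[k + 1:, k]
    return L

# Toy usage on synthetic data (illustrative only, not the paper's dataset).
rng = np.random.default_rng(0)
Phi, y = rng.normal(size=(50, 5)), rng.normal(size=50)
Phi_val, y_val = rng.normal(size=(20, 5)), rng.normal(size=20)

delta = deletion_effects(Phi, y, Phi_val, y_val, lam=1.0)
worst = int(np.argmin(delta))                     # point whose removal helps most

# After pruning that point, refresh the factorization in O(d^2) instead of refactoring.
_, A = fit_ridge(Phi, y, lam=1.0)
L_new = chol_downdate(np.linalg.cholesky(A), Phi[worst])
```

In this simplified setting, each pruning step scores all candidates with one linear solve and then updates the factorization with a single downdate, which is the kind of saving the abstract attributes to avoiding retraining.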

Think Before You Prune: Self-Reflective Structured Pruning for Reasoning Language Models

Computation and Language

Makes smart AI smaller without losing its thinking.

1 Dec 2025 1

85%

Pruning Deep Neural Networks via a Combination of the Marchenko-Pastur Distribution and Regularization

Machine Learning (CS)

Makes computer vision models smaller, faster.

2 Mar 2025 2

85%

Non-Parametric Probabilistic Robustness: A Conservative Metric with Optimized Perturbation Distributions

CV and Pattern Recognition

Makes AI more trustworthy with unknown errors.

21 Nov 2025 0


Country of Origin

🇬🇧 United Kingdom

Page Count

14 pages
