Importance Sampling for Nonlinear Models

Published: May 18, 2025 | arXiv ID: 2505.12353v1

By: Prakash Palanivelu Rajmohan, Fred Roosta

Potential Business Impact:

Helps computers train nonlinear models on large datasets faster by sampling only the most important data points.

Business Areas:

Natural Language Processing, Artificial Intelligence, Data and Analytics, Software

While norm-based and leverage-score-based methods have been extensively studied for identifying "important" data points in linear models, analogous tools for nonlinear models remain significantly underdeveloped. By introducing the concept of the adjoint operator of a nonlinear map, we address this gap and generalize norm-based and leverage-score-based importance sampling to nonlinear settings. We demonstrate that sampling based on these generalized notions of norm and leverage scores provides approximation guarantees for the underlying nonlinear mapping, similar to linear subspace embeddings. As direct applications, these nonlinear scores not only reduce the computational complexity of training nonlinear models by enabling efficient sampling over large datasets but also offer a novel mechanism for model explainability and outlier detection. Our contributions are supported by both theoretical analyses and experimental results across a variety of supervised learning scenarios.
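For orientation, the classical linear baseline that the abstract sets out to generalize can be sketched in a few lines of NumPy. This is only an illustration of standard row-norm and leverage-score sampling for a linear model; it does not reproduce the paper's adjoint-operator construction or its nonlinear scores. The function names (`leverage_scores`, `row_norm_scores`, `sample_rows`), the synthetic matrix, and the sample size are all assumptions made for this example.

```python
import numpy as np

def leverage_scores(A):
    """Leverage score of row i: squared norm of row i of an orthonormal basis Q for range(A)."""
    Q, _ = np.linalg.qr(A)  # reduced QR, Q has shape (n, d)
    return np.sum(Q**2, axis=1)

def row_norm_scores(A):
    """Norm-based score of row i: squared Euclidean norm of the row."""
    return np.sum(A**2, axis=1)

def sample_rows(A, scores, m, rng):
    """Draw m rows with probability proportional to scores, rescaled so E[S.T @ S] = A.T @ A."""
    p = scores / scores.sum()
    idx = rng.choice(A.shape[0], size=m, replace=True, p=p)
    scale = 1.0 / np.sqrt(m * p[idx])
    return A[idx] * scale[:, None], idx

# Hypothetical synthetic data: 5000 points, 10 features, a few unusually large rows.
rng = np.random.default_rng(0)
A = rng.standard_normal((5000, 10))
A[:5] *= 50.0

lev = leverage_scores(A)
S, idx = sample_rows(A, lev, 500, rng)

# Relative spectral-norm error of the sampled Gram matrix against the full one.
gram_full = A.T @ A
rel_err = np.linalg.norm(S.T @ S - gram_full, 2) / np.linalg.norm(gram_full, 2)
print(f"relative Gram error with 500 of 5000 rows: {rel_err:.3f}")
```

In this linear setting the leverage scores sum to the rank of the matrix, and the rescaled sample acts as a subspace embedding with high probability. According to the abstract, the paper's generalized scores provide analogous approximation guarantees when the linear map is replaced by a nonlinear one.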