A probabilistic framework for dynamic quantization

Published: May 15, 2025 | arXiv ID: 2505.10689v1

By: Gabriele Santini, Francesco Paissan, Elisabetta Farella

Potential Business Impact:

Lets neural networks run on less compute and memory by adapting quantization to each input, with negligible loss in accuracy.

Business Areas:
Artificial Intelligence and Machine Learning

We propose a probabilistic framework for dynamic quantization of neural networks that allows for a computationally efficient, input-adaptive rescaling of the quantization parameters. Our framework applies a probabilistic model to the network's pre-activations through a lightweight surrogate, enabling the adaptive adjustment of the quantization parameters on a per-input basis without significant memory overhead. We validate our approach on a set of popular computer vision tasks and models, observing only a negligible loss in performance. Our method achieves the best tradeoff between performance and computational overhead among the quantization strategies compared.
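The paper's probabilistic surrogate model is not reproduced here, but the core idea it builds on, recomputing quantization parameters per input rather than fixing them at calibration time, can be sketched in a few lines. The function names and the simple max-abs scale rule below are illustrative assumptions, not the authors' method:

```python
import numpy as np

def dynamic_quantize(pre_activations, num_bits=8):
    """Quantize a tensor with a scale derived from this input's own range.

    In static quantization the scale is fixed after calibration; in
    dynamic (input-adaptive) quantization it is recomputed per input.
    The max-abs rule here is a simple illustrative choice.
    """
    qmax = 2 ** (num_bits - 1) - 1  # 127 for signed int8
    scale = float(np.max(np.abs(pre_activations))) / qmax
    if scale == 0.0:
        scale = 1.0  # avoid division by zero on an all-zero input
    q = np.clip(np.round(pre_activations / scale), -qmax - 1, qmax)
    return q.astype(np.int8), scale

def dequantize(q, scale):
    """Map int8 values back to floats using the stored scale."""
    return q.astype(np.float32) * scale

# Two inputs with very different dynamic ranges get different scales,
# so each is quantized with a step size matched to its own magnitude.
x_small = np.array([0.01, -0.02, 0.03], dtype=np.float32)
x_large = np.array([5.0, -8.0, 3.0], dtype=np.float32)
q_small, s_small = dynamic_quantize(x_small)
q_large, s_large = dynamic_quantize(x_large)
```

A static scheme would have to pick one scale covering both inputs, wasting most of the int8 range on the small-magnitude input; per-input rescaling avoids that, at the cost of computing the parameters at inference time, which is the overhead the paper's lightweight surrogate targets.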

Page Count
9 pages

Category
Computer Science:
Machine Learning (CS)