A probabilistic framework for dynamic quantization
By: Gabriele Santini, Francesco Paissan, Elisabetta Farella
Potential Business Impact:
Makes AI models run faster using less computing power.
We propose a probabilistic framework for dynamic quantization of neural networks that allows for a computationally efficient, input-adaptive rescaling of the quantization parameters. Our framework applies a probabilistic model to the network's pre-activations through a lightweight surrogate, enabling the adaptive adjustment of the quantization parameters on a per-input basis without significant memory overhead. We validate our approach on a set of popular computer vision tasks and models, observing only a negligible loss in performance. Our method achieves the best tradeoff between performance and computational overhead compared to standard quantization strategies.
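To make the idea of input-adaptive rescaling concrete, here is a minimal sketch (not the paper's actual method): a tiny, hypothetical surrogate predicts the mean and standard deviation of a layer's pre-activations for each input, and a symmetric 8-bit quantization scale is derived from those statistics. The Gaussian assumption, the 3-sigma clipping range, and the names `surrogate_stats` and `W_sur` are illustrative assumptions, not details from the abstract.

```python
# Minimal sketch of per-input dynamic quantization (illustrative only):
# a lightweight surrogate predicts pre-activation statistics, and the
# quantization scale is rescaled per input from those statistics.
import numpy as np

def gaussian_clip_range(mean, std, k=3.0):
    # Assume pre-activations roughly follow N(mean, std^2);
    # clip at |mean| + k standard deviations (k=3 is an assumption).
    return np.abs(mean) + k * std

def quantize(x, scale, bits=8):
    # Symmetric uniform quantization to signed integers.
    qmax = 2 ** (bits - 1) - 1
    return np.clip(np.round(x / scale), -qmax - 1, qmax).astype(np.int8)

def dequantize(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
W_full = rng.standard_normal((256, 64)).astype(np.float32)      # full-precision layer
W_sur = rng.standard_normal((2, 64)).astype(np.float32) * 0.1   # hypothetical surrogate weights

def surrogate_stats(x):
    # Cheap surrogate: predict (mean, std) of the layer's pre-activations
    # directly from the input with a tiny linear map.
    m, s = W_sur @ x
    return m, np.abs(s) + 1e-3  # keep the predicted std positive

x = rng.standard_normal(64).astype(np.float32)            # one input sample
pred_mean, pred_std = surrogate_stats(x)                   # per-input statistics
clip = gaussian_clip_range(pred_mean, pred_std)            # input-adaptive clipping range
scale = clip / 127.0                                       # 8-bit symmetric scale

pre_act = W_full @ x                                       # true pre-activations
pre_act_q = dequantize(quantize(pre_act, scale), scale)    # quantize with the adaptive scale
print("max abs quantization error:", np.max(np.abs(pre_act - pre_act_q)))
```

In this sketch the per-input cost is only the small surrogate matrix product, which is why such an approach can adapt the quantization range without significant compute or memory overhead; the actual probabilistic model used in the paper may differ.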
Similar Papers
Regularization-based Framework for Quantization-, Fault- and Variability-Aware Training
Machine Learning (CS)
Makes AI work better on small, cheap devices.
Natural Quantization of Neural Networks
Quantum Physics
Makes computers learn better with quantum tricks.
Optimizing Deep Neural Networks using Safety-Guided Self Compression
Machine Learning (CS)
Shrinks smart computer programs without losing smarts.