Differentiable, Bit-shifting, and Scalable Quantization without training neural network from scratch
By: Zia Badar
Potential Business Impact:
Makes AI smarter and faster using less power.
Quantization of neural networks reduces the compute and memory requirements of inference. Previous work on quantization lacks two important aspects that this work provides. First, almost all previous approaches are non-differentiable: the derivative is usually set manually during backpropagation, which makes the learning ability of the algorithm questionable. Our approach is not only differentiable, we also provide a proof that it converges to the optimal neural network. Second, previous work on shift/logarithmic quantization has either avoided quantizing activations alongside weights or achieved lower accuracy. Learning logarithmically quantized values of the form $2^n$ requires a quantization function that scales beyond 1-bit quantization; a further benefit of our method is that it also provides $n$-bit quantization. Tested on ImageNet image classification with ResNet-18 and weight-only shift-bit quantization, our approach comes within 1 percent of full-precision accuracy while taking only 15 epochs to train. With both weights and activations quantized via shift bits, it reaches accuracy comparable to SOTA approaches in the same 15 training epochs, with only slightly higher inference cost (more CPU instructions) than 1-bit quantization (without logarithmic quantization) and without requiring any higher-precision multiplication.
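To make the power-of-two idea concrete, below is a minimal sketch of shift (logarithmic) weight quantization in the general sense the abstract describes. The function name quantize_pow2, the round-to-nearest rule in the log domain, and the exponent clipping range are illustrative assumptions, not the paper's actual algorithm.

import numpy as np

def quantize_pow2(w, n_bits=4):
    # Snap each weight to sign(w) * 2^e, where e is an integer exponent.
    sign = np.sign(w)
    # Assumption: round log2(|w|) to the nearest integer exponent.
    e = np.round(np.log2(np.abs(w) + 1e-12))
    # Keep each exponent representable in n_bits (illustrative range choice).
    lo, hi = -(2 ** (n_bits - 1)), 2 ** (n_bits - 1) - 1
    e = np.clip(e, lo, hi)
    return sign * np.exp2(e), e.astype(int)

w = np.array([0.31, -0.07, 1.8, -0.5])
q, exps = quantize_pow2(w)
print(q)     # [ 0.25   -0.0625  2.     -0.5   ] -- weights snapped to +/- powers of two
print(exps)  # [-2 -4  1 -1] -- stored exponents

Because every quantized weight is a signed power of two, multiplying an integer activation x by a weight 2^e reduces to a bit shift (x << e for e >= 0), so a convolution's multiply-accumulate becomes shift-accumulate. This is why such schemes can claim inference without higher-precision multiplication, at the cost of a few extra CPU instructions relative to plain 1-bit quantization.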
Similar Papers
Regularization-based Framework for Quantization-, Fault- and Variability-Aware Training
Machine Learning (CS)
Makes AI work better on small, cheap devices.
Learning under Quantization for High-Dimensional Linear Regression
Machine Learning (Stat)
Makes computers learn faster with less data.
DQT: Dynamic Quantization Training via Dequantization-Free Nested Integer Arithmetic
Machine Learning (CS)
Makes AI smarter using less computer power.