Convergence for Discrete Parameter Updates
By: Paul Wilson, Fabio Zanasi, George Constantinides
Potential Business Impact:
Makes computers learn faster while using less power.
Modern deep learning models require immense computational resources, motivating research into low-precision training. Quantised training addresses this by representing training components in low-bit integers, but typically relies on discretising real-valued updates. We introduce an alternative approach where the update rule itself is discrete, avoiding the quantisation of continuous updates by design. We establish convergence guarantees for a general class of such discrete schemes, and present a multinomial update rule as a concrete example, supported by empirical evaluation. This perspective opens new avenues for efficient training, particularly for models with inherently discrete structure.
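To make the idea of a discrete update rule concrete, here is a minimal sketch in which the parameters stay on an integer grid and each step samples a multinomial allocation of unit moves across coordinates, with probabilities proportional to the gradient magnitudes. This is only an illustrative guess at what a "multinomial update rule" could look like: the function name `multinomial_update`, the `n_increments` budget, and the decaying-budget schedule in the toy loop are assumptions for the example, not the paper's actual scheme or its convergence conditions.

```python
import numpy as np


def multinomial_update(params, grad, n_increments=64, step=1, rng=None):
    """One discrete optimisation step that keeps parameters on an integer grid.

    Rather than computing a real-valued step and quantising it, a fixed budget
    of `n_increments` unit moves is sampled from a multinomial distribution
    whose probabilities are proportional to |grad|; each move changes one
    coordinate by +/- `step`, opposing the gradient sign.
    (Illustrative sketch only; the paper's exact rule may differ.)
    """
    rng = rng or np.random.default_rng()
    g = np.asarray(grad).ravel()
    weights = np.abs(g).astype(float)
    total = weights.sum()
    if total == 0:
        return params                                            # no gradient signal: leave parameters unchanged
    counts = rng.multinomial(n_increments, weights / total)      # allocate the discrete update budget
    delta = -np.sign(g) * counts * step                          # integer-valued update
    return params + delta.reshape(np.shape(params)).astype(params.dtype)


# Toy usage: minimise ||w - target||^2 over integer-valued weights.
rng = np.random.default_rng(0)
target = np.array([40, -25, 10], dtype=np.int64)
w = np.zeros(3, dtype=np.int64)
for t in range(200):
    grad = 2 * (w - target)              # gradient of the quadratic loss
    budget = max(1, 64 >> (t // 20))     # shrink the discrete step budget over time
    w = multinomial_update(w, grad, n_increments=budget, rng=rng)
print(w)  # ends up in a small integer neighbourhood of `target`
```

Every intermediate value of `w` in this loop is integer-valued, so no continuous update is ever computed and then quantised; the shrinking budget plays the role a decaying learning rate would play in a standard stochastic scheme.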
Similar Papers
Approximate Bayesian Inference via Bitstring Representations
Machine Learning (CS)
Teaches computers to learn from less data.
A Convergence Analysis of Adaptive Optimizers under Floating-point Quantization
Machine Learning (CS)
Makes computers learn faster with less memory.
Learning Quantized Continuous Controllers for Integer Hardware
Machine Learning (CS)
Makes robots move faster using less power.