Perturbation-efficient Zeroth-order Optimization for Hardware-friendly On-device Training
By: Qitao Tan, Sung-En Chang, Rui Xia, and more
Potential Business Impact:
Makes AI training practical on small hardware devices.
Zeroth-order (ZO) optimization is an emerging deep neural network (DNN) training paradigm that offers computational simplicity and memory savings. However, this seemingly promising approach faces a significant and long-ignored challenge: ZO requires generating a substantial number of Gaussian random numbers, which is difficult and even infeasible on hardware platforms such as FPGAs and ASICs. In this paper, we identify this critical issue, which arises from a mismatch between algorithm designers and hardware designers. To address it, we propose PeZO, a perturbation-efficient ZO framework. Specifically, we design random number reuse strategies that significantly reduce the demand for random number generation, and introduce a hardware-friendly adaptive scaling method that replaces the costly Gaussian distribution with a uniform one. Our experiments show that PeZO reduces the LUTs and FFs required for random number generation by 48.6% and 12.7%, respectively, and saves up to 86% of power consumption, all without compromising training performance, making ZO optimization feasible for on-device training. To the best of our knowledge, this is the first work to explore the potential of on-device ZO optimization, providing valuable insights for future research.
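The abstract does not give implementation details, but its two ideas, reusing perturbations from a pre-generated pool and drawing them from a uniform rather than Gaussian distribution, can be sketched. Below is a minimal NumPy sketch: the pool size, cycling reuse rule, sqrt(3) rescaling, and function names (make_perturbation_pool, zo_step) are illustrative assumptions, not PeZO's actual design.

```python
import numpy as np

def make_perturbation_pool(dim, pool_size, rng):
    # Pre-generate a small pool of uniform perturbations in [-1, 1]^dim.
    # Reusing pooled vectors avoids producing fresh random numbers every
    # step, which is the expensive part on FPGAs/ASICs.
    return rng.uniform(-1.0, 1.0, size=(pool_size, dim))

def zo_step(params, loss_fn, pool, step, mu=1e-3, lr=1e-2):
    # One simple reuse rule: cycle through the pool.
    u = pool[step % len(pool)]
    # Rescale so each entry has unit variance (Var(U[-1,1]) = 1/3),
    # a stand-in for the paper's adaptive scaling (assumed form).
    u = u * np.sqrt(3.0)
    # Two-point ZO gradient estimate: forward passes only, no backprop.
    g = (loss_fn(params + mu * u) - loss_fn(params - mu * u)) / (2.0 * mu)
    return params - lr * g * u

# Toy usage: minimize a quadratic with reused uniform perturbations.
rng = np.random.default_rng(0)
dim = 16
pool = make_perturbation_pool(dim, pool_size=32, rng=rng)
x = rng.standard_normal(dim)
loss = lambda p: float(np.sum(p ** 2))
for t in range(300):
    x = zo_step(x, loss, pool, t)
print(f"final loss: {loss(x):.6f}")
```

One caveat this sketch makes visible: a small pool confines updates to the subspace spanned by the pooled directions, so any reuse strategy has to balance pool size against hardware cost.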
Similar Papers
ElasticZO: A Memory-Efficient On-Device Learning with Combined Zeroth- and First-Order Optimization
Machine Learning (CS)
Trains AI on phones without sending data.
FZOO: Fast Zeroth-Order Optimizer for Fine-Tuning Large Language Models towards Adam-Scale Speed
Machine Learning (CS)
Makes AI learn faster using less computer memory.
Elucidating Subspace Perturbation in Zeroth-Order Optimization: Theory and Practice at Scale
Machine Learning (CS)
Makes AI learn faster by changing fewer things.