Perturbation-efficient Zeroth-order Optimization for Hardware-friendly On-device Training
By: Qitao Tan, Sung-En Chang, Rui Xia, and more
Potential Business Impact:
Makes AI training practical on small hardware devices.
Zeroth-order (ZO) optimization is an emerging deep neural network (DNN) training paradigm that offers computational simplicity and memory savings. However, this seemingly promising approach faces a significant and long-ignored challenge: ZO requires generating a substantial number of Gaussian random numbers, which is difficult and even infeasible on hardware platforms such as FPGAs and ASICs. In this paper, we identify this critical issue, which arises from a mismatch between algorithm designers and hardware designers. To address it, we propose PeZO, a perturbation-efficient ZO framework. Specifically, we design random number reuse strategies that significantly reduce the demand for random number generation, and introduce a hardware-friendly adaptive scaling method that replaces the costly Gaussian distribution with a uniform one. Our experiments show that PeZO reduces the LUTs and FFs required for random number generation by 48.6% and 12.7%, respectively, and saves up to 86% of power consumption, all without compromising training performance, making ZO optimization feasible for on-device training. To the best of our knowledge, this is the first work to explore the potential of on-device ZO optimization, providing valuable insights for future research.
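The abstract does not give implementation details, but its two ideas, reusing perturbations from a pre-generated pool and drawing them from a uniform rather than Gaussian distribution, can be sketched. Below is a minimal NumPy sketch: the pool size, cycling reuse rule, sqrt(3) rescaling, and function names (make_perturbation_pool, zo_step) are illustrative assumptions, not PeZO's actual design.

```python
import numpy as np

def make_perturbation_pool(dim, pool_size, rng):
    # Pre-generate a small pool of uniform perturbations in [-1, 1]^dim.
    # Reusing pooled vectors avoids producing fresh random numbers every
    # step, which is the expensive part on FPGAs/ASICs.
    return rng.uniform(-1.0, 1.0, size=(pool_size, dim))

def zo_step(params, loss_fn, pool, step, mu=1e-3, lr=1e-2):
    # One simple reuse rule: cycle through the pool.
    u = pool[step % len(pool)]
    # Rescale so each entry has unit variance (Var(U[-1,1]) = 1/3),
    # a stand-in for the paper's adaptive scaling (assumed form).
    u = u * np.sqrt(3.0)
    # Two-point ZO gradient estimate: forward passes only, no backprop.
    g = (loss_fn(params + mu * u) - loss_fn(params - mu * u)) / (2.0 * mu)
    return params - lr * g * u

# Toy usage: minimize a quadratic with reused uniform perturbations.
rng = np.random.default_rng(0)
dim = 16
pool = make_perturbation_pool(dim, pool_size=32, rng=rng)
x = rng.standard_normal(dim)
loss = lambda p: float(np.sum(p ** 2))
for t in range(300):
    x = zo_step(x, loss, pool, t)
print(f"final loss: {loss(x):.6f}")
```

One caveat this sketch makes visible: a small pool confines updates to the subspace spanned by the pooled directions, so any reuse strategy has to balance pool size against hardware cost.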
Similar Papers
ElasticZO: A Memory-Efficient On-Device Learning with Combined Zeroth- and First-Order Optimization
Machine Learning (CS)
Trains AI on phones without sending data.
FZOO: Fast Zeroth-Order Optimizer for Fine-Tuning Large Language Models towards Adam-Scale Speed
Machine Learning (CS)
Makes AI learn faster using less computer memory.
Elucidating Subspace Perturbation in Zeroth-Order Optimization: Theory and Practice at Scale
Machine Learning (CS)
Makes AI learn faster by changing fewer things.