Towards Fast LLM Fine-tuning through Zeroth-Order Optimization with Projected Gradient-Aligned Perturbations
By: Zhendong Mi, Qitao Tan, Grace Li Zhang, and more
Potential Business Impact:
Teaches computers new skills faster with less power.
Fine-tuning large language models (LLMs) with zeroth-order (ZO) optimization has emerged as a promising alternative to traditional gradient-based methods due to its reduced memory footprint. However, existing ZO methods suffer from high variance in gradient estimation, leading to slow convergence and suboptimal performance on large-scale models. In this work, we propose P-GAP, a fast LLM fine-tuning approach through zeroth-order optimization with Projected Gradient-Aligned Perturbations. Specifically, we first estimate a low-dimensional gradient space and then align perturbations with the projected gradient direction within that space. This approach reduces the number of perturbed parameters and lowers estimation variance, thereby accelerating convergence for LLM fine-tuning. Experiments on LLMs show that P-GAP consistently surpasses the baselines, achieving up to a 6% increase in accuracy on classification tasks and up to 12% higher accuracy on generation tasks, with up to about 81% fewer training iterations and 70% fewer GPU hours. These results demonstrate that P-GAP enables fast, scalable, and resource-efficient ZO fine-tuning of LLMs.
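The core idea described in the abstract, drawing zeroth-order perturbations inside an estimated low-dimensional gradient space and aligning them with the projected gradient direction, can be illustrated with a minimal NumPy sketch on a toy objective. This is a hedged illustration, not the authors' implementation: the names (zo_step, align), the toy quadratic loss, the fixed random subspace basis P, and the momentum-style running estimate g_est are all assumptions made for the example; P-GAP's actual gradient-space estimation and alignment procedure follow the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def loss(theta):
    # Toy quadratic objective standing in for the fine-tuning loss.
    return 0.5 * np.sum((theta - 1.0) ** 2)

def zo_step(theta, g_est, P, lr=0.05, eps=1e-3, align=0.9):
    # One ZO step using a perturbation drawn inside span(P) and biased
    # toward the projected running gradient estimate ("alignment").
    z_low = rng.standard_normal(P.shape[1])          # noise in the k-dim subspace
    noise = P @ z_low
    g_proj = P @ (P.T @ g_est)                       # project gradient estimate
    g_dir = g_proj / (np.linalg.norm(g_proj) + 1e-12)
    u = align * g_dir + (1.0 - align) * noise
    u /= np.linalg.norm(u) + 1e-12                   # unit perturbation direction

    # Two-point finite difference along u (standard ZO estimator).
    d_loss = (loss(theta + eps * u) - loss(theta - eps * u)) / (2.0 * eps)
    grad_zo = d_loss * u

    theta = theta - lr * grad_zo                     # parameter update
    g_est = 0.9 * g_est + 0.1 * grad_zo              # refresh direction estimate
    return theta, g_est

d, k = 100, 8                                        # full vs. subspace dimension
P, _ = np.linalg.qr(rng.standard_normal((d, k)))     # orthonormal subspace basis
theta = rng.standard_normal(d)
g_est = np.zeros(d)                                  # no gradient information yet
for _ in range(500):
    theta, g_est = zo_step(theta, g_est, P)
print("final loss:", loss(theta))
```

Because both the noise and the aligned component live in the k-dimensional subspace, only k effective directions are ever perturbed, which is the mechanism by which this style of method reduces estimator variance relative to perturbing all d parameters.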
Similar Papers
ZO2: Scalable Zeroth-Order Fine-Tuning for Extremely Large Language Models with Limited GPU Memory
Machine Learning (CS)
Lets huge AI models train on small computers.
TeZO: Empowering the Low-Rankness on the Temporal Dimension in the Zeroth-Order Optimization for Fine-tuning LLMs
Machine Learning (CS)
Makes AI learn faster with less computer power.
Elucidating Subspace Perturbation in Zeroth-Order Optimization: Theory and Practice at Scale
Machine Learning (CS)
Makes AI learn faster by changing fewer things.