Score: 1

Towards Scalable Backpropagation-Free Gradient Estimation

Published: November 5, 2025 | arXiv ID: 2511.03110v1

By: Daniel Wang, Evan Markou, Dylan Campbell

Potential Business Impact:

Enables neural networks to be trained without backpropagation's backward pass, reducing memory requirements and potentially speeding up training.

Business Areas:
A/B Testing; Data and Analytics

While backpropagation (reverse-mode automatic differentiation) has been extraordinarily successful in deep learning, it requires two passes (forward and backward) through the neural network and the storage of intermediate activations. Existing gradient estimation methods that instead use forward-mode automatic differentiation struggle to scale beyond small networks due to the high variance of the estimates. Efforts to mitigate this have so far introduced significant bias to the estimates, reducing their utility. We introduce a gradient estimation approach that reduces both bias and variance by manipulating upstream Jacobian matrices when computing guess directions. It shows promising results and has the potential to scale to larger networks; indeed, it performs better as network width increases. Our understanding of this method is facilitated by analyses of bias and variance, and their connection to the low-dimensional structure of neural network gradients.
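To make the baseline concrete, below is a minimal sketch of the forward-mode gradient estimator that the abstract contrasts against: a single Jacobian-vector product along a random guess direction, with no backward pass and no stored activations. The toy loss, function names, and shapes are illustrative assumptions for this sketch, not the authors' code or their proposed Jacobian-shaping method.

```python
import jax
import jax.numpy as jnp

def loss_fn(w):
    # Toy quadratic loss standing in for a neural network's training loss.
    return jnp.sum((w - 1.0) ** 2)

def forward_gradient(loss_fn, w, key):
    # Sample a random guess direction v ~ N(0, I) with the shape of the parameters.
    v = jax.random.normal(key, w.shape)
    # One forward pass computes the directional derivative <grad L(w), v>
    # via a Jacobian-vector product: no backward pass, no stored activations.
    _, dir_deriv = jax.jvp(loss_fn, (w,), (v,))
    # (<grad L, v>) * v is an unbiased gradient estimate, but its variance grows
    # with parameter dimension, which is what limits naive forward-mode scaling
    # and motivates shaping the guess directions as the paper proposes.
    return dir_deriv * v

key = jax.random.PRNGKey(0)
w = jnp.zeros(4)
print(forward_gradient(loss_fn, w, key))  # approximates the true gradient up to noise
```

Averaging this estimate over many guess directions reduces its variance, but at the cost of more forward passes; the paper instead targets the variance directly by constructing better guess directions from upstream Jacobians.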

Country of Origin
🇦🇺 Australia

Page Count
12 pages

Category
Computer Science:
Machine Learning (CS)