Optimization without Backpropagation
Gabriel Belouze

TL;DR
This paper explores forward gradients as an alternative to backpropagation for optimization, deriving optimality conditions and demonstrating their limitations in high-dimensional settings through theoretical insights and experiments.
Contribution
It introduces an optimality condition for forward gradients and analyzes their effectiveness and limitations in high-dimensional optimization tasks.
Findings
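Forward gradients are unbiased estimators of the true gradient that avoid backpropagation, but they become difficult to optimize with as the dimension grows.
The optimality condition for the best approximating forward gradient yields mathematical insights suggesting that optimization in high dimensions is challenging.
Experiments on test functions support the theoretical claim that high-dimensional optimization with forward gradients is difficult.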
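One standard way to see why dimension matters is the following worked calculation. It is a sketch under a stated assumption, not the paper's own derivation: the tangent direction v is drawn from a standard Gaussian in d dimensions, and the symbols θ, v, and d are introduced here for illustration only.

```latex
% Forward gradient with a random tangent v (assumed standard Gaussian)
\[
  g(\theta) = \bigl(\nabla f(\theta) \cdot v\bigr)\, v,
  \qquad v \sim \mathcal{N}(0, I_d)
\]
% Unbiasedness follows from E[v v^T] = I_d
\[
  \mathbb{E}[g(\theta)]
  = \mathbb{E}\bigl[v v^{\top}\bigr]\, \nabla f(\theta)
  = \nabla f(\theta)
\]
% Second moment and mean squared error both grow linearly in d
\[
  \mathbb{E}\,\lVert g(\theta) \rVert^{2} = (d + 2)\,\lVert \nabla f(\theta) \rVert^{2},
  \qquad
  \mathbb{E}\,\lVert g(\theta) - \nabla f(\theta) \rVert^{2} = (d + 1)\,\lVert \nabla f(\theta) \rVert^{2}
\]
```

Under this assumption the estimator is correct on average, but its squared error relative to the true gradient scales with d, which is consistent with the difficulty reported in high dimensions.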
Abstract
Forward gradients have been recently introduced to bypass backpropagation in autodifferentiation, while retaining unbiased estimators of true gradients. We derive an optimality condition to obtain best approximating forward gradients, which leads us to mathematical insights that suggest optimization in high dimension is challenging with forward gradients. Our extensive experiments on test functions support this claim.
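As an illustration of the idea in the abstract, here is a minimal sketch of a forward gradient estimator in plain Python. It is not the paper's implementation. The `Dual` class, the `sphere` test function, and the helper names are hypothetical, and the tangent is assumed to be standard Gaussian. The directional derivative is obtained in a single forward pass with dual numbers, so no backward pass is needed.

```python
import random


class Dual:
    """Minimal dual number (value + tangent) for forward-mode differentiation.

    Hypothetical helper: supports only + and *, enough for polynomial tests.
    """

    def __init__(self, val, tan=0.0):
        self.val = val
        self.tan = tan

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.tan + other.tan)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Product rule: (a * b)' = a' * b + a * b'
        return Dual(self.val * other.val,
                    self.tan * other.val + self.val * other.tan)

    __rmul__ = __mul__


def sphere(x):
    """Test function f(x) = sum_i x_i^2, whose true gradient is 2x."""
    total = Dual(0.0)
    for xi in x:
        total = total + xi * xi
    return total


def forward_gradient(f, x, rng):
    """One forward gradient sample: (grad f(x) . v) * v with v ~ N(0, I)."""
    v = [rng.gauss(0.0, 1.0) for _ in x]
    # Seeding each input with tangent v_i makes out.tan the directional
    # derivative of f at x along v, computed in one forward pass.
    out = f([Dual(xi, vi) for xi, vi in zip(x, v)])
    return [out.tan * vi for vi in v]


def mean_forward_gradient(f, x, n_samples, seed=0):
    """Average many samples; the mean approaches the true gradient."""
    rng = random.Random(seed)
    acc = [0.0] * len(x)
    for _ in range(n_samples):
        g = forward_gradient(f, x, rng)
        acc = [a + gi for a, gi in zip(acc, g)]
    return [a / n_samples for a in acc]
```

Calling `mean_forward_gradient(sphere, [1.0, -2.0], 20000)` should return values close to the true gradient `[2.0, -4.0]`, while a single call to `forward_gradient` is typically far from it. That gap between one sample and the average is the noise an optimizer has to cope with.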
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Machine Learning and ELM · Sparse and Compressive Sensing Techniques
Methods: Test · Forward gradient · AdaBelief · Stochastic Gradient Descent · Adam
