Convergence guarantees for forward gradient descent in the linear regression model
Thijs Bos, Johannes Schmidt-Hieber

TL;DR
This paper provides theoretical convergence guarantees for a biologically inspired forward gradient descent method in linear regression, showing that its mean squared error converges at rate d^2 log(d)/k, where d is the number of parameters and k the number of samples.
Contribution
It gives a convergence analysis of the weight-perturbed forward gradient scheme in linear regression with random design, making explicit how the rate depends on the number of parameters d and the number of samples k.
Findings
Convergence is guaranteed once the number of samples k is at least of order d^2 log(d).
The mean squared error decreases at a rate of d^2 log(d)/k.
The method's dimension dependence includes an additional d log(d) factor compared to stochastic gradient descent.
Abstract
Renewed interest in the relationship between artificial and biological neural networks motivates the study of gradient-free methods. Considering the linear regression model with random design, we theoretically analyze in this work the biologically motivated (weight-perturbed) forward gradient scheme that is based on random linear combination of the gradient. If d denotes the number of parameters and k the number of samples, we prove that the mean squared error of this method converges for k ≳ d^2 log(d) with rate d^2 log(d)/k. Compared to the dimension dependence d for stochastic gradient descent, an additional factor d log(d) occurs.
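The scheme described in the abstract can be illustrated with a minimal sketch. It assumes a standard Gaussian design, a constant step size, and a squared loss 0.5 * (Y_i - X_i^T theta)^2 per sample; the dimension, sample count, noise level, and step size below are illustrative choices, not values taken from the paper. Each step replaces the gradient by its projection onto a random Gaussian direction xi, which is an unbiased estimate of the gradient.

```python
import numpy as np


def forward_gradient_descent(X, Y, step_size, rng):
    """Single pass of weight-perturbed forward gradient descent.

    For sample i, the exact gradient g of 0.5 * (Y_i - X_i^T theta)^2
    is replaced by (g^T xi) * xi with xi ~ N(0, I_d), so that the
    update direction equals g in expectation over xi.
    """
    k, d = X.shape
    theta = np.zeros(d)
    for i in range(k):
        xi = rng.standard_normal(d)        # random perturbation direction
        residual = Y[i] - X[i] @ theta
        grad = -residual * X[i]            # gradient of the per-sample loss
        theta = theta - step_size * (grad @ xi) * xi
    return theta


# Illustrative setup (all values are assumptions for this sketch).
rng = np.random.default_rng(0)
d, k = 5, 5000
theta_star = np.ones(d)                    # hypothetical true parameter
X = rng.standard_normal((k, d))            # random design
Y = X @ theta_star + 0.1 * rng.standard_normal(k)

# Constant step size of order 1/d^2, chosen here for stability only.
theta_hat = forward_gradient_descent(X, Y, step_size=0.01, rng=rng)
squared_error = float(np.sum((theta_hat - theta_star) ** 2))
print(f"squared error after {k} samples: {squared_error:.4f}")
```

The step size is kept of order 1/d^2 because the random projection inflates the second moment of the update by a factor of roughly d compared with plain stochastic gradient descent; this is only a heuristic for the sketch and not the step-size schedule analyzed in the paper.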
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Neural Networks and Applications · Theoretical and Computational Physics
Methods: Forward gradient · Linear Regression
