Gradient-Based Non-Linear Inverse Learning
Abhishake, Nicole Mücke, and Tapio Helin

TL;DR
This paper analyzes the effectiveness of gradient descent and stochastic gradient descent in solving nonlinear inverse problems under random design, providing convergence rates and optimal stopping criteria within the RKHS framework.
Contribution
The paper gives a theoretical analysis of GD and mini-batch SGD with constant step sizes for nonlinear inverse problems, establishing convergence rates and stopping times under classical a priori smoothness assumptions.
Findings
GD and SGD achieve minimax-optimal convergence rates
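Smoothness assumptions are expressed through the integral operator of the tangent kernel and a bound on the effective dimension
Suitably chosen stopping times yield minimax-optimal rates within the classical RKHS framework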
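For orientation, the two quantities named in the findings are usually defined as below in the kernel learning literature. This is a sketch in assumed notation, not necessarily the paper's: K is taken to be a scalar-valued tangent kernel, \rho_X the design distribution, T the associated integral operator, and \mathcal{N}(\lambda) the effective dimension. The constants C and b in the polynomial bound are likewise illustrative.

```latex
\[
  (T f)(x) = \int K(x, x')\, f(x')\, d\rho_X(x'),
  \qquad
  \mathcal{N}(\lambda) = \operatorname{tr}\!\bigl( T (T + \lambda I)^{-1} \bigr),
  \quad \lambda > 0 .
\]
% A typical "bound on the effective dimension" takes a polynomial form
% (assumed here for illustration):
\[
  \mathcal{N}(\lambda) \le C\, \lambda^{-1/b}, \qquad b > 1 .
\]
```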
Abstract
We study statistical inverse learning in the context of nonlinear inverse problems under random design. Specifically, we address a class of nonlinear problems by employing gradient descent (GD) and stochastic gradient descent (SGD) with mini-batching, both using constant step sizes. Our analysis derives convergence rates for both algorithms under classical a priori assumptions on the smoothness of the target function. These assumptions are expressed in terms of the integral operator associated with the tangent kernel, as well as through a bound on the effective dimension. Additionally, we establish stopping times that yield minimax-optimal convergence rates within the classical reproducing kernel Hilbert space (RKHS) framework. These results demonstrate the efficacy of GD and SGD in achieving optimal rates for nonlinear inverse problems in random design.
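The two algorithms in the abstract can be illustrated with a minimal sketch. It is not the paper's setting: the forward map (a pointwise tanh of a linear model), the finite-dimensional parameter vector standing in for an RKHS element, the fixed iteration count standing in for a stopping time, and all function names are assumptions chosen for illustration.

```python
import numpy as np

def forward(theta, X):
    """Toy nonlinear forward map (an assumption): tanh of a linear model."""
    return np.tanh(X @ theta)

def risk(theta, X, y):
    """Empirical risk (1/2n) * sum_i (forward(theta, x_i) - y_i)^2."""
    return 0.5 * np.mean((forward(theta, X) - y) ** 2)

def grad_risk(theta, X, y):
    """Gradient of the empirical risk with respect to theta."""
    pred = forward(theta, X)
    residual = pred - y
    # Chain rule: d tanh(u)/du = 1 - tanh(u)^2
    return X.T @ (residual * (1.0 - pred ** 2)) / len(y)

def gd(X, y, step, n_iter):
    """Full-batch gradient descent with a constant step size, stopped after n_iter steps."""
    theta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        theta = theta - step * grad_risk(theta, X, y)
    return theta

def minibatch_sgd(X, y, step, n_iter, batch_size, rng):
    """Mini-batch SGD with a constant step size, stopped after n_iter steps."""
    theta = np.zeros(X.shape[1])
    n = len(y)
    for _ in range(n_iter):
        # Draw a fresh mini-batch of indices without replacement
        idx = rng.choice(n, size=batch_size, replace=False)
        theta = theta - step * grad_risk(theta, X[idx], y[idx])
    return theta

# Random design: inputs are sampled, outputs are noisy nonlinear observations
rng = np.random.default_rng(0)
n, d = 500, 5
X = rng.normal(size=(n, d)) / np.sqrt(d)
theta_true = rng.normal(size=d)
y = forward(theta_true, X) + 0.1 * rng.normal(size=n)

theta_gd = gd(X, y, step=0.5, n_iter=200)
theta_sgd = minibatch_sgd(X, y, step=0.5, n_iter=200, batch_size=32, rng=rng)

print("risk at zero:", risk(np.zeros(d), X, y))
print("risk after GD:", risk(theta_gd, X, y))
print("risk after SGD:", risk(theta_sgd, X, y))
```

In this sketch the iteration count `n_iter` plays the role of the regularization parameter: stopping early limits how closely the iterate fits the noise. The paper's stopping times are chosen from the smoothness and effective-dimension assumptions, which this toy example does not model.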
Taxonomy
Topics: Face and Expression Recognition · Machine Learning and ELM
Methods: Stochastic Gradient Descent
