Stochastic Analysis of an Adaptive Cubic Regularisation Method under Inexact Gradient Evaluations and Dynamic Hessian Accuracy
Stefania Bellavia, Gianmarco Gurioli

TL;DR
This paper extends an adaptive cubic regularisation method to stochastic nonconvex optimisation. Gradients and Hessians may be computed inexactly while function evaluations stay exact, and the expected number of iterations is proved to retain the optimal O(epsilon^(-3/2)) complexity; numerical tests indicate practical savings.
Contribution
The paper introduces a stochastic variant of an adaptive cubic regularisation method that uses inexact gradients and dynamically accurate Hessian approximations while preserving the optimal complexity bound.
Findings
The expected number of iterations to reach a first-order stationary point remains O(epsilon^(-3/2)), matching the known worst-case optimal complexity.
Inexact derivatives can reduce computational costs.
Numerical tests confirm practical efficiency gains.
Abstract
We here adapt an extended version of the adaptive cubic regularisation method with dynamic inexact Hessian information for nonconvex optimisation in [3] to the stochastic optimisation setting. While exact function evaluations are still considered, this novel variant inherits the innovative use of adaptive accuracy requirements for Hessian approximations introduced in [3] and additionally employs inexact computations of the gradient. Without restrictions on the variance of the errors, we assume that these approximations are available within a sufficiently large, but fixed, probability and we extend, in the spirit of [18], the deterministic analysis of the framework to its stochastic counterpart, showing that the expected number of iterations to reach a first-order stationary point matches the well known worst-case optimal complexity. This is, in fact, still given by O(epsilon^(-3/2)),…
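To make the mechanism in the abstract concrete, the following is a minimal one-dimensional sketch of an adaptive cubic regularisation loop with inexact derivatives and exact function values. It is not the authors' algorithm: the fixed relative error `rel_err`, the halving and doubling of `sigma`, the acceptance threshold `eta`, the function names, and the test function are all assumptions chosen for illustration, whereas the paper uses adaptive accuracy requirements and a probabilistic accuracy model.

```python
import math
import random


def cubic_step(g, h, sigma):
    """Global minimiser of the 1-D model g*s + 0.5*h*s**2 + (sigma/3)*|s|**3."""
    root = math.sqrt(h * h + 4.0 * sigma * abs(g))
    if h > 0:
        # Numerically stable form of the positive root of sigma*t^2 + h*t - |g| = 0.
        t = 2.0 * abs(g) / (h + root)
    else:
        t = (-h + root) / (2.0 * sigma)
    # Step against the sign of the (approximate) gradient.
    return -t if g > 0 else t


def arc_inexact(f, grad, hess, x0, eps=1e-6, rel_err=0.1, sigma0=1.0,
                sigma_min=1e-8, eta=0.1, max_iter=500, seed=0):
    """Toy adaptive cubic regularisation with noisy derivatives (illustrative only)."""
    rng = random.Random(seed)
    x, sigma = x0, sigma0
    for k in range(max_iter):
        # Assumption: derivatives are perturbed by a bounded relative error.
        g = grad(x) * (1.0 + rel_err * rng.uniform(-1.0, 1.0))
        h = hess(x) * (1.0 + rel_err * rng.uniform(-1.0, 1.0))
        if abs(g) <= eps:
            return x, k
        s = cubic_step(g, h, sigma)
        # Decrease predicted by the cubic model built from inexact derivatives.
        predicted = -(g * s + 0.5 * h * s * s + sigma / 3.0 * abs(s) ** 3)
        # Function values are exact, as in the setting described above.
        actual = f(x) - f(x + s)
        if actual >= eta * predicted:
            x = x + s                            # successful step: accept
            sigma = max(sigma_min, 0.5 * sigma)  # and relax the regularisation
        else:
            sigma = 2.0 * sigma                  # unsuccessful: regularise more
    return x, max_iter


def f(x):
    return 0.25 * x ** 4 - 0.5 * x ** 2  # nonconvex, minimisers at x = -1 and x = 1


def grad(x):
    return x ** 3 - x


def hess(x):
    return 3.0 * x ** 2 - 1.0


x_star, iterations = arc_inexact(f, grad, hess, x0=2.0)
print(x_star, iterations)
```

Because the stopping test uses the perturbed gradient, the true gradient at the returned point is only guaranteed to be below `eps / (1 - rel_err)` in this sketch.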
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Numerical methods in inverse problems
