History-Aware Adaptive High-Order Tensor Regularization
Chang He, Bo Jiang, Yuntian Jiang, Chuwen Zhang, Shuzhong Zhang

TL;DR
This paper introduces a history-aware adaptive regularization method for optimizing composite functions, achieving optimal complexity guarantees without prior knowledge of Lipschitz constants, and demonstrates its effectiveness through numerical experiments.
Contribution
The paper proposes a novel adaptive regularization technique that uses historical Lipschitz estimates, matching standard tensor method complexities without requiring known Lipschitz constants.
Findings
Method matches complexity guarantees of standard tensor methods.
Achieves $\tilde{O}(\text{accuracy}^{-1/p})$ iteration complexity for convex functions.
Numerical experiments confirm the effectiveness of the adaptive approach.
Abstract
In this paper, we develop a new adaptive regularization method for minimizing a composite function, which is the sum of a $p$th-order ($p \ge 1$) Lipschitz continuous function and a simple, convex, and possibly nonsmooth function. We use a history of local Lipschitz estimates to adaptively select the current regularization parameter, an approach we shall term the history-aware adaptive regularization method. We explore how the selection of an appropriate volume of historical information affects both the theoretical and practical performance. By using all the historical information, our method matches the complexity guarantees of the standard $p$th-order tensor methods that require a known Lipschitz constant, for both convex and nonconvex objectives. In the nonconvex case, the number of iterations required to find an -approximate second-order stationary…
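To make the idea of selecting the regularization parameter from a history of local Lipschitz estimates concrete, here is a minimal first-order sketch (the case p = 1, with no nonsmooth term). It is an illustration under stated assumptions, not the authors' algorithm: the function name `history_aware_gd`, the initial guess `sigma0`, and the secant-style local estimate are all hypothetical choices, and the regularization parameter is simply taken as the maximum over the full history.

```python
import numpy as np

def history_aware_gd(grad, x0, sigma0=1.0, iters=200):
    """Hypothetical p = 1 sketch: regularized gradient steps where sigma
    is the largest local Lipschitz estimate observed so far."""
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    history = [sigma0]  # assumed initial guess, not taken from the paper
    for _ in range(iters):
        sigma = max(history)  # use all historical estimates
        # Minimizer of the linear model plus (sigma / 2) * ||d||^2.
        x_new = x - g / sigma
        g_new = grad(x_new)
        step = np.linalg.norm(x_new - x)
        if step == 0.0:  # no movement: gradient is zero, stop
            break
        # Secant-style local estimate of the gradient's Lipschitz constant.
        history.append(np.linalg.norm(g_new - g) / step)
        x, g = x_new, g_new
    return x, history

# Usage on a simple quadratic f(x) = 0.5 * sum(a_i * x_i^2).
a = np.array([1.0, 10.0])
x_final, estimates = history_aware_gd(lambda v: a * v, [1.0, 1.0])
```

On this quadratic the local estimates can never exceed the true Lipschitz constant of the gradient (the largest entry of `a`), so the regularization parameter grows only as far as the observed curvature requires. Using a shorter window of the history instead of the full maximum would be a different design choice with different guarantees.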
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques · Tensor decomposition and applications
