Zeroth-order Stochastic Cubic Newton Method Revisited
Yu Liu, Weibin Peng, Tianyu Wang, Jiajia Yu

TL;DR
This paper introduces a stochastic zeroth-order cubic Newton method that exploits low-rank Hessian structure for efficient optimization. The method achieves improved sample complexity and is validated experimentally.
Contribution
It proposes a new Hessian estimator that leverages low-rank structures without incoherence assumptions, improving sample complexity in zeroth-order stochastic optimization.
Findings
Achieves second-order stationarity with fewer function evaluations.
Outperforms existing methods in high-dimensional settings.
Validated on matrix recovery and machine learning tasks.
Abstract
This paper studies stochastic minimization of a finite-sum loss. In many real-world scenarios, the Hessian matrix of such objectives exhibits a low-rank structure on a batch of data. At the same time, zeroth-order optimization has gained prominence in important applications such as fine-tuning large language models. Drawing on these observations, we propose a novel stochastic zeroth-order cubic Newton method that leverages the low-rank Hessian structure via a matrix recovery-based estimation technique. Our method circumvents restrictive incoherence assumptions, enabling accurate Hessian approximation through finite-difference queries. Theoretically, we establish that for most real-world problems in $\mathbb{R}^n$, $\mathcal{O}\left(\frac{n}{\eta^{\frac{7}{2}}}\right)+\widetilde{\mathcal{O}}\left(\frac{n^2…
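To make "Hessian approximation through finite-difference queries" concrete, here is a minimal Python sketch of the basic building block: estimating a bilinear form $u^\top \nabla^2 f(x)\, v$ from four function evaluations. This is not the paper's estimator. The function names (`fd_bilinear`, `fd_hessian_dense`), the step size `delta`, and the toy quadratic objective are all assumptions chosen for illustration, and the dense reconstruction is a naive baseline that ignores low-rank structure.

```python
import numpy as np


def fd_bilinear(f, x, u, v, delta=1e-2):
    """Estimate u^T H(x) v using four function evaluations.

    Second-order central difference; exact (up to rounding) when f is quadratic.
    """
    return (
        f(x + delta * u + delta * v)
        - f(x + delta * u - delta * v)
        - f(x - delta * u + delta * v)
        + f(x - delta * u - delta * v)
    ) / (4.0 * delta**2)


def fd_hessian_dense(f, x, delta=1e-2):
    """Naive baseline: probe every pair of coordinate directions.

    Uses 4 * n * (n + 1) / 2 function evaluations and does not exploit low rank.
    """
    n = x.size
    eye = np.eye(n)
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(i, n):
            H[i, j] = H[j, i] = fd_bilinear(f, x, eye[i], eye[j], delta)
    return H


# Toy objective (hypothetical): a quadratic whose Hessian A has rank r.
rng = np.random.default_rng(0)
n, r = 6, 2
B = rng.standard_normal((n, r))
A = B @ B.T


def f(x):
    return 0.5 * x @ A @ x


x0 = rng.standard_normal(n)
H_est = fd_hessian_dense(f, x0)
max_err = np.max(np.abs(H_est - A))
est_rank = np.linalg.matrix_rank(H_est, tol=1e-6)
```

The dense baseline needs on the order of $n^2$ function evaluations per Hessian. According to the abstract, the paper's matrix recovery-based estimator is designed to exploit low rank instead; the sketch above only shows the kind of finite-difference measurement such a technique could be built on.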
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Advanced Image Processing Techniques · Advanced Optimization Algorithms Research
