Stochastic Trust-Region Methods for Over-parameterized Models
Aike Yang, Hao Wang

TL;DR
This paper introduces a stochastic trust-region framework that removes manual step-size tuning and extends to constrained problems, achieving competitive convergence rates and stable optimization in deep learning tasks.
Contribution
It develops a unified stochastic trust-region method for unconstrained and constrained optimization, with theoretical convergence guarantees and practical stability improvements.
Findings
Achieves $O\left(\frac{1}{\varepsilon^2}\log(1/\varepsilon)\right)$ complexity for unconstrained problems.
Achieves $O\left(\frac{1}{\varepsilon^4}\log(1/\varepsilon)\right)$ complexity for constrained problems.
Demonstrates comparable performance to well-tuned baselines in neural network training, with stable behavior and constraint handling.
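For readers unfamiliar with the terms behind these rates, the block below states the strong growth condition in its standard textbook form, together with one common notion of an $\varepsilon$-stationary point. The constant $\rho$, the component functions $f_i$, and the exact stationarity measure are assumptions here; the paper may use a different but related formulation.

```latex
% Finite-sum objective (assumed form): f(x) = (1/n) \sum_{i=1}^{n} f_i(x).
% Strong growth condition with constant \rho \ge 1 (standard statement):
\[
  \mathbb{E}_i\!\left[\|\nabla f_i(x)\|^2\right] \;\le\; \rho\,\|\nabla f(x)\|^2
  \qquad \text{for all } x .
\]
% One common definition of an \varepsilon-stationary point (assumed):
\[
  \|\nabla f(x)\| \;\le\; \varepsilon .
\]
```

Under this condition every stochastic gradient vanishes wherever the full gradient does, which is the interpolation-type behavior typical of over-parameterized models.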
Abstract
Under interpolation-type assumptions such as the strong growth condition, stochastic optimization methods can attain convergence rates comparable to full-batch methods, but their performance, particularly for SGD, remains highly sensitive to step-size selection. To address this issue, we propose a unified stochastic trust-region framework that eliminates manual step-size tuning and extends naturally to equality-constrained problems. For unconstrained optimization, we develop a first-order stochastic trust-region algorithm and show that, under the strong growth condition, it achieves an iteration and stochastic first-order oracle complexity of $O\left(\frac{1}{\varepsilon^2}\log(1/\varepsilon)\right)$ for finding an $\varepsilon$-stationary point. For equality-constrained problems, we introduce a quadratic-penalty-based stochastic trust-region method with a penalty parameter, and establish an…
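To make the idea of a first-order stochastic trust-region step concrete, here is a minimal Python sketch. It is not the authors' algorithm: the function name `stochastic_trust_region`, the parameters `eta`, `gamma`, `delta_max`, and `batch`, the radius update rule, and the over-parameterized least-squares test problem are all illustrative assumptions. Each iteration minimizes a linear model of a mini-batch loss inside a ball of radius `delta`, compares actual and predicted decrease on that same mini-batch, and then grows or shrinks the radius, so no step size is tuned by hand.

```python
import numpy as np

def stochastic_trust_region(A, b, x0, delta0=1.0, eta=0.5, gamma=2.0,
                            delta_max=10.0, batch=4, iters=500, seed=0):
    """Illustrative sketch for minimizing 0.5 * mean((A x - b)^2).

    All parameter names and update rules are assumptions for this example,
    not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    x, delta = x0.copy(), delta0
    n = A.shape[0]
    for _ in range(iters):
        # Sample a mini-batch of rows without replacement.
        idx = rng.choice(n, size=batch, replace=False)
        Ab, bb = A[idx], b[idx]
        r = Ab @ x - bb
        g = Ab.T @ r / batch              # mini-batch gradient
        gnorm = np.linalg.norm(g)
        if gnorm == 0.0:
            continue                      # this batch is already fit exactly
        # Minimizer of the linear model g^T s over the ball ||s|| <= delta.
        s = -(delta / gnorm) * g
        f_old = 0.5 * np.mean(r ** 2)
        f_new = 0.5 * np.mean((Ab @ (x + s) - bb) ** 2)
        # Ratio of actual to predicted decrease (predicted = delta * ||g||).
        rho = (f_old - f_new) / (delta * gnorm)
        if rho >= eta:
            x, delta = x + s, min(gamma * delta, delta_max)  # accept, expand
        else:
            delta /= gamma                                   # reject, shrink
    return x

def full_loss(A, b, x):
    return 0.5 * np.mean((A @ x - b) ** 2)

# Over-parameterized, consistent system (20 equations, 50 unknowns), so an
# interpolating solution exists.
rng = np.random.default_rng(1)
A = rng.standard_normal((20, 50))
b = A @ rng.standard_normal(50)
x0 = np.zeros(50)
x_final = stochastic_trust_region(A, b, x0)
loss_start, loss_end = full_loss(A, b, x0), full_loss(A, b, x_final)
print(loss_start, loss_end)
```

The only quantity that adapts is the radius `delta`, which plays the role a step size would play in SGD. The acceptance threshold `eta=0.5` is a choice made for this toy problem; a practical method would also need the safeguards analyzed in the paper.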
