HOME-3: High-Order Momentum Estimator with Third-Power Gradient for Convex and Smooth Nonconvex Optimization

Wei Zhang; Arif Hassan Zidan; Afrar Jahin; Yu Bao; Tianming Liu

arXiv:2505.11748 · cs.LG · May 20, 2025

TL;DR

This paper introduces high-order momentum using third-power gradients to enhance convergence in convex and nonconvex optimization, supported by theoretical analysis and extensive empirical validation.

Contribution

It presents the novel concept of high-order momentum with third-power gradients, demonstrating improved convergence bounds and empirical performance over traditional methods.

Findings

1. High-order momentum improves convergence bounds.
2. Empirical results show superior performance across tasks.
3. Outperforms conventional low-order momentum methods.

Abstract

Momentum-based gradients are essential for optimizing advanced machine learning models, as they not only accelerate convergence but also help optimizers escape stationary points. While most state-of-the-art momentum techniques utilize lower-order gradients, such as the squared first-order gradient, there has been limited exploration of higher-order gradients, particularly those raised to powers greater than two. In this work, we introduce the concept of high-order momentum, where momentum is constructed using higher-power gradients, with a focus on the third power of the first-order gradient as a representative case. Our research offers both theoretical and empirical support for this approach. Theoretically, we demonstrate that incorporating third-power gradients can improve the convergence bounds of gradient-based optimizers for both convex and smooth nonconvex problems.…
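
To make the idea of a momentum term built from a power of the gradient concrete, the following is a minimal sketch. It is not the HOME-3 update rule, which this page does not state. It only shows a bias-corrected exponential moving average of g**power, in the style of Adam's moment estimates. The function name `power_moment`, the decay rate `beta`, and the bias correction are all assumptions made for illustration.

```python
def power_moment(grads, power=3, beta=0.9):
    """Bias-corrected exponential moving average of g**power.

    Hypothetical helper for illustration only; the paper's actual
    estimator and how it enters the parameter update may differ.
    power=1 is a classical momentum average, power=2 resembles the
    squared-gradient average used by Adam-style optimizers, and
    power=3 is the third-power case the abstract highlights.
    """
    moment = 0.0
    estimate = 0.0
    for step, g in enumerate(grads, start=1):
        # Exponential moving average of the gradient raised to `power`.
        moment = beta * moment + (1.0 - beta) * g ** power
        # Correct the bias introduced by initializing the average at zero.
        estimate = moment / (1.0 - beta ** step)
    return estimate


if __name__ == "__main__":
    # A constant scalar gradient of 2.0 as a toy input.
    print(power_moment([2.0] * 5, power=3))
    print(power_moment([-2.0] * 5, power=3))
```

One property visible in this sketch is that an odd power preserves the sign of the gradient, whereas a squared-gradient average is always non-negative. How a signed third-power term is turned into a step size is a design decision that the truncated abstract does not describe.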

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Advanced Bandit Algorithms Research