More Optimal Fractional-Order Stochastic Gradient Descent for Non-Convex Optimization Problems
Mohammad Partohaghighi, Roummel Marcia, YangQuan Chen

TL;DR
This paper introduces 2SEDFOSGD, a novel adaptive fractional stochastic gradient method that dynamically adjusts the fractional exponent based on model sensitivity, leading to faster and more stable convergence in non-convex optimization.
Contribution
It presents a data-driven approach combining 2SED with FOSGD to adaptively tune the fractional exponent, improving stability and convergence in non-convex problems.
Findings
Faster convergence in Gaussian and $\alpha$-stable noise scenarios.
More robust parameter estimates compared to baseline methods.
Preserves fractional memory advantages without instability.
Abstract
Fractional-order stochastic gradient descent (FOSGD) leverages fractional exponents to capture long-memory effects in optimization. However, its utility is often limited by the difficulty of tuning and stabilizing these exponents. We propose 2SED Fractional-Order Stochastic Gradient Descent (2SEDFOSGD), which integrates the Two-Scale Effective Dimension (2SED) algorithm with FOSGD to adapt the fractional exponent in a data-driven manner. By tracking model sensitivity and effective dimensionality, 2SEDFOSGD dynamically modulates the exponent to mitigate oscillations and hasten convergence. Theoretically, for non-convex optimization problems, this approach preserves the advantages of fractional memory without the sluggish or unstable behavior observed in naïve fractional SGD. Empirical evaluations in Gaussian and $\alpha$-stable noise scenarios using an autoregressive (AR) model…
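To make the idea of an adaptive fractional exponent concrete, here is a minimal Python sketch. It is not the authors' implementation. The update rule is one Caputo-style form commonly seen in the fractional-gradient literature and is assumed here rather than taken from the paper. The function exponent_from_sensitivity is a hypothetical stand-in for the 2SED-driven rule, which the abstract does not spell out, and the names alpha_min, alpha_max, delta, and score are illustrative only.

```python
import math
import numpy as np


def fractional_sgd_step(theta, theta_prev, grad, lr, alpha, delta=1e-8):
    """One fractional-order step (assumed Caputo-style form, not from the paper).

    The gradient is rescaled elementwise by
    (|theta - theta_prev| + delta)^(1 - alpha) / Gamma(2 - alpha).
    For alpha = 1 the factor equals 1 and the step is plain SGD.
    """
    scale = (np.abs(theta - theta_prev) + delta) ** (1.0 - alpha)
    scale = scale / math.gamma(2.0 - alpha)
    return theta - lr * grad * scale


def exponent_from_sensitivity(score, alpha_min=0.5, alpha_max=1.0):
    """Hypothetical schedule: map a sensitivity score in [0, 1] to an exponent.

    This linear map is a placeholder for a 2SED-based rule; both the
    direction and the shape of the mapping are assumptions.
    """
    score = float(np.clip(score, 0.0, 1.0))
    return alpha_min + (alpha_max - alpha_min) * score


# Usage: one step on a toy two-parameter problem with a made-up score.
theta_prev = np.array([0.10, 0.10])
theta = np.array([0.30, -0.20])
grad = np.array([1.0, -2.0])
alpha = exponent_from_sensitivity(0.4)
theta_next = fractional_sgd_step(theta, theta_prev, grad, lr=0.1, alpha=alpha)
```

In this sketch the exponent is recomputed before every step, so the amount of memory injected through the previous iterate can shrink or grow as the sensitivity score changes. A real implementation would replace the made-up score with a quantity derived from the model, such as the effective-dimension estimate the abstract refers to.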
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques · Optimization and Variational Analysis
Methods: Stochastic Gradient Descent
