Dual Stochastic Natural Gradient Descent and convergence of interior half-space gradient approximations
Borja Sánchez-López, Jesus Cerquides

TL;DR
This paper introduces DNSGD, a new stochastic optimization algorithm for multinomial logistic regression that guarantees convergence and has linear complexity per iteration by leveraging manifold optimization and natural gradient approximations.
Contribution
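The paper proposes DNSGD, a stochastic optimization method for MLR built on manifold optimization concepts. It is proven to converge and its per-iteration complexity is linear in the number of parameters, which makes it practical for large-scale MLR models.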
Findings
DNSGD converges under specified conditions.
Computational complexity per iteration is linear in the number of parameters.
The method is suitable for large-scale multinomial logistic regression.
Abstract
The multinomial logistic regression (MLR) model is widely used in statistics and machine learning. Stochastic gradient descent (SGD) is the most common approach for determining the parameters of an MLR model in big data scenarios. However, SGD has slow sub-linear rates of convergence. One way to improve these rates is to use manifold optimization. Along this line, stochastic natural gradient descent (SNGD), proposed by Amari, was proven to be Fisher efficient when it converges. However, SNGD is not guaranteed to converge, and it is computationally too expensive for MLR models with a large number of parameters. Here, we propose a stochastic optimization method for MLR based on manifold optimization concepts which (i) has per-iteration computational complexity that is linear in the number of parameters and (ii) can be proven to converge. To achieve (i) we establish that the…
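For orientation, the plain SGD baseline that the abstract contrasts against can be sketched as follows. This is a minimal illustration only, not the paper's DNSGD and not a natural gradient method: it applies the ordinary Euclidean gradient of the log-loss on one sampled example per step. The names `softmax`, `predict_proba`, `sgd_step`, `train`, and the toy data are all hypothetical and chosen for this sketch.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_proba(weights, x):
    """Class probabilities of an MLR model; weights[k] is the weight vector of class k."""
    return softmax([sum(w_j * x_j for w_j, x_j in zip(w, x)) for w in weights])


def sgd_step(weights, x, y, lr):
    """One plain SGD step on the log-loss of a single example (x, y).

    The gradient for class k is (p_k - [k == y]) * x, so the cost of one
    step is proportional to the number of parameters.
    """
    probs = predict_proba(weights, x)
    for k, w in enumerate(weights):
        err = probs[k] - (1.0 if k == y else 0.0)
        for j, x_j in enumerate(x):
            w[j] -= lr * err * x_j


def train(data, n_classes, n_features, epochs=200, lr=0.5, seed=0):
    """Fit MLR weights by sampling one example per step (hypothetical toy driver)."""
    rng = random.Random(seed)
    weights = [[0.0] * n_features for _ in range(n_classes)]
    for _ in range(epochs * len(data)):
        x, y = rng.choice(data)
        sgd_step(weights, x, y, lr)
    return weights


# Toy data (an assumption for illustration): the last feature is a constant bias term.
toy_data = [
    ([0.0, 0.0, 1.0], 0),
    ([0.1, 0.2, 1.0], 0),
    ([1.0, 0.0, 1.0], 1),
    ([0.9, 0.1, 1.0], 1),
    ([0.0, 1.0, 1.0], 2),
    ([0.1, 0.9, 1.0], 2),
]
toy_weights = train(toy_data, n_classes=3, n_features=3)
```

Each step touches every weight once, which is the linear per-iteration cost that the proposed method is said to match. A natural gradient variant would additionally rescale this gradient using the Fisher information, which is the expensive part the abstract refers to.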
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Statistical Methods and Inference · Face and Expression Recognition
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Natural Gradient Descent
