Decentralized Riemannian natural gradient methods with Kronecker-product approximations
Jiang Hu, Kangkang Deng, Na Li, Quanzheng Li

TL;DR
This paper introduces a decentralized Riemannian natural gradient method that approximates the Riemannian Fisher information matrix by a Kronecker product of two low-dimensional matrices, so that nodes exchange only the small factors and optimization on manifolds scales to large problems.
Contribution
It proposes the first Riemannian second-order method for decentralized manifold optimization, using Kronecker-product approximations to keep communication cheap while retaining a convergence guarantee.
Findings
Converges to a stationary point at rate O(1/K)
Outperforms state-of-the-art methods in numerical experiments
Efficiently approximates the Riemannian Fisher information matrix (RFIM) using low-dimensional Kronecker factors
Abstract
With a computationally efficient approximation of the second-order information, natural gradient methods have been successful in solving large-scale structured optimization problems. We study natural gradient methods for large-scale decentralized optimization problems on Riemannian manifolds, where the local objective function defined by the local dataset is of a log-probability type. By utilizing the structure of the Riemannian Fisher information matrix (RFIM), we present an efficient decentralized Riemannian natural gradient descent (DRNGD) method. To overcome the communication issue of the high-dimensional RFIM, we consider a class of structured problems for which the RFIM can be approximated by a Kronecker product of two low-dimensional matrices. By performing the communications over the Kronecker factors, a high-quality approximation of the RFIM can be obtained at a low cost.…
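The communication saving described in the abstract can be illustrated with a small NumPy sketch. It is not the authors' DRNGD algorithm: it works in flat Euclidean space (no tangent-space projection or retraction), uses random symmetric positive definite matrices as stand-ins for the Kronecker factors, and the names kron_solve, gossip_round, and the mixing matrix W are hypothetical choices made for illustration. It shows only two facts: nodes can average the small factors instead of the full matrix, and a system with matrix A ⊗ B can be solved through the factors without ever forming the Kronecker product.

```python
import numpy as np

def random_spd(dim, rng):
    """Random symmetric positive definite matrix (stand-in for a Kronecker factor)."""
    M = rng.standard_normal((dim, dim))
    return M @ M.T + dim * np.eye(dim)

def kron_solve(A, B, G):
    """Return X with (A kron B) vec(X) = vec(G), using column-stacking vec.

    Uses the identity (A kron B) vec(X) = vec(B X A^T), so X = B^{-1} G A^{-T}.
    Only an n x n and an m x m system are solved, never the (mn) x (mn) one.
    """
    Y = np.linalg.solve(B, G)            # B^{-1} G
    return np.linalg.solve(A, Y.T).T     # Y A^{-T}

def gossip_round(W, mats):
    """One mixing step: node i receives sum_j W[i, j] * mats[j]."""
    stacked = np.stack(mats)             # shape (nodes, d, d)
    return list(np.einsum("ij,jkl->ikl", W, stacked))

rng = np.random.default_rng(0)
m, n, num_nodes = 4, 3, 3                # the parameter is an m x n matrix

# Each node holds its own local factors (assumed given in this sketch).
A_local = [random_spd(n, rng) for _ in range(num_nodes)]
B_local = [random_spd(m, rng) for _ in range(num_nodes)]

# Hypothetical doubly stochastic mixing matrix for a 3-node network.
W = np.array([[0.50, 0.25, 0.25],
              [0.25, 0.50, 0.25],
              [0.25, 0.25, 0.50]])

# Nodes communicate the small factors, not the full matrix.
A_mixed = gossip_round(W, A_local)
B_mixed = gossip_round(W, B_local)

# Node 0 preconditions its local gradient G with the mixed factors.
G = rng.standard_normal((m, n))
direction = kron_solve(A_mixed[0], B_mixed[0], G)

# Reference: form the dense Kronecker product and solve directly.
dense = np.linalg.solve(
    np.kron(A_mixed[0], B_mixed[0]), G.flatten(order="F")
).reshape((m, n), order="F")

floats_factors = n * n + m * m           # numbers sent per node using factors
floats_dense = (m * n) ** 2              # numbers sent per node using the full matrix
print(np.allclose(direction, dense), floats_factors, floats_dense)
```

Even at this toy size the factors take 25 numbers per message against 144 for the dense matrix, and the gap grows quadratically with the parameter dimensions.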
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Face and Expression Recognition · Sparse and Compressive Sensing Techniques
Methods: Natural Gradient Descent
