# A Sketch-and-Project Analysis of Subsampled Natural Gradient Algorithms

**Authors:** Gil Goldshlager, Jiang Hu, and Lin Lin

arXiv: 2508.21022 · 2026-02-06

## TL;DR

This paper analyzes subsampled natural gradient (SNG) algorithms as sketch-and-project methods, yielding convergence guarantees for single-mini-batch updates and an explanation of their advantage over stochastic gradient descent (SGD) in small-sample settings.

## Contribution

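The paper replaces the usual theoretical proxy, which decouples gradients and preconditioners using two independent mini-batches, with a new proxy based on squared volume sampling. This yields global convergence guarantees for SNG with a single mini-batch and a clearer account of SNG's advantage over SGD.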

## Key findings

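- Under the new proxy, the expectation of the SNG direction equals a preconditioned gradient descent step, even when the gradient and preconditioner are coupled.
- Global convergence is guaranteed when using a single mini-batch of any size.
- The convergence rate is characterized explicitly in terms of quantities tied to the sketch-and-project structure.
- The analysis suggests that SNG exploits spectral decay in the model Jacobian more effectively than SGD.
- The structured momentum scheme SPRING arises naturally from accelerated sketch-and-project methods.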
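
The spectral-decay finding can be probed with a small numerical experiment. The following is a minimal sketch, not the paper's setup: it assumes a linear least-squares model whose Jacobian `J` has singular values decaying like 1/i, uses uniform mini-batch sampling, and picks a conservative SGD step size and a small damping `lam` purely for numerical safety. All sizes and names (`sgd_step`, `sng_step`, `lr`, `lam`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k, steps = 200, 50, 10, 500   # samples, parameters, batch size, iterations (hypothetical)

# Synthetic Jacobian with singular values 1, 1/2, ..., 1/d (assumed spectral decay).
U, _ = np.linalg.qr(rng.standard_normal((n, d)))   # n x d, orthonormal columns
V, _ = np.linalg.qr(rng.standard_normal((d, d)))   # d x d, orthogonal
sigma = 1.0 / np.arange(1, d + 1)
J = (U * sigma) @ V.T
theta_star = rng.standard_normal(d)
y = J @ theta_star                                  # consistent targets

def sgd_step(theta, S, lr):
    """Mini-batch gradient step on the least-squares loss."""
    r_S = J[S] @ theta - y[S]
    return theta - lr * J[S].T @ r_S / len(S)

def sng_step(theta, S, lam=1e-8):
    """Damped subsampled natural gradient step, written in k x k kernel form."""
    J_S = J[S]
    r_S = J_S @ theta - y[S]
    K = J_S @ J_S.T + lam * np.eye(len(S))
    return theta - J_S.T @ np.linalg.solve(K, r_S)

# Step size 1 / (largest squared row norm): an assumption chosen so that
# every SGD step is non-expansive toward theta_star.
lr = 1.0 / np.max(np.sum(J**2, axis=1))

theta_sgd = np.zeros(d)
theta_sng = np.zeros(d)
for _ in range(steps):
    S = rng.choice(n, size=k, replace=False)        # same mini-batch for both methods
    theta_sgd = sgd_step(theta_sgd, S, lr)
    theta_sng = sng_step(theta_sng, S)

err0 = np.linalg.norm(theta_star)                   # initial error (theta = 0)
err_sgd = np.linalg.norm(theta_sgd - theta_star)
err_sng = np.linalg.norm(theta_sng - theta_star)
print(f"initial {err0:.3e}  SGD {err_sgd:.3e}  SNG {err_sng:.3e}")
```

Both updates are non-expansive toward `theta_star` in this consistent linear setting, so the printed errors can be compared directly. Changing the decay profile of `sigma` is one way to explore how the gap between the two methods depends on the spectrum.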

## Abstract

Subsampled natural gradient descent (SNG) has been used to enable high-precision scientific machine learning, but standard analyses based on stochastic preconditioning fail to provide insight into realistic small-sample settings. We overcome this limitation by instead analyzing SNG as a sketch-and-project method. Motivated by this lens, we discard the usual theoretical proxy which decouples gradients and preconditioners using two independent mini-batches, and we replace it with a new proxy based on squared volume sampling. Under this new proxy we show that the expectation of the SNG direction becomes equal to a preconditioned gradient descent step even in the presence of coupling, leading to (i) global convergence guarantees when using a single mini-batch of any size, and (ii) an explicit characterization of the convergence rate in terms of quantities related to the sketch-and-project structure. These findings in turn yield new insights into small-sample settings, for example by suggesting that the advantage of SNG over SGD is that it can more effectively exploit spectral decay in the model Jacobian. We also extend these ideas to explain a popular structured momentum scheme for SNG, known as SPRING, by showing that it arises naturally from accelerated sketch-and-project methods.
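
To make the sketch-and-project reading concrete, the snippet below is a minimal illustration under strong simplifying assumptions: a linear model with a consistent least-squares problem, no damping, unit step size, and a uniformly sampled mini-batch (not the squared volume sampling proxy from the paper). In that special case a single SNG step coincides with a block Kaczmarz projection onto the solution set of the sampled equations. The names `sng_step`, `lam`, and `eta` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 100, 20, 5                  # samples, parameters, mini-batch size (hypothetical)

J = rng.standard_normal((n, d))       # Jacobian of a linear model f(x_i; theta) = J[i] @ theta
theta_star = rng.standard_normal(d)
y = J @ theta_star                    # consistent targets

def sng_step(theta, S, lam=0.0, eta=1.0):
    """SNG step in kernel form: theta - eta * J_S^T (J_S J_S^T + lam I)^{-1} r_S."""
    J_S = J[S]
    r_S = J_S @ theta - y[S]                       # residual on the mini-batch
    K = J_S @ J_S.T + lam * np.eye(len(S))         # k x k kernel matrix
    return theta - eta * J_S.T @ np.linalg.solve(K, r_S)

theta0 = np.zeros(d)
S = rng.choice(n, size=k, replace=False)
theta1 = sng_step(theta0, S)

# The same update written as a pseudoinverse projection (block Kaczmarz).
theta1_pinv = theta0 - np.linalg.pinv(J[S]) @ (J[S] @ theta0 - y[S])

batch_residual = np.linalg.norm(J[S] @ theta1 - y[S])   # sampled equations are now satisfied
err0 = np.linalg.norm(theta0 - theta_star)
err1 = np.linalg.norm(theta1 - theta_star)              # a projection cannot increase this
print(batch_residual, err0, err1)
```

With `lam > 0` or `eta < 1` the step no longer solves the sampled equations exactly and becomes a damped or relaxed version of the same projection.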

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2508.21022/full.md

## Figures

13 figures with captions in the complete paper: https://tomesphere.com/paper/2508.21022/full.md

## References

62 references, with the full list in the complete paper: https://tomesphere.com/paper/2508.21022/full.md

---
Source: https://tomesphere.com/paper/2508.21022