Neural Estimation of Statistical Divergences

Sreejith Sreekumar; Ziv Goldfeld

arXiv:2110.03652 · math.ST · March 30, 2022 · J. Mach. Learn. Res.

Open Access

TL;DR

This paper derives non-asymptotic error bounds for neural estimators of four $f$-divergences (Kullback-Leibler, chi-squared, squared Hellinger, and total variation) and shows that the estimators are consistent and, under certain conditions, minimax optimal.

Contribution

The paper establishes the first non-asymptotic error bounds for shallow neural estimators of multiple f-divergences, linking error to network size and sample size.

Findings

01

The bounds characterize the effective error in terms of NN size and the number of samples.

02

Neural estimators achieve minimax optimal rates for certain divergences.

03

The results ensure consistency and parametric convergence rates under suitable conditions.

Abstract

Statistical divergences (SDs), which quantify the dissimilarity between probability distributions, are a basic constituent of statistical inference and machine learning. A modern method for estimating those divergences relies on parametrizing an empirical variational form by a neural network (NN) and optimizing over parameter space. Such neural estimators are abundantly used in practice, but corresponding performance guarantees are partial and call for further exploration. We establish non-asymptotic absolute error bounds for a neural estimator realized by a shallow NN, focusing on four popular $f$-divergences: Kullback-Leibler, chi-squared, squared Hellinger, and total variation. Our analysis relies on non-asymptotic function approximation theorems and tools from empirical process theory to bound the two sources of error involved: function approximation and empirical…
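
As a rough illustration of the "empirical variational form" mentioned in the abstract, the standard variational representation of an $f$-divergence can be sketched as follows. This is the textbook form, not necessarily the exact parametrization used in the paper, and the notation ($f^*$, $\mathcal{G}_k$, $L$, $L_n$, samples $x_i \sim P$ and $y_i \sim Q$, equal sample sizes $n$, network size $k$) is assumed here for illustration only.

$$
D_f(P\|Q) \;=\; \sup_{g}\; L(g), \qquad L(g) := \mathbb{E}_P[g(X)] - \mathbb{E}_Q\big[f^*(g(Y))\big], \qquad f^*(y) := \sup_{x}\,\{xy - f(x)\}.
$$

A neural estimator replaces the expectations by sample means and restricts the supremum to a class $\mathcal{G}_k$ of shallow NNs with $k$ neurons:

$$
\hat{D}_{k,n} \;=\; \sup_{g \in \mathcal{G}_k}\; L_n(g), \qquad L_n(g) := \frac{1}{n}\sum_{i=1}^{n} g(x_i) - \frac{1}{n}\sum_{i=1}^{n} f^*\big(g(y_i)\big).
$$

Under this notation, the two sources of error named in the abstract separate via the triangle inequality and $|\sup a - \sup b| \le \sup |a - b|$:

$$
\big|\hat{D}_{k,n} - D_f(P\|Q)\big| \;\le\; \underbrace{\Big|D_f(P\|Q) - \sup_{g \in \mathcal{G}_k} L(g)\Big|}_{\text{function approximation}} \;+\; \underbrace{\sup_{g \in \mathcal{G}_k} \big|L_n(g) - L(g)\big|}_{\text{empirical estimation}}.
$$

The first term depends on how well $\mathcal{G}_k$ approximates the optimal $g$ and the second on how uniformly sample means concentrate over $\mathcal{G}_k$, which is why bounds of this kind are stated jointly in the NN size and the number of samples.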

Taxonomy

Topics: Neural Networks and Applications · Model Reduction and Neural Networks · Adversarial Robustness in Machine Learning