Non-Asymptotic Performance Guarantees for Neural Estimation of $\mathsf{f}$-Divergences
Sreejith Sreekumar, Zhengxin Zhang, Ziv Goldfeld

TL;DR
This paper derives non-asymptotic error bounds for neural network-based estimators of statistical distances, analyzing the tradeoff between approximation and estimation errors for the Kullback-Leibler divergence, chi-squared divergence, and squared Hellinger distance.
Contribution
The paper establishes non-asymptotic performance guarantees for neural estimators of $\mathsf{f}$-divergences, with error bounds that quantify the tradeoff between approximation error and estimation error.
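The approximation-estimation tradeoff can be made concrete with a standard error decomposition. The notation here is introduced for illustration and is not taken from the paper: $\mathcal{F}$ is the full function class of the variational form, $\mathcal{G} \subseteq \mathcal{F}$ is the NN class, $\mathcal{E}$ is the population objective, and $\hat{\mathcal{E}}_n$ is its empirical version computed from $n$ samples.

```latex
\[
\Bigl|\, \sup_{g \in \mathcal{G}} \hat{\mathcal{E}}_n(g) - \sup_{f \in \mathcal{F}} \mathcal{E}(f) \Bigr|
\;\le\;
\underbrace{\sup_{f \in \mathcal{F}} \mathcal{E}(f) - \sup_{g \in \mathcal{G}} \mathcal{E}(g)}_{\text{approximation error}}
\;+\;
\underbrace{\sup_{g \in \mathcal{G}} \bigl| \hat{\mathcal{E}}_n(g) - \mathcal{E}(g) \bigr|}_{\text{estimation error}}
\]
```

Enlarging $\mathcal{G}$ shrinks the first term but typically inflates the second, which is the tension the bounds are meant to quantify.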
Findings
Derived non-asymptotic error bounds for neural divergence estimators.
Validated theoretical results with numerical experiments.
Analyzed the impact of neural network complexity on estimation accuracy.
Abstract
Statistical distances (SDs), which quantify the dissimilarity between probability distributions, are central to machine learning and statistics. A modern method for estimating such distances from data relies on parametrizing a variational form by a neural network (NN) and optimizing it. These estimators are abundantly used in practice, but corresponding performance guarantees are partial and call for further exploration. In particular, there seems to be a fundamental tradeoff between the two sources of error involved: approximation and estimation. While the former needs the NN class to be rich and expressive, the latter relies on controlling complexity. This paper explores this tradeoff by means of non-asymptotic error bounds, focusing on three popular choices of SDs -- Kullback-Leibler divergence, chi-squared divergence, and squared Hellinger distance. Our analysis relies on…
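As a rough illustration of estimating a divergence by optimizing a parametrized variational form, the sketch below maximizes the empirical Donsker-Varadhan objective for the KL divergence, $\mathbb{E}_P[f] - \log \mathbb{E}_Q[e^{f}]$. It is not the paper's method: to stay short and deterministic it swaps the neural network for a one-parameter linear critic $f_a(x) = a x$, and the Gaussian test case, function names, step size, and sample size are all assumptions chosen for the example.

```python
import numpy as np

def dv_objective(a, xp, xq):
    """Empirical Donsker-Varadhan objective for the linear critic f(x) = a * x."""
    # mean over P samples of f, minus log of the mean over Q samples of exp(f)
    return a * xp.mean() - np.log(np.mean(np.exp(a * xq)))

def dv_gradient(a, xp, xq):
    """Derivative of the empirical objective with respect to the parameter a."""
    w = np.exp(a * xq)
    w /= w.sum()  # exponentially tilted weights on the Q samples
    return xp.mean() - np.sum(w * xq)

def estimate_kl(xp, xq, steps=200, lr=0.5):
    """Plain gradient ascent; the objective is concave in a for a linear critic."""
    a = 0.0
    for _ in range(steps):
        a += lr * dv_gradient(a, xp, xq)
    return a, dv_objective(a, xp, xq)

# Assumed toy setting: P = N(1, 1), Q = N(0, 1), for which KL(P || Q) = 0.5
# and the optimal critic is x - 1/2 (the constant does not affect the objective).
rng = np.random.default_rng(0)
n = 50_000
xp = rng.normal(1.0, 1.0, n)
xq = rng.normal(0.0, 1.0, n)

a_hat, kl_hat = estimate_kl(xp, xq)
print(f"fitted slope: {a_hat:.3f}, KL estimate: {kl_hat:.3f}")
```

In this toy case the linear class already contains the optimal critic, so there is no approximation error and the remaining gap is estimation error from finite samples. With a restricted class that misses the optimal critic, the approximation term would no longer vanish.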
Taxonomy
Topics: Model Reduction and Neural Networks · Neural Networks and Applications · Generative Adversarial Networks and Image Synthesis
