Adaptive Symmetrization of the KL Divergence

Omri Ben-Dov; Luiz F.O. Chamon

arXiv:2511.11159·cs.LG·May 12, 2026

TL;DR

This paper introduces a non-adversarial method for minimizing the symmetric Jeffreys divergence by jointly fitting a main model and a proxy model. The reported benefits are more stable training than GANs and more accurate density estimation.

Contribution

The paper proposes a practical algorithm that adaptively minimizes the Jeffreys divergence without adversarial training, using a proxy model to make the reverse KL term tractable.

Findings

01. More stable training compared to GANs

02. Higher accuracy in density estimation tasks

03. Effective in low-data regimes

Abstract

The forward Kullback-Leibler (KL) divergence is a ubiquitous objective for fitting a parameterized distribution to samples due to its tractability and equivalence to maximum likelihood estimation (MLE). Its inherent asymmetry, however, may lead to degenerate solutions that generalize poorly. While the symmetric Jeffreys divergence offers a more balanced alternative, its optimization is challenging due to the presence of a reverse KL term. Generative adversarial networks (GANs) bypass this intractability using a min-max formulation at the cost of introducing new instability issues. This work proposes a non-adversarial approach to minimize the Jeffreys divergence. To do so, it uses a proxy model to tractably approximate the reverse KL divergence of the main model. The main and proxy models are jointly fitted to the data using a constrained optimization formulation to obtain a practical…
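
To make the asymmetry mentioned in the abstract concrete, the following is a minimal Python sketch that computes the forward KL, the reverse KL, and their sum (the Jeffreys divergence, up to a constant factor that some authors include) for two small discrete distributions. The distributions `p` and `q` and the helper names `kl` and `jeffreys` are illustrative assumptions, not taken from the paper, and this is not the paper's algorithm. With real data the reverse term cannot be evaluated this way, because the data density is unknown and only samples are available.

```python
import math


def kl(p, q):
    """KL(p || q) for discrete distributions given as equal-length lists.

    Assumes q[i] > 0 wherever p[i] > 0; terms with p[i] == 0 contribute 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jeffreys(p, q):
    """Symmetric Jeffreys divergence: forward KL plus reverse KL."""
    return kl(p, q) + kl(q, p)


# Hypothetical toy distributions over three outcomes (illustration only).
p = [0.5, 0.4, 0.1]  # stands in for the data distribution
q = [0.8, 0.1, 0.1]  # stands in for the model distribution

forward = kl(p, q)  # the term minimized by maximum likelihood
reverse = kl(q, p)  # the term that is hard to estimate from samples of p
symmetric = jeffreys(p, q)

print(f"forward KL(p||q) = {forward:.4f}")
print(f"reverse KL(q||p) = {reverse:.4f}")
print(f"Jeffreys         = {symmetric:.4f}")
```

Swapping the arguments changes each KL term but leaves `jeffreys` unchanged, which is the sense in which the Jeffreys divergence is a more balanced objective than either KL term alone.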
