Implicit MLE: Backpropagating Through Discrete Exponential Family Distributions
Mathias Niepert, Pasquale Minervini, Luca Franceschi

TL;DR
This paper introduces Implicit MLE, a versatile framework for end-to-end learning with discrete exponential family distributions and neural networks, avoiding the need for smooth relaxations and enabling differentiation through combinatorial solvers.
Contribution
The paper presents I-MLE, an approach that unifies several differentiation methods for discrete distributions and combinatorial problems. It also introduces a new class of noise distributions for approximating marginals via perturb-and-MAP.
Findings
I-MLE often outperforms existing relaxation-based methods.
It simplifies to maximum likelihood in certain combinatorial learning settings.
The framework is broadly applicable to models combining discrete distributions and neural components.
Abstract
Combining discrete probability distributions and combinatorial optimization problems with neural network components has numerous applications but poses several challenges. We propose Implicit Maximum Likelihood Estimation (I-MLE), a framework for end-to-end learning of models combining discrete exponential family distributions and differentiable neural components. I-MLE is widely applicable as it only requires the ability to compute the most probable states and does not rely on smooth relaxations. The framework encompasses several approaches such as perturbation-based implicit differentiation and recent methods to differentiate through black-box combinatorial solvers. We introduce a novel class of noise distributions for approximating marginals via perturb-and-MAP. Moreover, we show that I-MLE simplifies to maximum likelihood estimation when used in some recently studied learning…
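The two ingredients the abstract names, a MAP oracle and perturb-and-MAP, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper's code: the k-subset distribution, the helper names (`map_top_k`, `perturb_and_map_marginals`, `map_difference_grad`), the use of standard Gumbel noise (which is not the new noise class the paper introduces), and the particular form of the MAP-difference gradient estimate, which is written in the spirit of the black-box solver differentiation the abstract mentions.

```python
import numpy as np


def map_top_k(theta, k):
    """MAP state of a k-subset distribution with scores theta.

    Returns a 0/1 indicator vector marking the k largest scores.
    Assumes 1 <= k <= len(theta).
    """
    z = np.zeros_like(theta, dtype=float)
    z[np.argsort(theta)[-k:]] = 1.0
    return z


def perturb_and_map_marginals(theta, k, n_samples=1000, seed=0):
    """Approximate marginals by averaging MAP states of perturbed scores.

    Standard Gumbel noise is an illustrative assumption.
    """
    rng = np.random.default_rng(seed)
    total = np.zeros_like(theta, dtype=float)
    for _ in range(n_samples):
        eps = rng.gumbel(size=theta.shape)
        total += map_top_k(theta + eps, k)
    return total / n_samples


def map_difference_grad(theta, grad_z, k, lam=1.0, eps=None):
    """Hypothetical MAP-difference gradient estimate with respect to theta.

    grad_z is the gradient of the downstream loss with respect to the
    discrete state z. The scores are moved against grad_z by a step lam,
    and the estimate is the difference between the two MAP states.
    """
    if eps is None:
        eps = np.zeros_like(theta)  # no perturbation by default
    z = map_top_k(theta + eps, k)
    z_target = map_top_k(theta - lam * grad_z + eps, k)
    return (z - z_target) / lam


theta = np.array([2.0, 1.0, 0.0, -1.0])
marginals = perturb_and_map_marginals(theta, k=2, n_samples=500)

# The loss here would decrease if the last item were selected.
grad = map_difference_grad(theta, np.array([0.0, 0.0, 0.0, -5.0]), k=2)
```

Both functions call only `map_top_k`, which reflects the abstract's point that the most probable state is the only thing that must be computable. For k = 1 with Gumbel noise, the perturbed argmax is an exact sample (the Gumbel-max trick); for larger k the averaged states are only an approximation of the true marginals.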
Taxonomy
Topics: Machine Learning and Algorithms · Gaussian Processes and Bayesian Inference · Neural Networks and Applications
