Learning Energy Networks with Generalized Fenchel-Young Losses
Mathieu Blondel, Felipe Llinares-López, Robert Dadashi, Léonard Hussenot, Matthieu Geist

TL;DR
This paper introduces generalized Fenchel-Young losses for energy networks, enabling efficient training without argmin/argmax differentiation and demonstrating effectiveness in multilabel classification and imitation learning.
Contribution
It proposes a novel class of losses based on generalized conjugate functions, improving training efficiency for energy-based models.
Findings
The proposed losses enjoy many desirable properties.
Their gradients can be computed efficiently, without argmin/argmax differentiation.
The losses are effective on multilabel classification and imitation learning tasks.
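To make the gradient claim concrete, here is a minimal NumPy sketch, not the authors' implementation. It assumes the generalized conjugate takes the form Omega^Phi(v) = max_p Phi(v, p) - Omega(p) and that the loss is L(v, p) = Omega^Phi(v) + Omega(p) - Phi(v, p). The toy energy Phi(v, p) = <tanh(v), p>, the regularizer Omega(p) = 0.5 * ||p||^2, and all function names are hypothetical choices made for illustration. By the envelope (Danskin) theorem, the gradient in v only needs the inner maximizer p*, not its derivative.

```python
import numpy as np

def energy(v, p):
    # Hypothetical energy, nonlinear in v: Phi(v, p) = <tanh(v), p>.
    return np.dot(np.tanh(v), p)

def energy_grad_v(v, p):
    # Gradient of Phi with respect to its first argument v.
    return (1.0 - np.tanh(v) ** 2) * p

def energy_grad_p(v, p):
    # Gradient of Phi with respect to its second argument p.
    return np.tanh(v)

def omega(p):
    # Assumed regularizer: Omega(p) = 0.5 * ||p||^2.
    return 0.5 * np.dot(p, p)

def solve_inner_max(v, n_steps=200, step=0.5):
    # Gradient ascent on p -> Phi(v, p) - Omega(p); grad of Omega is p.
    p = np.zeros_like(v)
    for _ in range(n_steps):
        p = p + step * (energy_grad_p(v, p) - p)
    return p

def gfy_loss(v, p_true):
    # Generalized conjugate evaluated at v, plus Omega(p_true) - Phi(v, p_true).
    p_star = solve_inner_max(v)
    conjugate = energy(v, p_star) - omega(p_star)
    return conjugate + omega(p_true) - energy(v, p_true)

def gfy_loss_grad(v, p_true):
    # Envelope theorem: p_star is treated as a constant, so no
    # differentiation through the inner argmax is required.
    p_star = solve_inner_max(v)
    return energy_grad_v(v, p_star) - energy_grad_v(v, p_true)

v = np.array([0.3, -1.2, 0.8])
p_true = np.array([1.0, 0.0, 0.5])
loss = gfy_loss(v, p_true)
grad = gfy_loss_grad(v, p_true)
```

For this particular toy energy the loss reduces to 0.5 * ||tanh(v) - p_true||^2, which gives a closed form to check the sketch against. With a neural-network energy, the inner maximization would typically be solved only approximately.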
Abstract
Energy-based models, a.k.a. energy networks, perform inference by optimizing an energy function, typically parametrized by a neural network. This allows one to capture potentially complex relationships between inputs and outputs. To learn the parameters of the energy function, the solution to that optimization problem is typically fed into a loss function. The key challenge for training energy networks lies in computing loss gradients, as this typically requires argmin/argmax differentiation. In this paper, building upon a generalized notion of conjugate function, which replaces the usual bilinear pairing with a general energy function, we propose generalized Fenchel-Young losses, a natural loss construction for learning energy networks. Our losses enjoy many desirable properties and their gradients can be computed efficiently without argmin/argmax differentiation. We also prove the…
Taxonomy
Topics: Machine Learning in Materials Science · Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning
