Bethe Learning of Conditional Random Fields via MAP Decoding
Kui Tang, Nicholas Ruozzi, David Belanger, Tony Jebara

TL;DR
This paper introduces MLE-Struct, an efficient algorithm for learning probabilistic structured models under the Bethe approximation. It connects maximum likelihood estimation to MAP decoding and reports superior performance on vision and assignment tasks.
Contribution
The paper presents a single-loop algorithm that combines the Bethe approximation with Frank-Wolfe optimization for efficient maximum likelihood estimation in structured models.
Findings
Outperforms existing methods in image segmentation tasks.
Efficiently handles complex structured prediction problems.
Demonstrates effectiveness on real-world datasets including roommate assignments.
Abstract
Many machine learning tasks can be formulated in terms of predicting structured outputs. In frameworks such as the structured support vector machine (SVM-Struct) and the structured perceptron, discriminative functions are learned by iteratively applying efficient maximum a posteriori (MAP) decoding. However, maximum likelihood estimation (MLE) of probabilistic models over these same structured spaces requires computing partition functions, which is generally intractable. This paper presents a method for learning discrete exponential family models using the Bethe approximation to the MLE. Remarkably, this problem also reduces to iterative MAP decoding. This connection emerges by combining the Bethe approximation with a Frank-Wolfe (FW) algorithm on a convex dual objective, which circumvents the intractable partition function. The result is a new single-loop algorithm, MLE-Struct, which…
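The role MAP decoding plays inside a Frank-Wolfe iteration can be illustrated with a toy sketch. This is not the paper's MLE-Struct algorithm or its Bethe dual objective. It assumes independent categorical variables, so the feasible set is a product of simplices and MAP decoding is a per-variable argmax. It also uses a simple quadratic moment-matching objective as a stand-in. The names `map_decode`, `frank_wolfe`, and `target` are hypothetical and chosen only for illustration.

```python
import numpy as np


def map_decode(scores):
    """Toy MAP oracle (assumption: variables are independent).

    Picks the highest-scoring label for each row and returns the
    corresponding one-hot vertex of the product of simplices.
    """
    vertex = np.zeros_like(scores, dtype=float)
    vertex[np.arange(scores.shape[0]), scores.argmax(axis=1)] = 1.0
    return vertex


def frank_wolfe(grad, mu0, n_iters=500):
    """Generic Frank-Wolfe loop whose linear step is a MAP call.

    Minimizing <g, s> over vertices s is the same as MAP decoding
    with scores -g, so no projection or partition function is needed.
    """
    mu = mu0.astype(float).copy()
    for t in range(n_iters):
        g = grad(mu)
        s = map_decode(-g)            # linear minimization oracle
        gamma = 2.0 / (t + 2.0)       # standard diminishing step size
        mu = (1.0 - gamma) * mu + gamma * s
    return mu


# Stand-in objective: 0.5 * ||mu - target||^2 (gradient is mu - target),
# where `target` plays the role of empirical marginals.
target = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.1, 0.8]])
mu0 = np.full(target.shape, 1.0 / 3.0)
mu = frank_wolfe(lambda m: m - target, mu0)
```

Each iterate is a convex combination of decoded vertices, so it stays feasible by construction. In a structured model the per-row argmax would be replaced by a combinatorial MAP solver for the model at hand.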
Taxonomy
Topics: Machine Learning and Algorithms · Domain Adaptation and Few-Shot Learning · Neural Networks and Applications
