Piecewise Training for Undirected Models

Charles Sutton; Andrew McCallum

arXiv:1207.1409·cs.LG·July 9, 2012·146 cites

TL;DR

This paper introduces a piecewise training method for large undirected models, providing a scalable alternative to intractable maximum likelihood training, with theoretical justification and competitive empirical results.

Contribution

It justifies piecewise training, in which a local undirected classifier is trained independently over each clique and the learned weights are then combined into a single global model, as minimizing a new family of upper bounds on the log partition function.

Findings

01

Piecewise training outperforms pseudolikelihood in accuracy.

02

It often performs comparably to global training using belief propagation.

03

These results are reported on three natural-language data sets.

Abstract

For many large undirected models that arise in real-world applications, exact maximum-likelihood training is intractable, because it requires computing marginal distributions of the model. Conditional training is even more difficult, because the partition function depends not only on the parameters, but also on the observed input, requiring repeated inference over each training example. An appealing idea for such models is to independently train a local undirected classifier over each clique, afterwards combining the learned weights into a single global model. In this paper, we show that this piecewise method can be justified as minimizing a new family of upper bounds on the log partition function. On three natural-language data sets, piecewise training is more accurate than pseudolikelihood, and often performs comparably to global training using belief propagation.
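The bound mentioned in the abstract can be illustrated on a toy model. The sketch below is not the authors' code: it assumes a hypothetical three-variable binary chain with two pairwise factors whose log-potentials are drawn at random, and the function names are made up for illustration. It compares the exact log partition function, computed by brute-force enumeration, with the sum of the per-factor (local) log partition functions, which is the quantity that piecewise training normalizes against.

```python
import itertools
import numpy as np

def log_partition_exact(factors, n_vars, n_states):
    """Brute-force log Z: enumerate every joint assignment and
    log-sum-exp the total score (sum of factor log-potentials)."""
    scores = []
    for x in itertools.product(range(n_states), repeat=n_vars):
        total = sum(table[tuple(x[i] for i in scope)] for scope, table in factors)
        scores.append(total)
    return float(np.logaddexp.reduce(np.array(scores)))

def log_partition_piecewise(factors):
    """Sum of local log partition functions, one per factor.
    Each factor is normalized on its own, ignoring its neighbours."""
    return float(sum(np.logaddexp.reduce(table.ravel()) for _, table in factors))

# Hypothetical toy model: chain x0 - x1 - x2 with binary variables.
# Each factor is (scope, table of log-potentials); values are arbitrary.
rng = np.random.default_rng(0)
factors = [
    ((0, 1), rng.normal(size=(2, 2))),
    ((1, 2), rng.normal(size=(2, 2))),
]

exact = log_partition_exact(factors, n_vars=3, n_states=2)
bound = log_partition_piecewise(factors)
print(f"exact log Z = {exact:.4f}, piecewise sum = {bound:.4f}")
```

The piecewise sum is never smaller than the exact value here, because multiplying the local partition functions counts every consistent joint assignment plus extra positive terms in which the shared variable takes different values in the two factors. The local sums cost time proportional to the factor table sizes, whereas the enumeration grows exponentially in the number of variables.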

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Machine Learning and Algorithms