Learning as Search Optimization: Approximate Large Margin Methods for Structured Prediction

Hal Daumé III; Daniel Marcu

arXiv:0907.0809 · cs.LG · July 7, 2009 · 48 cites

TL;DR

This paper introduces a framework for structured prediction that uses approximate search for learning and decoding, enabling effective modeling of complex outputs where exact methods are infeasible.

Contribution

The paper proposes a learning-as-search-optimization framework, together with two parameter updates that come with convergence theorems and bounds, to address structured prediction tasks in which exact search is intractable.

Findings

01 Can outperform exact models in empirical tests

02 Does so at smaller computational cost than exact methods

03 Provides convergence theorems and bounds for the two proposed parameter updates

Abstract

Mappings to structured output spaces (strings, trees, partitions, etc.) are typically learned using extensions of classification algorithms to simple graphical structures (e.g., linear chains) in which search and parameter estimation can be performed exactly. Unfortunately, in many complex problems, it is rare that exact search or parameter estimation is tractable. Instead of learning exact models and searching via heuristic means, we embrace this difficulty and treat the structured output problem in terms of approximate search. We present a framework for learning as search optimization, and two parameter updates with convergence theorems and bounds. Empirical evidence shows that our integrated approach to learning and decoding can outperform exact models at smaller computational cost.

Taxonomy

Topics: Natural Language Processing Techniques · Algorithms and Data Compression · Machine Learning and Algorithms