End-to-End Learning for Structured Prediction Energy Networks

David Belanger; Bishan Yang; Andrew McCallum

arXiv:1703.05667 · stat.ML · July 18, 2017 · 31 cites

Open Access

TL;DR

This paper introduces end-to-end training for Structured Prediction Energy Networks (SPENs), enabling more accurate structured predictions by backpropagating through gradient-based optimization, and demonstrates improvements on image denoising and semantic role labeling tasks.

Contribution

The paper presents an end-to-end learning approach for SPENs that permits complex non-convex energies and improves prediction accuracy over the structured SVM training of Belanger and McCallum (2016).

Findings

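01. End-to-end trained SPENs are substantially more accurate than SPENs trained with the structured SVM method of Belanger and McCallum (2016).

02. On both tasks, inexact minimization of non-convex SPEN energies outperforms baselines that use simpler energies that can be minimized exactly.

03. The paper provides techniques for improving the speed, accuracy, and memory requirements of end-to-end SPENs.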

Abstract

Structured Prediction Energy Networks (SPENs) are a simple, yet expressive family of structured prediction models (Belanger and McCallum, 2016). An energy function over candidate structured outputs is given by a deep network, and predictions are formed by gradient-based optimization. This paper presents end-to-end learning for SPENs, where the energy function is discriminatively trained by back-propagating through gradient-based prediction. In our experience, the approach is substantially more accurate than the structured SVM method of Belanger and McCallum (2016), as it allows us to use more sophisticated non-convex energies. We provide a collection of techniques for improving the speed, accuracy, and memory requirements of end-to-end SPENs, and demonstrate the power of our method on 7-Scenes image denoising and CoNLL-2005 semantic role labeling tasks. In both, inexact minimization of…
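The core idea in the abstract, predicting by gradient descent on an energy and training by back-propagating through those descent steps, can be illustrated with a toy sketch. Everything here is an assumption made for illustration and not the authors' implementation: the energy is a scalar convex quadratic rather than a deep non-convex network, the names `predict`, `loss_and_grad`, and `train` are hypothetical, and the backward pass is written by hand where a real system would likely use automatic differentiation.

```python
# Toy sketch (not the paper's model): a scalar energy
#   E(y; x, w) = 0.5 * (y - w * x) ** 2 + 0.5 * lam * y ** 2
# is minimized over y by a fixed number of gradient steps, and the parameter w
# is trained by back-propagating a squared loss through those unrolled steps.


def predict(w, x, steps=20, eta=0.2, lam=0.1):
    """Form a prediction by running `steps` of gradient descent on the energy."""
    y = 0.0  # initial guess for the output
    for _ in range(steps):
        grad_y = (y - w * x) + lam * y  # dE/dy
        y = y - eta * grad_y
    return y


def loss_and_grad(w, x, y_true, steps=20, eta=0.2, lam=0.1):
    """Return the squared loss on the unrolled prediction and dLoss/dw."""
    y_pred = predict(w, x, steps, eta, lam)
    loss = 0.5 * (y_pred - y_true) ** 2

    # Backward pass through the unrolled updates, each of which is
    #   y_next = (1 - eta * (1 + lam)) * y + eta * w * x
    g_y = y_pred - y_true             # dLoss/dy at the final step
    g_w = 0.0
    decay = 1.0 - eta * (1.0 + lam)   # d y_next / d y
    for _ in range(steps):
        g_w += g_y * eta * x          # d y_next / d w = eta * x
        g_y *= decay                  # propagate to the previous step
    return loss, g_w


def train(data, w=0.0, lr=0.5, epochs=200):
    """Plain SGD on w using gradients obtained through the unrolled optimizer."""
    for _ in range(epochs):
        for x, y_true in data:
            _, g_w = loss_and_grad(w, x, y_true)
            w -= lr * g_w
    return w


if __name__ == "__main__":
    data = [(0.5, 1.0), (1.0, 2.0), (1.5, 3.0)]
    w = train(data)
    for x, y_true in data:
        print(x, y_true, predict(w, x))
```

Because the loss is computed on the output of the truncated optimizer rather than on the exact energy minimum, the learned `w` compensates for the fact that inference stops after a fixed number of steps. That is the sense in which training is "end-to-end" in this sketch.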

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Domain Adaptation and Few-Shot Learning · Topic Modeling