Learning Approximate Inference Networks for Structured Prediction

Lifu Tu; Kevin Gimpel

arXiv:1803.03376·cs.CL·March 12, 2018·26 cites

TL;DR

This paper replaces gradient-descent inference in structured prediction energy networks (SPENs) with a neural inference network, reporting 10-60x speed-ups and improved accuracy on multi-label classification.

Contribution

It proposes an inference network, a neural network trained to approximate structured argmax inference in place of gradient descent, together with large-margin criteria for training it jointly with the structured energy function.
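
To make the contrast concrete, here is a minimal sketch, not the authors' code. It sets up a toy energy over relaxed labels in [0, 1], uses projected gradient descent as a stand-in for gradient-based inference, and shows a single-layer network that produces a relaxed output in one forward pass. The energy form, the names `energy`, `gd_inference`, and `inference_network`, and every weight are illustrative assumptions.

```python
import math

def energy(scores, W, y):
    """Toy energy (assumed form): local terms plus pairwise label interactions.
    Lower energy means a better output. W is assumed symmetric."""
    n = len(y)
    local = -sum(scores[i] * y[i] for i in range(n))
    pair = sum(W[i][j] * y[i] * y[j] for i in range(n) for j in range(i + 1, n))
    return local + pair

def energy_grad(scores, W, y):
    """Gradient of the toy energy with respect to the relaxed labels y."""
    n = len(y)
    return [-scores[i] + sum(W[i][j] * y[j] for j in range(n) if j != i)
            for i in range(n)]

def gd_inference(scores, W, steps=50, lr=0.1):
    """Iterative inference: many gradient steps, each clipped back into [0, 1]."""
    y = [0.5] * len(scores)
    for _ in range(steps):
        g = energy_grad(scores, W, y)
        y = [min(1.0, max(0.0, yi - lr * gi)) for yi, gi in zip(y, g)]
    return y

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def inference_network(x, V, b):
    """Amortized inference: one forward pass maps the input x to relaxed labels.
    V and b are hypothetical parameters; in practice they would be trained
    so that the output has low energy."""
    return [sigmoid(sum(v * xj for v, xj in zip(row, x)) + bi)
            for row, bi in zip(V, b)]
```

The cost difference is visible in the structure: `gd_inference` runs a loop of gradient evaluations for every input, while `inference_network` does a fixed amount of work per input once its parameters are trained.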

Findings

01. 10-60x speed-up in multi-label classification

02. Comparable speed and accuracy to exact inference in sequence labeling

03. Improved accuracy with label language model for long-distance dependencies

Abstract

Structured prediction energy networks (SPENs; Belanger & McCallum, 2016) use neural network architectures to define energy functions that can capture arbitrary dependencies among parts of structured outputs. Prior work used gradient descent for inference, relaxing the structured output to a set of continuous variables and then optimizing the energy with respect to them. We replace this use of gradient descent with a neural network trained to approximate structured argmax inference. This "inference network" outputs continuous values that we treat as the output structure. We develop large-margin training criteria for joint training of the structured energy function and inference network. On multi-label classification we report speed-ups of 10-60x compared to Belanger et al. (2017) while also improving accuracy. For sequence labeling with simple structured energies, our approach performs…
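
The abstract does not spell out the large-margin criteria, so the following is only a sketch of one common form, a margin-rescaled hinge, evaluated on an inference network's relaxed output. The L1 cost, the linear toy energy, and the names `l1_cost`, `linear_energy`, and `margin_rescaled_hinge` are assumptions made for illustration.

```python
def l1_cost(y_pred, y_gold):
    """Assumed structured cost: L1 distance between relaxed and gold labels."""
    return sum(abs(p - g) for p, g in zip(y_pred, y_gold))

def linear_energy(x, y):
    """Hypothetical toy energy: lower when y agrees with the scores in x."""
    return -sum(xi * yi for xi, yi in zip(x, y))

def margin_rescaled_hinge(energy_fn, cost_fn, x, y_gold, y_pred):
    """max(0, cost(y_pred, y_gold) - E(x, y_pred) + E(x, y_gold)).

    The loss is zero once the gold output has lower energy than the
    predicted output by at least the cost between them."""
    violation = cost_fn(y_pred, y_gold) - energy_fn(x, y_pred) + energy_fn(x, y_gold)
    return max(0.0, violation)

# Example: y_pred stands in for the output of an inference network.
x = [0.5, -0.25]
y_gold = [1.0, 0.0]
y_pred = [0.9, 0.8]
loss = margin_rescaled_hinge(linear_energy, l1_cost, x, y_gold, y_pred)
```

Under this assumed setup, joint training would update the energy parameters to reduce the loss while the inference network is updated to produce outputs that the energy function scores well; how the two updates are combined is a design choice not recoverable from the truncated abstract.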

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Machine Learning and Data Classification