Gradient-based Inference for Networks with Output Constraints

Jay Yoon Lee; Sanket Vaibhav Mehta; Michael Wick; Jean-Baptiste Tristan; Jaime Carbonell

arXiv:1707.08608 · cs.CL · April 23, 2019

TL;DR

This paper introduces a gradient-based inference method that enforces deterministic output constraints in neural networks, improving accuracy on structured prediction tasks without complex post-processing.

Contribution

The paper proposes gradient-based inference (GBI), which enforces output constraints at inference time by nudging the model's weights for each test input, rather than relying on rule-based post-processing or expensive discrete search.
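The idea of nudging weights at inference time can be illustrated with a toy sketch. This is not the authors' implementation: the per-position logistic tagger, the "exactly one position labeled 1" constraint, the function names, and the hyperparameters `lr` and `alpha` are all assumptions chosen for illustration. While the decoded output violates the constraint, the sketch takes gradient steps that lower the model's log-probability of that output, scaled by the size of the violation, with a small pull back toward the trained weights.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def logit(weights, row):
    return sum(w * x for w, x in zip(weights, row))


def decode(weights, features):
    """Unconstrained inference: label a position 1 when its logit is positive."""
    return [1 if logit(weights, row) > 0 else 0 for row in features]


def violation(labels):
    """Hypothetical constraint: exactly one position may carry label 1."""
    return abs(sum(labels) - 1)


def gbi(weights, features, lr=0.5, alpha=0.01, max_steps=50):
    """Nudge a copy of the weights until the decoded output satisfies the constraint.

    Simplified sketch: each step lowers the log-probability of the current
    (violating) output, scaled by the violation, and adds an L2 pull toward
    the original weights. The original weights are never modified.
    """
    w = list(weights)
    for _ in range(max_steps):
        labels = decode(w, features)
        g = violation(labels)
        if g == 0:
            break  # constraint satisfied, stop nudging
        # Gradient of log p(labels | features; w) for independent logistic outputs.
        grad = [0.0] * len(w)
        for row, y in zip(features, labels):
            p = sigmoid(logit(w, row))
            for j, x in enumerate(row):
                grad[j] += (y - p) * x
        w = [wj - lr * (g * gj + alpha * (wj - w0j))
             for wj, gj, w0j in zip(w, grad, weights)]
    return decode(w, features), w


# Toy input: three positions with one-hot features, so each weight is one logit.
features = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
original_weights = [2.0, 0.5, -1.0]

before = decode(original_weights, features)
after, nudged_weights = gbi(original_weights, features)
print("before:", before, "violation:", violation(before))
print("after: ", after, "violation:", violation(after))
```

In this toy run the least confident of the two positive labels is the one that flips, because its logit sits closest to the decision boundary and receives the largest gradient. The nudged weights are used for this input only; the next input would start again from the original weights.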

Findings

01. GBI enforces constraints effectively across multiple tasks.
02. GBI improves accuracy even for state-of-the-art models.
03. The method reduces reliance on discrete search or post-processing.

Abstract

Practitioners apply neural networks to increasingly complex problems in natural language processing, such as syntactic parsing and semantic role labeling that have rich output structures. Many such structured-prediction problems require deterministic constraints on the output values; for example, in sequence-to-sequence syntactic parsing, we require that the sequential outputs encode valid trees. While hidden units might capture such properties, the network is not always able to learn such constraints from the training data alone, and practitioners must then resort to post-processing. In this paper, we present an inference method for neural networks that enforces deterministic constraints on outputs without performing rule-based post-processing or expensive discrete search. Instead, in the spirit of gradient-based training, we enforce constraints with gradient-based inference (GBI): for…
