Constrained Decoding for Computationally Efficient Named Entity Recognition Taggers

Brian Lester; Daniel Pressel; Amy Hemmeter; Sagnik Ray Choudhury; Srinivas Bangalore

arXiv:2010.04362 · cs.CL · October 12, 2020

TL;DR

This paper introduces a constrained decoding approach for NER taggers that simplifies training and maintains performance, eliminating the need for a CRF layer by enforcing transition constraints during decoding.

Contribution

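The authors propose a constrained decoding method for NER that uses prior knowledge of the span encoding scheme to suppress illegal tag transitions, rather than relying on a CRF to learn them. The resulting tagger trains faster with a cross-entropy loss and matches CRF-based models in F1.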

Findings

01

Training a constrained tagger with a cross-entropy loss is twice as fast as training a CRF-based model.

02

Constrained decoding reaches F1 scores whose differences from CRF models are statistically insignificant.

03

Open source implementations are provided in PyTorch and TensorFlow.

Abstract

Current state-of-the-art models for named entity recognition (NER) are neural models with a conditional random field (CRF) as the final layer. Entities are represented as per-token labels with a special structure in order to decode them into spans. Current work eschews prior knowledge of how the span encoding scheme works and relies on the CRF learning which transitions are illegal and which are not to facilitate global coherence. We find that by constraining the output to suppress illegal transitions we can train a tagger with a cross-entropy loss twice as fast as a CRF with differences in F1 that are statistically insignificant, effectively eliminating the need for a CRF. We analyze the dynamics of tag co-occurrence to explain when these constraints are most effective and provide open source implementations of our tagger in both PyTorch and TensorFlow.
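The idea of suppressing illegal transitions can be illustrated with a minimal sketch. It assumes an IOB2 tag scheme, where `I-X` may only follow `B-X` or `I-X`, and hand-written per-token scores. The names `is_legal`, `transition_mask`, and `constrained_decode` are hypothetical and are not taken from the authors' repository. The sketch shows only the decoding step, with a fixed legality mask standing in for the learned transition matrix of a CRF.

```python
NEG_INF = -1e9  # large negative score used to suppress illegal transitions


def is_legal(prev_tag, next_tag):
    """Assumed IOB2 rule: I-X may only follow B-X or I-X."""
    if next_tag.startswith("I-"):
        entity = next_tag[2:]
        return prev_tag in ("B-" + entity, "I-" + entity)
    return True


def transition_mask(tags):
    """mask[i][j] is 0.0 if tags[i] -> tags[j] is legal, NEG_INF otherwise."""
    return [[0.0 if is_legal(p, n) else NEG_INF for n in tags] for p in tags]


def constrained_decode(emissions, tags):
    """Viterbi search over per-token scores using only the fixed mask."""
    mask = transition_mask(tags)
    n = len(tags)
    # Treat the sequence start like a preceding "O", so I-X cannot come first.
    scores = [
        emissions[0][j] + (0.0 if is_legal("O", tags[j]) else NEG_INF)
        for j in range(n)
    ]
    backpointers = []
    for step in emissions[1:]:
        new_scores, pointers = [], []
        for j in range(n):
            # Best previous tag for landing on tag j at this position.
            best_i = max(range(n), key=lambda i: scores[i] + mask[i][j])
            new_scores.append(scores[best_i] + mask[best_i][j] + step[j])
            pointers.append(best_i)
        scores = new_scores
        backpointers.append(pointers)
    # Follow the back pointers from the best final tag.
    path = [max(range(n), key=lambda j: scores[j])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return [tags[i] for i in path]


tags = ["O", "B-PER", "I-PER", "B-LOC", "I-LOC"]
emissions = [
    [0.1, 2.0, 0.0, 0.0, 0.0],  # token 1: B-PER scores highest
    [0.0, 0.0, 1.0, 0.0, 1.5],  # token 2: I-LOC scores highest, illegal after B-PER
    [2.0, 0.0, 0.0, 0.0, 0.0],  # token 3: O scores highest
]
greedy = [tags[max(range(len(tags)), key=lambda j: row[j])] for row in emissions]
decoded = constrained_decode(emissions, tags)
```

In this toy input, a per-token argmax picks `I-LOC` directly after `B-PER`, which does not decode into a valid span, while the masked search returns a sequence that respects the tag scheme. Because the mask is fixed by the encoding scheme, no transition parameters need to be learned during training.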

Code & Models

Repositories

blester125/constrained-decoding (tf, Official)

Taxonomy

Methods: Conditional Random Field