Supertagging the Long Tail with Tree-Structured Decoding of Complex Categories

Jakob Prange; Nathan Schneider; Vivek Srikumar

arXiv:2012.01285·cs.CL·December 14, 2020

TL;DR

This paper introduces a tree-structured decoding approach for CCG supertagging that effectively captures complex categories' internal structure, improving recognition of rare long-tail tags and generalization to out-of-domain data.

Contribution

It presents novel tree-structured models for supertagging that handle complex and rare categories, outperforming traditional flat models in recognizing long-tail supertags.

Findings

01

The best tagger recovers a sizeable fraction of the long-tail supertags.

02

It generates CCG categories that were never seen in training.

03

It approximates the prior state of the art in overall tag accuracy while using fewer parameters.

Abstract

Although current CCG supertaggers achieve high accuracy on the standard WSJ test set, few systems make use of the categories' internal structure that will drive the syntactic derivation during parsing. The tagset is traditionally truncated, discarding the many rare and complex category types in the long tail. However, supertags are themselves trees. Rather than give up on rare tags, we investigate constructive models that account for their internal structure, including novel methods for tree-structured prediction. Our best tagger is capable of recovering a sizeable fraction of the long-tail supertags and even generates CCG categories that have never been seen in training, while approximating the prior state of the art in overall tag accuracy with fewer parameters. We further investigate how well different approaches generalize to out-of-domain evaluation sets.
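
To make the claim that "supertags are themselves trees" concrete, the following is a minimal sketch of how a CCG category string can be read as a binary tree: atomic categories are leaves, and each slash is an internal node with a result child and an argument child. It is an illustration only, not the authors' implementation. The function names (`parse_category`, `to_string`, `preorder`) and the tuple representation are assumptions made for this example.

```python
def _strip_outer_parens(cat):
    """Remove parentheses that enclose the whole category, e.g. '(S\\NP)' -> 'S\\NP'."""
    while cat.startswith("(") and cat.endswith(")"):
        depth = 0
        for i, ch in enumerate(cat):
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
                # The first '(' closes before the end, so it does not wrap everything.
                if depth == 0 and i < len(cat) - 1:
                    return cat
        cat = cat[1:-1]
    return cat


def _top_level_slash(cat):
    """Index of the rightmost slash outside parentheses (slashes associate to the left)."""
    depth = 0
    for i in range(len(cat) - 1, -1, -1):
        ch = cat[i]
        if ch == ")":
            depth += 1
        elif ch == "(":
            depth -= 1
        elif ch in "/\\" and depth == 0:
            return i
    return -1


def parse_category(cat):
    """Return a leaf string for atomic categories, else (slash, result, argument)."""
    cat = _strip_outer_parens(cat)
    i = _top_level_slash(cat)
    if i < 0:
        return cat  # atomic category such as 'NP' or 'S[dcl]'
    return (cat[i], parse_category(cat[:i]), parse_category(cat[i + 1:]))


def to_string(tree, top=True):
    """Rebuild the category string, parenthesizing every non-root complex subtree."""
    if isinstance(tree, str):
        return tree
    slash, result, arg = tree
    s = to_string(result, False) + slash + to_string(arg, False)
    return s if top else "(" + s + ")"


def preorder(tree):
    """Linearize the tree as a pre-order list of slashes and atomic labels."""
    if isinstance(tree, str):
        return [tree]
    slash, result, arg = tree
    return [slash] + preorder(result) + preorder(arg)


transitive_verb = parse_category("(S\\NP)/NP")
print(transitive_verb)             # ('/', ('\\', 'S', 'NP'), 'NP')
print(preorder(transitive_verb))   # ['/', '\\', 'S', 'NP', 'NP']
print(to_string(transitive_verb))  # (S\NP)/NP
```

Under this view, a model that predicts one small symbol (a slash or an atomic label) per tree node can in principle assemble a category it never saw as a whole during training, whereas a flat classifier over a truncated tagset cannot output anything outside its fixed label inventory.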

Code & Models

Repositories

jakpra/treeconstructive-supertagging
PyTorch · Official