PICARD: Parsing Incrementally for Constrained Auto-Regressive Decoding from Language Models

Torsten Scholak; Nathan Schucher; Dzmitry Bahdanau

arXiv:2109.05093 · cs.CL · September 14, 2021

TL;DR

PICARD is a method that constrains auto-regressive language model decoding using incremental parsing, significantly improving the validity and performance of models generating structured outputs like SQL.

Contribution

It introduces an incremental parsing approach that constrains language model decoding, improving output validity and achieving state-of-the-art performance on text-to-SQL tasks.

Findings

01 PICARD improves validity of generated SQL code.

02 PICARD achieves state-of-the-art results on Spider and CoSQL.

03 Constrained decoding reduces invalid outputs significantly.

Abstract

Large pre-trained language models for textual data have an unconstrained output space; at each decoding step, they can produce any of 10,000s of sub-word tokens. When fine-tuned to target constrained formal languages like SQL, these models often generate invalid code, rendering it unusable. We propose PICARD (code and trained models available at https://github.com/ElementAI/picard), a method for constraining auto-regressive decoders of language models through incremental parsing. PICARD helps to find valid output sequences by rejecting inadmissible tokens at each decoding step. On the challenging Spider and CoSQL text-to-SQL translation tasks, we show that PICARD transforms fine-tuned T5 models with passable performance into state-of-the-art solutions.
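
The idea of rejecting inadmissible tokens at each decoding step can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the PICARD code base: the toy grammar (`SELECT <column> FROM <table>`), the `is_valid_prefix` checker standing in for an incremental parser, the hard-coded `toy_scores` standing in for a fine-tuned model, the `top_k` cutoff, and the use of greedy search.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical toy grammar: select <column> from <table> <eos>
COLUMNS = {"name", "age"}
TABLES = {"users"}
EOS = "<eos>"


def is_valid_prefix(tokens: Sequence[str]) -> bool:
    """Stand-in for an incremental parser: True if every token so far
    is allowed at its position in the toy grammar."""
    expected = [{"select"}, COLUMNS, {"from"}, TABLES, {EOS}]
    if len(tokens) > len(expected):
        return False
    return all(tok in allowed for tok, allowed in zip(tokens, expected))


def toy_scores(prefix: List[str]) -> Dict[str, float]:
    """Stand-in for a model's next-token scores (hard-coded per step).
    At steps 1 and 3 the top-ranked token is deliberately invalid."""
    table = [
        {"select": 0.9, "from": 0.05, "name": 0.05},
        {"from": 0.5, "name": 0.3, "age": 0.2},
        {"from": 0.6, "users": 0.3, EOS: 0.1},
        {"name": 0.5, "users": 0.4, EOS: 0.1},
        {EOS: 0.7, "users": 0.3},
    ]
    return table[min(len(prefix), len(table) - 1)]


def constrained_greedy_decode(
    score_fn: Callable[[List[str]], Dict[str, float]],
    prefix_ok: Callable[[Sequence[str]], bool],
    top_k: int = 3,
    max_steps: int = 10,
) -> List[str]:
    """Greedy decoding that skips tokens rejected by `prefix_ok`."""
    prefix: List[str] = []
    for _ in range(max_steps):
        scores = score_fn(prefix)
        # Consider only the top_k highest-scoring candidates.
        candidates = sorted(scores, key=scores.get, reverse=True)[:top_k]
        # Keep candidates whose extended prefix is still admissible.
        admissible = [t for t in candidates if prefix_ok(prefix + [t])]
        if not admissible:
            break  # nothing admissible among the top_k candidates
        prefix.append(admissible[0])
        if prefix[-1] == EOS:
            break
    return prefix


unconstrained = constrained_greedy_decode(toy_scores, lambda tokens: True)
constrained = constrained_greedy_decode(toy_scores, is_valid_prefix)
print(unconstrained)  # ['select', 'from', 'from', 'name', '<eos>']
print(constrained)    # ['select', 'name', 'from', 'users', '<eos>']
```

With the checker disabled, the toy model emits a sequence the grammar rejects; with the checker enabled, the same scores yield a valid one. A real system would replace the position table with a parser for the target language and would typically apply the same filter inside beam search rather than greedy search.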

Taxonomy

Methods: Gated Linear Unit · Attention Is All You Need · Linear Layer · Parsing Incrementally for Constrained Auto-Regressive Decoding · Inverse Square Root Schedule · Softmax · Byte Pair Encoding · Attention Dropout · SentencePiece · Dropout